Given that Piraha (and Warlpiri) do not allow embedded clauses, and Piraha does not allow recursion at all, clause embedding is not universal to all languages. It is reasonable to suppose that it does not appear in full blown form in pre-literate societies. So unlimited embedding of clauses is something that post-dates writing and evolved along with it. This whole question presupposes this thesis.
To support this, I will give the evidence of the Hebrew Bible (since the only ancient language I can read is Hebrew). From reading a few books of the Old Testament, and translating them to English, I can attest that the grammatical complexity of the Old Testament does not give any clear evidence of a grammar which allows the recursive styles of Charles Dickens or Henry James. The oldest parts never go more than one embedded clause deep, and there are grammar errors which suggest that the authors had a problem with complex sentences.
(EDIT: I was told by a classicist that this is not so pronounced in ancient Greek and Latin. A classics student I asked told me that Homer's work is recursive, although not as complex as later writing. He claimed that the later Latin writers (he named Cicero and Thucydides) are about as recursive as Shakespeare and Dickens. This was interesting, because their recursive sentences occur in heavily cased languages, in ancient times.)
I am not asking about this, I just ask you to accept this hypothesis for the sake of making sense of this question. Assuming that grammatical recursion evolved in the historical period, one can ask: what effects did the introduction of recursive grammar have?
I believe that the major effect of the introduction of recursion is to disfavor the complex case systems typical of ancient and pre-literate languages, in favor of using separate words like "to" and "of". While it is easy enough to add a case marker to a stand-alone word, like house, so as to say "I walked house-ward" (in an approximation to a case in English), this construction does not work when you replace "house" by "place where my mother in law was born". You can say "I walked to the place where my mother in law was born", but you can't say "I walked (place where my mother in law was born)-ward".
Cases and clause recursion don't play well, for the same reason that free word order and recursion don't play well. If you can rearrange the words in a sentence freely, you can't make clauses nest, because moving words across clause boundaries wrecks the embedded clauses. Nesting needs clear start and end markers for clauses, and these are not available without the appropriate stand-alone function words: "to", "in", "of", "which", "that", and so on.
This hypothesis makes the following predictions, and these are the questions I have:
Cases should be most diverse and pronounced in non-embedding languages, like Piraha. This is true of Piraha. Is it true of other non-embedding languages?
Languages that embed early in history should shed their cases gradually, as embedding takes over. This means that the most cased languages today should have a recent writing system, and the least cased languages should have an ancient writing system. For Russian and English, this works. Are there counterexamples?
Cases should be gradually more disfavored with time, as speakers internalize embedding, replacing the case markers with the function words. Modern Hebrew is a stunning example: in the Bible, you would always say "Halachti le-beyto" to mean "I went to his house", using a possessive marker on the word "bayit" (house). But a modern Hebrew speaker will always say "Halachti la-bayit shelo", "I went to the house of him", precisely because the word "shel" can be used to embed, as in "Halachti la-bayit shel ha-ach hagadol sheli", "I went to the house of my big brother". The Bible doesn't embed very much. Are there opposite events where the case system was strengthened? Perhaps after a loss of literacy?
The mainstream view of linguists is that case-shedding is a kind of degeneration of language with time, and that it occurs as speakers lose the quality of excellence of the early language. You hear this piffle from classicists when they discuss the Latin case system. I am saying that the cases disappear because the language is developing recursion, not because the speakers are getting dumber. They are getting smarter.
Is there any recursion in ancient latin? How many levels deep? How does it work?
Is there any evidence for this degeneration idea?
(EDIT: It seems that the answer for ancient Latin is that the recursion is as developed as in modern languages.)
I am sorry for not separating the questions, but they are all about the same thing, and feel free to answer only one, any insight is welcome. I am not sure if it is in the linguistics literature, or if it is original to me. I think this idea is original, but please disabuse me of this belief if I am wrong.
EDIT: In response to comments
I got a downvote and weird comments, so I would like to make the hypothesis clearer. Consider the Hebrew sentence "and he walked to Shechem". In the Bible, it is "Ve-Halach Shchem-a" (and-(walked-he) Shechem-ward). In modern Hebrew, nobody would use this expression. They say "Ve-Halach le-Shchem" (and-(walked-he) to-Shechem).
What happened? The post-modifier "a" attaching to Shechem is replaced by a pre-modifier. Why does it happen? I know why--- because if you wanted to say "And he walked to the large mountain by the sea", you would say "Ve-halach la-har hagadol leyad ha-yam" (And-walked-he to-the-mountain the-big by the-sea). You could not say "Ve halach har-a hagadol leyad ha-yam", using the post modifier, without cluttering up the clause "har hagadol leyad ha-yam" with an extra syllable in the middle. By the way, if the Bible ever said such embedded things, it would say it this way:
"Vehalach har-a, ha-har hagadol leyad hayam". (And walked-he to-the-mountain, the-mountain the-big beside the sea.--- And he walked to the mountain, the big mountain beside the sea.)
It would use the post-modifier to make the mountain into the proper case, but it would then clarify what the mountain meant in a separate phrase that isn't really embedded. The case system is mucking up the embedding.
Nobody uses "mountain-ward" in modern Hebrew, except for some idiomatic phrases, like "halachti ha-baita" (I walked home-ward). The case modifier is replaced with what is essentially a stand-alone function word "le" or "la" (meaning "to" or "to the").
Why do you prefer the pre-modifier rather than the post-modifier? I strongly believe that the reason is that if you are doing Chomskian transformations which move phrases to different roles, you want to be able to move the whole phrase somewhere else as a unit, without mucking around inside to remove stupid dangling case-modifiers. So you prefer the function word "le" or "la", which functions exactly the same as the English "to". It comes at the beginning, and you just lop it off to make a stand-alone phrase that can be moved to another position in a different sentence.
To make the example clear, I will pretend that English can do the same thing. Suppose that English was able to say "I walked mountain-ward", using a modifier "ward" on words that tells you that you are walking to the mountain. Then you could say
"I walked mountain-ward with the statues of presidents carved into it"
or you could say
"I walked to a mountain with the statues of presidents carved into it"
The first form has a stupid case modifier in the middle of the clause that needs to be excised if you want to do a transformation. For example, if you want to say
"The mountain with the statues of presidents carved into it is pretty"
You would need to change mountain-ward into mountain, which requires mucking around inside the clause. This is much more intuitive when the function stuff occurs before the word, as a pre-modifier, rather than after the word, as a post-modifier.
So languages that deal with embedding on a regular basis like to have the syllable signifying "to" before the word, not after. In this position, it can drop off and become a separate word without any difficulty, and after this happens, you would say that the case has been lost, and a function word is gained.
In Hebrew "le" is not a separate word in writing, but it might as well be, considering how it is used. It has exactly the same function as the "to" in English, and it has displaced the "a" (-ward, as in homeward) post-modifier so thoroughly that it is jarring to read the Bible and find "Lech Shchema ve-tagur sham." (Go Shchem-ward and live there). It seems obvious to me, considering how fast this happened (in the last 100 years), that the reason is that recursive clause embedding is common in modern Hebrew, and essentially nonexistent in Biblical Hebrew.
The possessive 's of English is perhaps another reasonable example, but it is strange, in that you can apply it to clauses. I don't care about this weird exception. It seems that the rule is that when you have embedding, you prefer modifiers to happen at the beginning, and then they might as well drop off and become stand-alone function words.
This phenomenon of replacing post-modifiers or word-modifiers for case with a universal function word should happen to languages as they acquire recursive clause embedding. I want to know if this is true elsewhere than Hebrew.
EDIT: Explaining this in English
I will make up a case system for English, by replacing function words, so as to not have to write Latin examples.
verb modifiers will be as follows:
I walked to the store : I walked the-store-ward
I walked from the store : I walked the-store-from-ly
I walked on the beach : I walked the-beach-on-ly
I walked with my friend: I walked friend-of-mine-with-ly
etc., using the function word plus "ly" as the case word for a verb-attaching adjective-like (or argument like) phrase, except for "to-ly" which is "ward".
My bag with a zipper: Bag-of-mine zipper-with-ish
My coat on the desk : Coat-of-mine the-desk-on-ish
etc., using "ish" and the function word as the case marking for an adjective phrase
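The made-up case system above is mechanical enough to write down as a small function. The following is only a sketch of the toy rules as stated (function word plus "ly" for verb-attaching phrases, with "to" becoming "ward", and function word plus "ish" for noun-attaching phrases). The function name `case_form` and the `attaches_to` parameter are hypothetical names introduced for illustration, and the sketch makes no claim about any real language.

```python
def case_form(noun: str, prep: str, attaches_to: str) -> str:
    """Build the invented English 'case' form for a noun.

    noun        -- the noun, already hyphenated with its article (e.g. "the-store")
    prep        -- the function word being replaced (e.g. "to", "from", "on")
    attaches_to -- "verb" for verb-attaching phrases, "noun" for adjective phrases
    """
    if attaches_to == "verb":
        # Verb modifiers take prep + "ly", except "to", which becomes "ward".
        suffix = "ward" if prep == "to" else f"{prep}-ly"
    elif attaches_to == "noun":
        # Adjective phrases take prep + "ish".
        suffix = f"{prep}-ish"
    else:
        raise ValueError("attaches_to must be 'verb' or 'noun'")
    return f"{noun}-{suffix}"


print("I walked", case_form("the-store", "to", "verb"))
print("I walked", case_form("the-store", "from", "verb"))
print("Coat-of-mine", case_form("the-desk", "on", "noun"))
```

The three printed lines correspond to the listed examples "I walked the-store-ward", "I walked the-store-from-ly" and "Coat-of-mine the-desk-on-ish".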
Then taking a sentence with embedding:
I walked to the store by the sea
I walked the-store-to-ly the-sea-by-ish
The second expression is easy to permute:
I walked the-sea-by-ish the-store-to-ly
But when doing a transformation, like asking
Is the-store by the sea green?
Is the-store the-sea-by-ish green?
I have to take the phrase "the store-to-ly the-sea-by-ish" and transform it by scanning to remove the "to-ly" from the store. In the phrase "the store-to-ly the-sea-by-ish", "the store" is the leading noun, and it gets the case marker.
This doesn't look too hard in this example, but if the leading noun is buried deep in the noun phrase, it becomes more difficult:
I walked (sea-by-ish, which is next Greece-to-ish, store-to-ly which the Italy-of-ish health-department closed in Lord-of-ours-of-ish the-year 1999).
I walked (to the store by the sea which is next to Greece, which the Italian health department closed in the year of our Lord 1999).
I enclosed the relevant phrase in parentheses in both examples (the intended parsing is that the sea is next to Greece, not the store), and italicized the cased leading noun, which in the example is "store-to-ly".
Now if I want to transform the phrase "the store by the sea which is next to Greece, which the Italian Health department closed in the year of our Lord 1999." into a question about whether this store is still closed, I have to take the phrase in parentheses, scan inside this phrase past "sea-by-ish, which is next Greece-to-ish," to find "store-to-ly" and then transform that token into "store". The transformed question is
Is the (sea-by-ish, which is next Greece-to-ish, store which the Italy-of-ish health department closed in Lord-of-ours-of-ish the-year 1999) still closed?
You need to scan the phrase to remove the case markers from the leading noun. In standard English, no scan is ever required for a transformation--- you just move the entire phrase:
Is (the store by the sea which is next to Greece, which the Italian health department closed in the year of our Lord 1999) still closed?
Notice that the parenthesized NP is exactly the same as before, just dropping the initial "to the". This makes transformations easy, even with deep embedding, and it is also the definition of what makes a language uncased. So uncased is recursion friendly, and cased is not.
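The difference between the two transformations can be sketched in code. This is a toy illustration under strong assumptions: a phrase is just a list of whitespace-separated tokens, the cased head noun is recognized by the invented "-to-ly" suffix, and the names `strip_head_case` and `drop_preposition` are hypothetical. It only counts how many tokens have to be inspected before the phrase can be reused in another role; it is not a model of how any parser or speaker actually works.

```python
def strip_head_case(tokens, suffix="-to-ly"):
    """Cased phrase: scan for the case-marked head noun and strip its marker.

    Returns the bare phrase and the number of tokens inspected.
    """
    for i, tok in enumerate(tokens):
        if tok.endswith(suffix):
            bare = tokens[:i] + [tok[: -len(suffix)]] + tokens[i + 1:]
            return bare, i + 1
    raise ValueError("no case-marked head noun found")


def drop_preposition(tokens, prepositions=("to", "from", "by", "with")):
    """Uncased phrase: the role marker is the first token, so just drop it."""
    if not tokens:
        return tokens, 0
    if tokens[0] in prepositions:
        return tokens[1:], 1
    return tokens, 1


# Shortened versions of the two example phrases (punctuation removed).
cased = "sea-by-ish which is next Greece-to-ish store-to-ly which the health-department closed".split()
uncased = "to the store by the sea which is next to Greece".split()

bare_cased, cost_cased = strip_head_case(cased)
bare_uncased, cost_uncased = drop_preposition(uncased)

print(" ".join(bare_cased), "| tokens inspected:", cost_cased)
print(" ".join(bare_uncased), "| tokens inspected:", cost_uncased)
```

In the cased phrase the number of tokens inspected grows with how deep the head noun is buried, while in the uncased phrase only the first token is ever looked at.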
I believe that the desire to make phrases movable without modification during transformations is the driving force for the phenomenon of case-shedding in languages.
The examples above, which apply the case marking to the main noun in the noun phrase, show how it works in Latin.
Why do you think it "... is reasonable to suppose that [recursion] does not appear in full blown form in pre-literate societies..." ?
Because, as a Hebrew speaker, the language was dormant for many thousands of years, then revived. The revived version is fully recursive, and shed its case system within one generation, in favor of more recursive-friendly structures. One of the most annoying prescriptivist Hebrew things is "shmi Ron" (name-mine (is) Ron) taught to all beginner Hebrew learners. Nobody says this anymore. They say "Hashem sheli Ron" (name of mine is Ron), and I know this is because "shel" allows recursion and the ending does not. Ancient Hebrew doesn't embed very well, and Piraha not at all.
Basque can do things like "I walked (place where my mother in law was born)-ward" just fine.
I don't believe you. Please give an example.
No, no and no. It's not true that Warlpiri has no embedded clauses. Finnish has recursion and embedding and a complex case system. There are very many non-literate societies that have languages without case systems. Finally, it is not 'the mainstream view of linguists that case-shedding is a kind of degeneration of language with time [etc]'.
I don't agree, and if you would actually support your wrong statements, I would accept it as an answer.
My linguistics teacher pointed out that English has a possessive "case" marker (not sure if I'm using the word "case" right) that can apply to phrases: in "the king of France's crown", the crown belongs to the king, not to France. It is confusing, however, to use the possessive marker on longer phrases like "the best darned analyst in the whole wide world of personal finance's ugly yellow house". I can't answer your question but I guess I see how cases and recursion don't play well IF the cases "need" to apply to a phrase rather than just a word.
Yes, this is exactly what I was getting at. In English you would normally say "The ugly yellow house of the best darned analyst in the whole world of personal finances", replacing the case marker by "of" (and rearranging the order to the recursion-friendly order). This de-casing is obviously required for clear embedding, but most linguistic examples use absurdly short sentences with at most one level of embedding, so the effect is invisible. Only very complex sentences reveal interesting things about case structure and recursion.
I don't think that's what Qwertie (cool name btw!) meant. It is precisely the fact that the possessive 's in English is not the most typical example of a case (if any at all) that makes it hard to embed things with 's. It is dead simple with regular cases: poetarum Graecorum ex saeculo Periclis magni opera non placent mihi. Or das Haus des guten Mannes meiner Schwester.
No, that's not what Qwertie meant--- he meant what I said. If you provide a Latin example, please provide a word-for-word gloss so that I can understand the Latin--- I am curious about this. I will give the Hebrew examples in the question, with a gloss for non-Hebrew speakers.
Sorry about not glossing the Latin: Johannes Rulinc occid-it [[vir-um].N [qui nomin-at-us fu-it Bokelere].RelCl].NP: J. R. kill-3sg.sbj man-acc.sg who.masc.sg.acc name-pst.ptcpl-masc.sg.acc was.pst-3sg.sbj B. There is no problem with case-marking a heavy NP in Latin because the case marker just has to be on the head noun, not attached to either edge of the full NP.
It may be useful to look at WALS (the World Atlas of Language Structures). Here's WALS feature 49A, 'Number of cases'. If you read the chapter text and look at the map, you'll see where case-marking languages are found and how many cases they have. There are many non-literate (or only recently literate) societies with no case marking. You'll also see languages which are Western, with long histories of literacy, and much case-marking. You could also look at WALS 23A to see how grammatical relations may be marked other than by case.
Thanks for the gloss. Your example is too trivial. There are cases where the "head noun" in a noun phrase is far away (perhaps not in Latin)--- "I walked from purple pointy other-worldly mountains, the kind you see once in a lifetime, to wherever it is that condemned people go." The noun phrases are "purple pointy other-worldly mountains, the kind you see once in a lifetime" and "wherever it is that condemned people go". What is the Latin equivalent? How would it transform case?
I think I am starting to see the issue here. @RonMaimon is assuming that case markers are clitics attaching to full noun phrases, so it would be difficult to parse a very heavily embedded NP that is case-marked. But I think the more usual situation is that a case marker affixes to a head noun. In this random Latin sentence "Johannes Rulinc occidit virum qui nominatus fuit Bokelere" (J.R. killed a man who was named Bokelere) the NP "virum qui nominatus fuit Bokelere" bears accusative case, but the case marking just attaches to the head noun.
You're probably right: he is looking at the possessive 's, which is a clitic that comes after the entire noun phrase, which is one reason not to consider it a case. (Note: normally, cases are not limited to the head noun: any adjective or apposition to the head noun generally agrees with the case of the head noun.)
I was not looking only at possessive 's, I was also looking at Hebrew post-modifiers, which attach at the end. Even when you attach it to the leading word, if you do a post-attachment, it is not convenient for transformations to do the attachment. If you do a pre-attachment then you can easily break off the modifier to be a stand-alone word.
Yes, I think you are broadly right. He thinks that the fact that the case marker is bound to one word in the NP makes it difficult for it to apply to the whole NP. There are several strategies: case markers can be affixed to every word in the NP (many Australian languages do this); case marker affixes to head of NP and so the whole NP is treated as being in that case.
The OP also seems unaware that case markers can be prefixes, so they may not be that different from a preposition (it can be hard to distinguish them, especially if the preposition is a proclitic). Also, he maybe doesn't know that some languages have postpositions, which would come after the NP, and so are perhaps not much different from the problem he sees for case markers.
Post-positions do not work with recursion, how would you embed multiple levels? As in: "I went to the room where she went to the corner where the fly crawled to the wall". It is absurd to think this becomes "I went the room where she went the corner where fly crawled wall totototo", so the to must occur next to the head noun of the NP, in which case I would not distinguish such a thing from a case marker in postfix position. Your comments are off base. I should add that casing every word in an NP proves a language is not recursive.
BTW, Biblical Hebrew did not have a complex case system. In fact, it did not have case marking, apart from the allative (or 'to/towards') that occurs in your example, and fossilised case marking on pronouns (much like what we still have in English). Case marking occurred in Proto-Semitic but was lost by Biblical times.
The solution of casing every word of a clause is manifestly incompatible with recursion. The solution of casing the head noun requires cases for things that are not nouns, like "whatever", and makes transformations difficult. I never said Hebrew had a complex case system; it has a simple case system, which is essentially just a post-modifier for "to" and a few others. But even this simple case system was shed when full recursion was adopted. I am aware that case markers can be prefixes, I mentioned it in my question, and in that case I do not distinguish them from prepositions.
I have written a completely new answer below, based on the latest version of your question, which I think better addresses your perspective.
This seems like a great question, why does it not have more upvotes?
It has few upvotes because it started out with about 4 downvotes! It is an original idea, and people downvote original ideas as a rule, because of their stupidity and hostility to ideas coming from outsiders. It's a form of jealousy, from those who can't think any longer, having traded in their ability to do original thinking in exchange for a limited amount of authority.
I edited this question in response to Karlsson's paper, "Constraints on Multiple Center-Embedding of Clauses" (Journal of Linguistics 43 (2), 2007, 365-392), linked here: http://www.ling.helsinki.fi/~fkarlsso/ceb5.pdf .
Given that Piraha (and Warlpiri) do not allow embedded clauses, and Piraha does not allow recursion at all, clause embedding is not universal to all languages. As Karlsson persuasively argues, full blown recursion developed in the historical period. I will assume this here, and I will not repeat Karlsson's arguments.
In the Hebrew Bible, there is no more than one embedded clause in any given sentence, consistent with Karlsson's bound for pre-literate societies, and there are grammar errors which suggest that the authors had a problem with complex sentences. As Karlsson explains, Greek and Latin writers developed full blown center-recursion explicitly somewhere around the 1st and 2nd centuries BC, and there is no evidence that recursion existed before then in any form.
As recursive embedding becomes common, it is possible that speakers prefer constructions which allow embedded clauses to take on different roles with no internal modifications. If you say
I walked home-ward quickly
and you try to modify home to the embedded noun-phrase "the wide-open field where John slaughtered the goat", you say
I walked wide-open field-ward where John slaughtered the goat
this requires inserting a syllable in the middle of the noun-phrase "wide-open field where John slaughtered the goat". If you want to move the noun-phrase to the subject position of a different sentence, you need to scan the interior of the noun-phrase to remove the case-modifier:
The wide-open field where John slaughtered the goat is pretty.
If you have a stand-alone preposition, no scan is required
I walked to the wide open field where John slaughtered the goat
The words past "to" are exactly the same as when the noun-phrase is in the subject role. If the preposition can appear at the beginning of the phrase, it gives a clear "push" indicator for a mechanical parser, and it allows for trivial transformation of phrases to different roles.
Compare the cased:
I walked store-from-ly the wide open sheep grazing field-ward where John slaughtered the goat.
Store from-ly means "from the store" in a case approximation in English.
I walked the wide open sheep grazing field-from-ly where John slaughtered the goat store-ward.
to the uncased:
I walked from the store to the wide open sheep grazing field where John slaughtered the goat.
I walked to the store from the wide open sheep grazing field where John slaughtered the goat.
There is no internal scan of the noun-phrases required in order to change their function. The words stay the same as the noun-phrase takes on different roles; only the preposition changes.
The case-introducing/case-removing noun-phrase scan is really annoying. It is computationally taxing, and I believe that it creates a linguistic pressure to remove cases from a language, and replace these with prepositions. When there are no cases, you have effortless noun-phrase embedding--- you just change the preposition, which always appears at the beginning.
I personally witnessed case-shedding events, in modern Hebrew. In the Bible, you would always say "Halachti le-beyto", "I went to-his-house" to mean "I went to his house", using a possessive marker on the word "bayit" (house). But a modern Hebrew speaker will always say "Halachti la-bayit shelo", "I went to the house of him", precisely because the word "shel" can be used to embed, as in "Halachti la-bayit shel ha-ach hagadol sheli", "I went to the house of my big brother".
In addition to a possessive marker, which was shed, Biblical Hebrew also has a "to" case, so that it says "I walked Schem-ward" ("Halachti Schem-a"). Modern Hebrew, despite prescriptivist admonitions, also dropped this case entirely, so that all modern speakers use the recursion-friendly "I walked to Schem" ("Halachti le-Schem").
From my own native speaker intuition, I know that this is a consequence of the ubiquitous embedding in modern Hebrew. You don't say "I walked city-ward by the sea" in modern Hebrew; it is ungrammatical. You say "I walked to the city by the sea", with the exact same form as in English.
The Bible doesn't embed very much, and if it wanted to do this, it would say it in a pre-literate way that suggests recursion is completely alien to the author: "Halachti Schema, zu ha-'ir le-yad ha-yam"/"I walked city-ward, this is the city by the sea."
So the hypothesis is that recursive languages shed their cases as soon as most speakers begin to produce and transform multiply embedded sentences on a regular basis. This happens at different times for different languages.
This hypothesis makes the following predictions, and these are the questions I have:
Cases should be most diverse and pronounced in non-embedding languages, like Piraha. This is true of Piraha. Is it true of other non-embedding languages?
The most cased languages today should have a recent writing system, and the least cased languages should have an ancient writing system. For Russian and English, this works. Are there counterexamples?
Are there opposite events where the case system was strengthened? Do these correspond to a loss of literacy?
I have read the latest edited version of your question, and I think I understand what you mean now.
It is important to make a distinction between an ordinary suffix and a case ending (which could be considered a special kind of suffix). Consider the following two examples:
Magni viri liber est in mensa.
"The large man's book is on the table."
The Latin sentence means exactly the same as the English one. The word liber ("book") is the subject of the sentence, and so it is in the nominative; viri ("of the man") is in the genitive, as indicated by the case ending -i (the nominative would be vir). The adjective magni ("large") is also in the genitive (nominative magnus). The head word (viri) of the noun phrase (magni viri) determines the case, number, and sex of the whole noun phrase; the adjective must always be in the exact same case, number, and sex.
This is a central principle in most of the case-heavy languages, including Latin, Greek, and German. Perhaps it would be better to characterise them as heavy in noun inflection, as this includes number and sex too. This principle is called concord or agreement, as you may know.
Consider the contrast with the English phrase: only the head of the noun phrase, the noun itself, is marked (with 's). Because the English suffix does not involve concord, I would not call it a case ending. The principal remaining concord in English is between subject and verb (I am v. he is). How about Biblical Hebrew? Is only the head marked, or adjectives as well?
Can you see now how easy it is to add a relative clause to an inflected noun phrase?
Pater magnum virum in urbe captum, qui pius erat, servavit.
[Father large man in city having-been-captured, who pious was, saved.]
"Father saved a large man captured in the city, who was pious."
This time virum ("man") is in the accusative, because it is object. I have marked all words agreeing in case/number/sex with virum in bold. The word pater ("father") is marked as subject by the nominative. The finite verb of the main clause is servavit ("saved").
The parsing difficulties in your the wide open sheep grazing field-from-ly are not present in the Latin, simply because the first word of the noun phrase (magnum) is already marked as object. Imagine if the word the in your field phrase were already marked as "from-ly". It would look like this:
I walked the-from-ly wide-open-from-ly sheep-grazing-from-ly field-from-ly, where John slaughtered the goat, store-ward.
See how the parsing problem disappears? You don't need to hold your breath for the function of the noun phrase to appear.
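The point about concord can be made concrete with the same toy English case system. This is a minimal sketch, assuming a noun phrase is a list of pre-head words plus a head noun, and that concord simply copies the invented case suffix onto every one of them; `mark_with_concord` and `case_of_first_token` are hypothetical names, and real Latin or German agreement involves much more than suffix copying.

```python
def mark_with_concord(pre_head_words, head, case_suffix):
    """Copy the case suffix onto every word up to and including the head noun."""
    return [f"{word}-{case_suffix}" for word in pre_head_words + [head]]


def case_of_first_token(tokens, known_cases):
    """Read the role of the whole phrase off its very first token."""
    first = tokens[0]
    for case in known_cases:
        if first.endswith("-" + case):
            return case
    return None


phrase = mark_with_concord(["the", "wide-open", "sheep-grazing"], "field", "from-ly")
print(" ".join(phrase))
print(case_of_first_token(phrase, ["from-ly", "ward"]))
```

With concord, the first token already carries the marker, so a listener or parser does not have to wait for the head noun to learn the role of the phrase.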
German works the same way, and Ancient Greek too. The latter does have a few suffixes that serve as case endings but are only attached to the head word; and you are right, those can normally only be used when the noun phrase consists of a single word, no dependencies or embedding. But I can only think of two such suffixes, and they are rare and/or semi-archaic.
οἶκόνδε νέεσθαι
[house-ward to return]
The word οἶκός (/oi.kos/) means "house"; the suffix δε (/de/) means "towards"; νέεσθαι (/ne.e.stai/) means "to return". (The accusative ending -όν (/on/) is used because the suffix -δε requires the accusative.) So the whole means "to return home". This suffix cannot be used with complex noun phrases, as you suggested.
As to your suggestion that embedding a clause in the middle of another clause did not happen until the Hellenistic period, that is just not correct. It took me a whole two minutes to find a counter example, in lines 11 and 12 of the Odyssey. Homer's work comprises the oldest European literature in existence, probably composed in the 8th century BC.
ἔνθ᾽ ἄλλοι μὲν πάντες, ὅσοι φύγον αἰπὺν ὄλεθρον,
οἴκοι ἔσαν, [πόλεμόν τε πεφευγότες ἠδὲ θάλασσαν]
"then all others, whoever escaped sheer destruction,
were home, [having escaped the war and the sea]"
The word ὅσοι is a relative pronoun referring back to ἄλλοι μὲν πάντες, "all the others"; it introduces a subordinate clause with the finite verb φύγον, "escaped". The main clause has ἄλλοι μὲν πάντες as its subject, and ἔσαν ("were") as its finite verb.
The Homeric poems were meant to be recited out loud in the correct rhythm (metre). In fact, nearly all literature of Antiquity was meant to be read out loud up until the emergence of novels, around the first century AD. So people were able to use and understand (centrally) embedded clauses in speech throughout European history. I agree that they are generally less easy to parse than embedding at the end of a sentence, so there were obviously some restrictions, just as now; but central embedding was used. It is probably very, very old.
One thing to note is that relative pronouns were relatively new: it can be observed in Homer that relative pronouns were still in the process of developing out of demonstrative/personal pronouns of the third person (there never was a strong distinction between demonstrative and personal pronouns of the third person in Latin and Greek throughout Antiquity). However, participles are much older, as are conjunctions; and both can be used to embed verbs with complements just as well, and were in fact so used (e.g. in Homer).
I am not offended. I don't speak Latin, so please provide a gloss, like you did in the comments. I am interested in multiple levels of recursion only, so I prefer highly embedded artificial examples, like "I know of Mary that she knows of John that he knows of Martha that she knows that I am not dead." I disagree with you from personal experience with languages that I happen to personally know, and Hebrew has some cases, so I know how they work. The issue is that word markers that muck up clauses don't play well with transformations.
That kind of side-branch embedding can repeat many times and still be pretty comprehensible. I think centre-embedding is much more interesting, as well as harder to process--it's usually said that humans can't cope with more than three levels. Eg: 'This is the bus that the car that the professor that the girl kissed drove hit'.
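To make the contrast between the two embedding types concrete, here is a minimal Python sketch that builds the bus sentence above in both its centre-embedded form and a right-branching paraphrase from the same word lists. The function names and the list layout are assumptions made for illustration only; the sketch generates strings and says nothing about how people actually process them.

```python
# Assumed layout: verbs[i] has subject nouns[i + 1] and object nouns[i].
NOUNS = ["the bus", "the car", "the professor", "the girl"]
VERBS = ["hit", "drove", "kissed"]


def center_embedded(nouns, verbs):
    """Stack up all the subjects first, then unwind the verbs in reverse order."""
    subjects = " that ".join(nouns)
    return "This is " + subjects + " " + " ".join(reversed(verbs))


def right_branching(nouns, verbs):
    """Attach each relative clause at the right edge, so nothing is left pending."""
    sentence = "This is " + nouns[-1]
    for i in range(len(verbs) - 1, -1, -1):
        sentence += " that " + verbs[i] + " " + nouns[i]
    return sentence


print(center_embedded(NOUNS, VERBS))
print(right_branching(NOUNS, VERBS))
```

In the centre-embedded version every subject has to be held until its verb finally arrives, whereas in the right-branching version each clause is complete before the next one starts. That is the usual informal account of why the first is harder to follow.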
Yes, center embedding is more interesting, but for the purpose of casing annoyance, it doesn't make much difference, so I didn't bother with it.
But surely centre embedding is more relevant to your mention of limits to embedding depth as it's much harder to comprehend deep centre embedding than deep side-branching embedding.
I didn't mention limits to embedding depth--- only in Piraha is there a clear limit--- at most one embedding (and that's because there is no actual embedding). In real embedding languages there is no upper limit to center embedding, other than good writing style, which is identified by native speakers. With effort, and time, a native speaker of a true embedded language will eventually identify arbitrarily nested center embeddings as grammatical.
No, that's incorrect. Humans cannot cope with arbitrarily deep centre embedding. This is a much-studied area, see (e.g.) "A usage-based approach to recursion in sentence processing" by Christiansen and MacDonald, in "Language as a complex adaptive system". Also you'll find there mention of centre-embedding in Finnish, a strongly case-marking language.
All common languages have center embedding! Cased or uncased. That's universal grammar. I am not saying you can't embed with cases, only that embedding drives cases to disappear, because cases are inconvenient for embedding. It is correct that humans have a hard time coping with deep center embedding, but so what? By the way, the third sentence below is a 5-level center embedding that's easy to parse: "I wanted you to read me the book. You took it upstairs. If you tell me what you took the book I wanted to be read to out loud from upstairs for, in your own words, I'll forgive you."
This answer is not correct--- according to Karlsson's analysis, which seems to be correct, Greek and Latin writers seem to have invented deep center recursion, and the reason it works the same way in all European languages is that the Latin speakers transferred it through their style manuals and translations of these to all other local languages. Center recursion was not restricted to upper-class folks: (John 2:9) When the ruler of the feast had tasted the water that was made wine, and knew not whence it was: (but the servants which drew the water knew;) the governor of the feast called...
Reading that passage of John aloud, I cannot help using different intonations, in particular reading the bracketed part about the servants in a different voice. Maybe some sort of concurrency rather than recursion serves to memorize such hypotactical constructions.
It's not just the bracketed part--- an ancient Hebrew style for this sentiment would read like this: "When the ruler of the feast had tasted the water, that water which was made wine, and the ruler knew not whence it was. But the servants, those servants who drew the water knew. Then the governor of the feast called....". It would not have the long flowing complex sentence structure that it has above, with modern implicit attachment rules. The bracketed part is nothing special, the whole of John is flowing and recursive, and later Hebrew texts like Ecclesiastes are recursive in similar ways.
Hi Ron! When you look at the examples I gave, notably Homer, do you still feel later writers invented hypotaxis / subordinate clauses in the middle of a sentence? Homer reflects oral culture.
Greek culture invented this early; Homer is different from other pre-literate cultures. I don't read Greek, but others have pointed this out. It is possible that the historical authors identified as Homer were the first to invent this style, leading to the golden age of Greek literature; I don't know. But it's definitely unique, and it biases people who study ancient stuff, because they study Homer and not other things. Look at the Epic of Gilgamesh or the Hebrew Bible for non-recursive texts (I don't know Gilgamesh firsthand, only from hearsay).
And how do you feel about my suggestion that using cases for nouns and adjectives and gerunds and participles makes it easier to embed such phrases? I'm sure it is very different from (and probably much easier to do than in) Hebrew.
Cases make it a little harder, and anyway, Ecclesiastes is in perfect Hebrew and sometimes is as recursive as Greek, so it's not the language, it's the author. The stuff I did is what they do in Genesis and Exodus, they don't do that in Ecc. Even Genesis has number recursion and tail recursion, it's just not very center embedded.
The question deals with two ideas: the possibility of recursive embedding of clauses, and the use of case systems. We are asked to entertain a hypothesis that has the following basic parts:
There was some kind of early state in prehistory at which no language had embedding, and where all languages had elaborate case systems.
The advent of writing encouraged the development of embedding.
Embedding developed at the expense of case-marking.
All languages in the world are at some point in the continuum between having an elaborate case system and having recursive embedding.
To be able to begin to entertain this idea, we need to show that case marking and embedding are strategies which can serve the same purpose. I think that this is false.
Case-marking is a strategy for marking dependency relations between heads and dependents, where the marking falls on the dependent. Alternate ways of indicating these relations are by marking the head, or by marking neither, but using certain word order patterns to indicate different types of grammatical relations. So if a language lacks case-marking, then it would be reasonable to suppose that it is either a head-marking language, or it has relatively rigid constituent order. We could also suppose that case-marking languages will have more flexible constituent order, and will be less likely to mark grammatical relations. The choice between case-marking and some parallel strategy affects the organization of the language at the clause level.
Clause embedding, however, is a strategy that is reflected primarily at the level above the basic clause. When two clauses are linked, the linking strategy could involve coordination (no embedding), subordination (one clause is embedded in the other), or cosubordination (both clauses are embedded in some higher structure).
Basically, I can't see how this theory has any a priori plausibility right now. I don't think it should be explored further until we can get an explanation for why, on linguistic grounds, case-marking as an organizational strategy in grammar should compete with, rather than work in conjunction with, clause embedding.
This is reasonable, +1, but I am not saying that cases compete with embeddings, or that they serve the same purpose. Rather, I am saying that embedding noun-clauses in a language that has noun cases is difficult, because you need to know how to case clauses. The different interpretations of nesting are not particularly relevant--- just use a simple noun-phrase and try to case it, and you'll see that it is not so simple, or if it is, that it requires some scanning to do a transformation.
So I think the point you are interested in is much narrower than the question suggests. You are interested in how relative clauses are formed in languages with elaborate case-marking systems. Btw, what do you mean by "requires some scanning to do a transformation"?
"requires scanning" means that if you have a cased head noun, and you want to do a transformation, you have to decase the head noun, which means scanning the interior of a lexical unit. For example: "I walked the-mountain-ward with the presidents' heads carved in". The phrase is "the-mountain with the presidents' heads carved in", and the "ward" is sitting right in the middle of it. This means I need to look inside the phrase to get rid of the "ward" marker, and this is inconvenient for embedding.
I disagree that "embedding noun clauses in a language that has noun cases is difficult." In Modern Russian there are six (structural) cases (semantically much more) and they embed noun clauses just fine.
If you need examples with glosses, give me an English example and I'll post it in Russian.
I am not doubting that it can be done, I am just wondering whether it is convenient. Translate into Russian: "I walked to the ugly white room where I walked to the corner where I walked to the wall." and "The ugly white room where I walked to the corner where I walked to the wall was hot." The noun phrase in both sentences is "the ugly white room where I walked to the corner where I walked to the wall", and the question is whether the case-markings on "room" have to annoyingly change between the first and second version of the same embedded phrase, in a way that requires looking inside the phrase.
Your examples don't have embedded "noun clauses"; all I can see is relative clauses.
See Huddleston and Pullum 2005: 176 for examples of what is sometimes known as a noun clause in non-generative theories of grammar.
I explained what I meant by "noun clauses", I meant NPs, and there was no confusion, since I explained the term precisely. I will use the correct term from now on, although I hate linguistics jargon, since it seems to only be used as a barrier to entry for linguists to protect their field from outsiders. It mostly is describing trivial stuff.
If you make the sentences above Russian, it would really clarify how the case system works with NPs in Russian. In my experience, people tend to introduce separate function words that mean "to" in addition to and separate from the case markings, to deal with embedded NPs. Then the cases disappear as the embedding words take over from the case markings, even for those single-word situations where the cases would work with no problem.
[I walked to the ugly white room] [where I walked to the corner]. These bracketed constituents are the only two clauses in this sentence. [where I walked to the corner] is not an NP.
Not at all: "I walked to the ugly white room where Sue had an epileptic seizure, quickly". The NP is [the ugly white room where Sue had an epileptic seizure], the "where Sue had an epileptic seizure" attaches at a lower level to "ugly white room". This is the right parenthesization, and the bracketed constituents you identify are not useful because they don't respect the parse-tree. In the example I gave [the ugly white room where I walked to the corner where I walked to the wall] is the embedded NP.
[I walked to the ugly white room] [where Sue had an epileptic seizure]. I'd argue that the relative clause [where Sue had an epileptic seizure] attaches to [room] etc. Now, that's not an issue here. When you say there is an embedded NP [the white ugly room [where Sue had an epileptic seizure]], what is it embedded into?
It is embedded into "I walked to NP quickly". This is the standard parsing of "I walked to the ugly white room where Sue had an epileptic seizure quickly". The words "ugly" "white" and the "where Sue had an epileptic seizure" all attach to "room" to make the NP "the ugly white room where Sue had an epileptic seizure" and this is embedded into "I walked to NP quickly", taking the place of NP. This is not "my theory" of sentence processing (if only!), it is the standard theory of generative grammar.
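The bracketing described in this comment can be written out as a small tree. The Python sketch below is only an illustration of that one parse: the node labels (S, VP, PP, NP, RelClause) and the helper names are assumptions chosen for the example, not a claim about any particular grammar formalism.

```python
# The parse described above, as nested tuples: (label, child, child, ...).
# Labels are illustrative assumptions; leaves are plain strings.
TREE = (
    "S", "I",
    ("VP", "walked",
     ("PP", "to",
      ("NP", "the", "ugly", "white", "room",
       ("RelClause", "where", "Sue", "had", "an", "epileptic", "seizure"))),
     "quickly"),
)


def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1:] for word in leaves(child)]


def find(node, label):
    """Return the first subtree carrying the given label, or None."""
    if isinstance(node, str):
        return None
    if node[0] == label:
        return node
    for child in node[1:]:
        hit = find(child, label)
        if hit is not None:
            return hit
    return None


def substitute(node, label, placeholder):
    """Replace every subtree carrying the label with a placeholder word."""
    if isinstance(node, str):
        return node
    if node[0] == label:
        return placeholder
    return (node[0],) + tuple(substitute(c, label, placeholder) for c in node[1:])


print(" ".join(leaves(find(TREE, "NP"))))
print(" ".join(leaves(substitute(TREE, "NP", "NP"))))
```

The first line printed is the whole NP, relative clause included; the second is the frame "I walked to NP quickly" that the NP slots into.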
Have you not seen my comment? I explicitly asked everyone something. If you need to extend your discussion, please move to the Linguistics Chat. By the way, your question has a close-vote already. If your question gets closed, do not edit your question randomly, please post a question on the Linguistics Meta so we can see if we can fix this.
If my question is closed, I will take it as an attempt by the linguists on this site to censor it, which is certainly what it will be, and I will not cooperate with this site any further. I will certainly not modify the question, because I believe it is a correct and important original insight. As for your "extended discussion nonsense", I was ignoring that, as I always do, and as everyone should. You can always delete the comments later.
Closing is not censoring. It's a pause state that you can use to your advantage to improve your own question. I can't force you to edit it, as it is in your best interest. If you decide not to do it, I don't have much choice.
Closing is a preliminary step to deleting, and I will not edit my question, as it is fine as it stands. It has as many upvotes as downvotes right now, but even if it were at -70, it is a legitimate question that is motivated by genuine curiosity, and the thesis it is promoting explains several otherwise mysterious language phenomena. You have a choice--- you can choose not to close it. There is nothing in it that merits closure. If you open a discussion on meta, I'll be happy to comment there.
Why should I open the meta discussion? You should open it, as you did on EL&U. You're right about one thing, "Closing is a preliminary step to deleting", but it's incomplete, because it actually is: "Closing is a preliminary step to deleting, unless you fix it according to the suggestions". You said you're self-taught, but there are real linguists here who do this as their profession, so maybe you can at least suspect that you made some mistakes. Let's stop commenting here, I'm currently in the Linguistics Chat, you can answer there.
I might have made some small mistakes, like misusing the phrase "noun clause" but the mistakes the professionals make are big and of principle. I only opened EL&U after closing BTW.
I don't understand what a 'noun-clause' is, nor what it means to 'embed a noun clause'?? Nor am I sure what it means to 'case clauses'
a "noun-clause" is a collection of words that make up a unit which acts the same as a noun in a sentence. For example "I dropped the egg on the floor", vs. "I dropped the round semispherical oblong object with whites and yolks, which can sometimes turn into a chicken, on the floor." "the round semispherical oblong object with whites and yolks, which can sometimes turn into a chicken," is a noun-clause. Casing it means making the whole phrase accept a case marker, the same way "egg" would. This is all basic computational linguistics, but I might be using nonstandard terminology.
Yes, that's non-standard terminology. What you're calling a 'noun-clause' is normally called a noun phrase (or 'NP' for short). This is the term standardly used in linguistics (and in computational linguistics). It's weird to talk about making an NP accept a case marker. Rather, an NP is used in a particular grammatical role in a clause, and therefore requires the appropriate marking. In some languages this marking of the grammatical role is done by case inflections, in others by indexing on the verb, in still others by word order.
I think it was clear enough. I know how languages work, you don't need to explain. Case markings need to modify the NP in a nontrivial way, to attach the function to the NP, and I guess they modify the "leading noun" (although the "leading noun" might be "whatever", as in "I walked to whatever that thing is"). So you need to learn to case leading nouns and "whatever"s in NPs, and to search for leading nouns in transformations, so as to de-case them and re-case them when you change the function of the phrase. This is annoying, and easier if you drop the case for function words.
Your knowledge of how languages work seems to be at variance with that of most other participants. You are going to have to interpret your insights in terms of standard vocabulary or point people to suitable references when you employ non-standard terminology, lest your insights be lost on people who cannot understand them correctly.
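A toy Python sketch of the difference being claimed here, using the "mountain" phrase from earlier in the thread. Everything in it is hypothetical: the `NounPhrase` class, the `head` index and both function names are invented for illustration, and the sketch ignores agreement on adjectives and determiners, which real case languages often have. It only shows where the marker ends up in the word string, not how speakers process anything.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NounPhrase:
    words: List[str]  # the phrase, one token per entry
    head: int         # index of the head ("leading") noun, a hypothetical annotation


def with_case_suffix(np: NounPhrase, suffix: str) -> str:
    """Case-style marking: the marker lands on the head noun, inside the phrase."""
    words = list(np.words)
    words[np.head] = f"{words[np.head]}-{suffix}"
    return " ".join(words)


def with_function_word(np: NounPhrase, word: str) -> str:
    """Function-word marking: the marker sits at the phrase edge; the phrase is untouched."""
    return " ".join([word] + np.words)


mountain = NounPhrase(
    words=["the", "mountain", "with", "the", "presidents'", "heads", "carved", "in"],
    head=1,
)

print("I walked " + with_case_suffix(mountain, "ward"))
print("I walked " + with_function_word(mountain, "to"))
```

The suffix version needs to know which token is the head before it can mark or unmark the phrase, while the function-word version never looks inside it. Whether that bookkeeping is a real burden for speakers is exactly what is disputed in this thread.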
There was no confusion to anyone reading this, everyone knows what I was talking about. You should not exclude people because you don't like the words that they use. I will try to use standard terms in the future. Although, I notice that I used the phrase "noun-phrase" right in the first comment correctly, I just conflated it with "noun-clause". Sorry. It won't happen again.
Sorry, I don't want to discourage you from participating (nor do I want to inflate this comment thread). "Noun-clause" is not a big deal, but what is eluding me is your theory of sentence processing, which I cannot recognize, and on which most of your claims hinge.
As I was commenting above, English has a possessive "case" marker that can apply to phrases and it exhibits an "allergy" to recursion.
On the other hand, if "cases" just refer to context-sensitive form variations like "be is am are was were", I see no clash with recursion there (likewise in Spanish, which has plenty of recursion but vastly more verb forms than English.) Esperanto has a simple two-noun-case system and a very productive system of affixes, but when you're used to it, it feels very natural even though the affixes can't mark phrases.
If we're simply talking about morphology that changes word meaning, I don't see a real issue there either. For example, in Esperanto I can say manĝaĵujo, a food container, with the "container" suffix -ujo, but ruĝa fruktujo "red fruit container" should mean "a fruit container that is red", not "a container of red fruit"... basically what I'm getting at is that affixes ordinarily apply only to words, not phrases, and this is no different from English affixes. Yet this doesn't feel like an incompatibility between recursion and morphology, merely a limitation on the complexity of morphological changes. If I want to say "container of red fruit" then I simply give up on affixes and say "ujo de ruĝa frukto", "container of red fruit".
Now I can only talk about the languages I know about (English, Spanish and Esperanto, all in the same family I'm afraid) but it seems to me that word-level variations such as case markers and affixes can be thought of as providing shortcuts for standalone words like "of", "that", "female" (stewardess) etc. Yes, word-level changes don't play well with recursion but that doesn't necessarily encourage them to disappear. As long as they provide some utility that the stand-alone alternatives don't, they will tend to stay. For example, English will keep "'s" for the foreseeable future because it is more concise than "of" (and because "of" is more ambiguous, having more meanings than just the possessive); and this would probably be true even if "'s" were strictly applicable only to words and not phrases. If Esperanto ever became really popular, the accusative case "-n" might disappear for most sentences, yet would probably remain wherever it provides information concisely, e.g. "hejmen" = homeward, to home.
Anyway, I'm a linguistic 'newb' myself. I suspect there are lots of forces acting on languages that neither of us appreciate very well, forces that probably outweigh the hypothesis of this question.
The reliance on Esperanto (an artificial language) makes me nervous, as it did not develop organically. It doesn't seem relevant.
+1: Your point about the ambiguity of "of" is important. If you use enough cases, I walked store window-with-ish hat-with-ly (I walked to the store with a hat with a window) loses the ambiguity of whether the "with a hat" applies to me or to the store. This is one of the central annoyances in natural language formal descriptions--- the verb argument/adverb-like-phrases and the subject modifiers/adjective-like-phrases commute and are usually all ambiguous.