Beech Reading

NIL 2-4 treats the PIE etymon *bhah2g-ó- "beech". They mention that some scholars reconstruct long /a:/ and that some (not always the same) scholars link it to the previously discussed *bhag-. In general I don't see any reasonable link between a tree name and a root meaning "share" etc. But there is a possible connection for the Germanic cognates meaning "book, letter".
NIL also mentions (FN 2 on p. 3) that there are doubts about the relationship between the words meaning "book, letter" and the continuants of this root meaning "beech". The main formal problem is that the words meaning "letter / rune" seem to go back to a root noun that is actually attested in Old English, while the beech words are eh2-stems (or n-stems derived from them). So we have derived morphology for what is supposed to be the original meaning ("beech") and a root noun for what is supposed to be the derived meaning ("book, letter").
Elmar Seebold, addressing this issue in his "Etymologie: eine Einführung am Beispiel der deutschen Sprache" (München: Beck 1981, quoted hereafter as "Et."), also mentions that the proposed writing on beech tablets that is supposed to be behind the change of meaning from "beech" to "letter" is actually not attested, neither archaeologically nor in written sources; Germanic runes are attested only on bark, stone, and various household objects (Et., pp. 290-291). It is also clear that the original meaning was "letter", not "book" - the oldest attestations mean "letter" in the singular and "document, book" in the plural, an obvious calque from Latin (littera - litterae) and Greek (gramma - grammata) (Et. p. 290). Seebold argues that the meaning "letter" is derived from the compound Norse bókstafr, Old Saxon bōkstaf, OHG. buohstap, whose second member means "staff". The writing of runes on staffs is widely attested (here Seebold is undermining his previous argument somewhat, as these staffs could of course have been made of beech wood, but the written source he quotes actually mentions ash wood).
He then adduces a parallel from Welsh, coelbren "sign-wood, lot-wood", composed of coel "sign, omen" and bren "wood", designating a piece of wood covered with signs and used to cast lots, a custom also attested for the Germanic peoples. He takes this parallel as an indication that the first element of bókstafr etc. originally meant "sign, omen, lot", and links it with our old acquaintance, the root *bhag-, reconstructing a root noun Proto-Germanic *bōk-s "lot, portion" (Et. pp. 291-291). That would mean that the word family of "book" is not related to "beech", and that the purported writing on beech tablets is only a folk etymology. (A shorter version of these arguments can also be found in Kluge(-Seebold), "Etymologisches Wörterbuch der deutschen Sprache", s.v. Buch).
In general, I find this argumentation attractive. A formal problem is that the proposed root noun is attested only in Vedic, as a second part of compounds, with an active meaning "enjoying" (NIL p. 1), but not as an independent word with a resultative meaning "allotment, lot", and that it would be the only continuation of *bhag- in Germanic. The same problems would also arise if, as I proposed, we eliminate the root *bhag- and take its purported continuations as derived from the root *bheg- "break"; that root also is not continued in Germanic (at least according to LIV p. 66/67 and NIL, p. 6; the forms with nasal infix mentioned in IEW p. 115 look onomatopoetic, for which reason Pokorny himself states that they don't belong to *bheg-). There is also no root noun formed from *bheg- attested in NIL (p. 6); if we eliminate *bhag-, we would have at least the Vedic root noun mentioned above, but still with the same problems. On the other hand, an isolated root noun from a root that otherwise doesn't have any cognates in Germanic is a perfect candidate for the kind of folk etymology discussed.

PIE *bhag- and Armenian bak

This is a follow-up on my thoughts on PIE *bhag-. I’ve come across an article by Hrach Martirosyan (“The place of Armenian in the Indo-European Language family: the relationship with Greek and Indo-Iranian”, Journal of Language Relationship, No. 10 / August 2013, p. 85 - 138), where he adduces Armenian bak “courtyard; sheep pen; sun or moon halo” (missing in NIL) as a cognate of Indo-Iranian *bha:gá-: Sanskrit bha:gá- m. “prosperity, good fortune, property, personified distribution”, Old Avestan ba:ga- “part”, the descendants of which took on the meaning “landed property, fief, garden” (p. 99, §5.1.3). Martirosyan admits the possibility that this is not a cognate, but an old loan from Iranian. He names one argument for it being a loan, namely the fact that the Armenian word is an a-stem, while the Indo-Iranian correspondences are o-stems; incorporation as an a-stem seems to be the expected outcome for an Iranian *ba:ga-. As another argument for a loan I would also see the fact that there seem to be no other formations from a root *bhag- in Armenian. On the other hand, it would have to be an old loan from before the Armenian consonant shift, but Martirosyan admits that there are other such loans.
If this is not a loan, but a cognate, it would require a proto-form *ba:g-a:-, which could be explained as a Vrddhi-formation from *bhag- or point to a PIE *bheH2g-eH2- (Martirosyan’s reconstruction). Therefore, accepting bak as a cognate would in any case require us to posit a root PIE *bhag- or *bheH2g- separate from *bheg- “break” (continuants of the latter root are well-attested in Armenian).

Thoughts on PIE *bhag-

In my haul of presents this year there was a copy of NIL, so I embarked on reading it root-by-root. The first one is *bhag (NIL 1-2), and looking at the evidence for nominal derivations listed, I got a few ideas, which I’ll share below.

1)    The root has abundant nominal derivation only in two families, Indo-Iranian and Greek. These are the same families where, according to LIV 65, verbs formed from said root are attested. Interestingly, there are no matching derivations shared by both Indo-Iranian and Greek, except the o-stem *bhago- (m.): Sanskrit bhaga- “wealth”, Iranian baga- “god, allotment”, Greek phagos “eater” (originally only found as last element of compounds).

2)    Outside these families, the only attested formation is the above-mentioned o-stem *bhago- (m.), found in Slavic bogъ “god” and the adjective compounds nebog- (and ubog-, not mentioned in NIL) meaning “poor”, and in Tocharian B pa:ke A pa:k “share”. Slavic also has a secondary derivation bogat- from *bhago-, formed with the productive suffix *-eH2to-. On the surface, therefore, we have three branches (Indo-Iranian, Slavic, Tocharian) showing a meaning “share, allotment, wealth”, and one branch (Greek) showing a meaning “eat”. Both NIL and LIV, following IEW and the communis opinio, take the meaning “share” to be the basic one and the Greek meaning to be a later development.

3)    According to footnote 1 in LIV, the Tocharian cognates are the main reason for positing *bhag, not **bheg with a schwa secundum, as the source for Greek ephagon ("ate" - suppletive aorist to esthio: "eat"). But as per footnote 8 in NIL, at least Adams in his Dictionary of Tocharian B classifies pa:ke as an Iranian loanword due to it having a plural in -nt-. Now, as NIL states in footnote 6, it is widely assumed that Slavic bogъ borrowed the meaning “god” from Iranian. But it is also possible that the word itself with all its meanings is a loan from Iranian; after all, both meanings “god” and “wealth, allotment” are present in Iranian as well. The sound laws of Slavic don’t allow us to decide between loaned or inherited. But the fact that there are no old verbal formations based on bog- in Slavic and the absence of any cognates in Baltic, together with the identical dual semantics as in Indo-Iranian, speak, in my opinion, for bog- being a loan, not a cognate, in Slavic.

4)    In that case, the Tocharian forms could not be used as evidence for the existence of /a/ as the root vowel. And instead of a three-to-one preponderance for the meaning “share”, we would have two different meanings in two different branches, as the Tocharian and Slavic correspondences to the Indo-Iranian formation, being due to borrowing, not inheritance, should not be taken into account for reconstructing the original meaning.

5)    If, accordingly, there is no need to reconstruct a root containing /a/, it is possible to trace both the Greek and the Indo-Iranian words back to the root *bheg “break” (LIV 66 / IEW 114-115 / NIL 6). The development “break” to “share out” in Indo-Iranian is straightforward; in the verbal system of Indo-Iranian, we would have a neat case where the meaning “break” became associated with the nasal present which, as in Baltic, was spread also to the non-present stems (at least in Vedic), while the non-nasal forms took on the meaning “share”. In Greek, the meaning changed from “break” to “eat”, either via the idea of sharing food or via the idea of cutting / chewing it; in any case, the assumed development in this case is not more tortuous than the assumed development “share” > “eat”. In Greek, the family of phag- would seem to be the sole continuant of *bheg.

6)    In summary, it is possible to eliminate the root *bhag “share” from the reconstruction of Indo-European, if one assumes that the Slavic and Tocharian cognates are actually loans from Iranian and that the Indo-Iranian and Greek cognates actually continue *bheg “break”.

I can’t say whether anything of the above is truly original, as I don’t have the means and the time to chase up even the references mentioned in NIL and LIV in order to see whether these thoughts have been discussed before. But as neither NIL nor LIV even mentions such a possibility, I’d appreciate it if my readers could tell me whether this has been addressed before and point out any flaws in my reasoning.

Pullum on the world roles of English and Chinese

Over at Language Log, Geoffrey Pullum observes that English is currently the world's lingua franca (obviously correct) and argues that Chinese will not become the world's lingua franca "Not in fifty years, and perhaps not ever." His reasons?
First, there is no such thing as the Chinese language: Chinese is a language family, and there are far fewer people who are fluent in the politically dominant member, Mandarin, than the Chinese authorities would like you to think. Second, the Chinese languages share a writing system that is simply not fit for purpose: taking years to learn, and incredibly hard to adapt to many purposes, it is holding China's progress back by many decades. And third, nowhere in the world is there a country outside China where Chinese is used by non-Chinese to communicate with other non-Chinese.

Yeah, right. The existence of language varieties that are not mutually intelligible (Scots, anyone?) and a crazy orthography sure have prevented the rise of English. Yes, it won't happen in the next fifty years, but Pullum's stance (although, of course, not his reasoning) looks a bit like that of an 18th century Frenchman regarding the possibility that the language of that rising merchant power from the neighbouring island would ever be able to challenge the dominance of French as the lingua franca of the civilised world. I doubt that orthography, writing systems, or the existence of non-standard varieties play any role in determining whether a language attains the status of lingua franca - it's all about the political, commercial, and cultural influence of its speakers. It's quite possible that China will never reach the degree of political, commercial, and cultural influence that today's English-speaking nations (first and foremost the U.S.) have, but that (and not the writing system or whether Putonghua can crowd out the other Sinitic languages) will determine the status of Chinese in the future.

Longest Word in German Abolished

Maybe I ought to have added a few exclamation marks and used a bigger font. Maybe I ought to have added a few titillating pics. After all, I just used a misleading headline in order to draw attention to this post. The rest of the post will try to be accurate, I promise.
Der Spiegel reports that the longest word in German that is actually in use has been "retired". What actually happened (and as with this post, you learn that when you read the article) is that lawmakers in the state of Mecklenburg-Vorpommern abrogated the Rinderkennzeichnungs- und Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz ("Law on the transfer of responsibilities for supervision of the marking of cows and the labelling of beef"). The second part of this (Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz) was on record as the longest German word actually in current use. German has the ability to form theoretically endless compounds, which are also pronounced and written as one word. So it's easy to make up crazily long compounds to illustrate this, but the ones in actual use are normally quite short - normally the longer ones contain three to four words. Longer compounds are mostly found in legal terms and scientific and technical terms. The longest one that's sufficiently frequent to be recorded in the Duden (the standard and standard-setting dictionary for German) is Kraftfahrzeug-Haftpflichtversicherung ("third-party motor insurance"), which is a combination of a three-element compound Kraftfahrzeug ("motor vehicle") that nobody uses outside of technicalese and legalese - in everyday language, you'd say Auto or Wagen "car" or Fahrzeug "vehicle", or shorten it to Kfz -, and a four-element compound Haftpflichtversicherung "third-party insurance" that, in everyday language, is often shortened to Haftpflicht (which, by itself, normally means "third-party liability").
Now, the article states that the name of this law had been the longest compound in actual use since the Grundstücksverkehrsgenehmigungszuständigkeitsübertragungsverordnung ("Ordinance on the transfer of responsibilities for the approval of real-estate transactions") was abrogated in 2007 (in the article, length is measured by the number of letters). That reasoning is a bit curious - after all, a law can be referred to even after it is abrogated. The word can still be found in various collections of legal acts. And there's a meta-discussion out there about long German compounds where it's bandied about (going by the first four pages of a Google search I did, most attestations of the word are in discussions about long words and compounds, not in legal contexts). So the G-word is still being used, and the same reasoning is valid for Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz. To argue that a word is not used anymore because the thing it describes is not used anymore is a curious mix-up between signifier and signified. That notwithstanding, the article is worth a read.
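For anyone who wants to check the "number of letters" criterion themselves, here is a minimal Python sketch. The function name letter_count and the choice to ignore hyphens are my own assumptions, not anything taken from the article; umlauts are counted as single letters.

```python
import unicodedata

# The three compounds discussed in this post.
RINDFLEISCH = "Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz"
GRUNDSTUECK = "Grundstücksverkehrsgenehmigungszuständigkeitsübertragungsverordnung"
KFZ_HAFTPFLICHT = "Kraftfahrzeug-Haftpflichtversicherung"


def letter_count(word: str) -> int:
    """Count letters only, so hyphens and spaces do not add to the length.

    NFC normalisation makes sure an umlaut such as "ü" is one code point
    and therefore counted once (an assumption about how letters are counted).
    """
    normalized = unicodedata.normalize("NFC", word)
    return sum(1 for ch in normalized if ch.isalpha())


if __name__ == "__main__":
    # Print the compounds from longest to shortest.
    for w in sorted([RINDFLEISCH, GRUNDSTUECK, KFZ_HAFTPFLICHT],
                    key=letter_count, reverse=True):
        print(letter_count(w), w)
```

Counting this way, the ordinance retired in 2007 comes out longer than the beef-labelling law, which fits the article's claim that the latter only took over the record afterwards.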

Earl Grey

As someone interested both in languages and in tea, I liked the feature about the origin of the designation "Earl Grey Tea" in this week's World Wide Words Newsletter. Short version - the name seems to have been "Grey Tea" originally, and the association with the 2nd Earl Grey a late 19th century marketing ploy. It's not clear where the "Grey" in the name originates, but it's possible that it refers to a tea trader of that name.

Burushaski and Indo-European?

In the past few months, I've seen several mentions of Ilija Časule’s theory that Burushaski is related to Indo-European. See Memiwayanzi for some discussions, comments and links. I repost here a short summary, which I originally posted on the ZBB, of an article by John D. Bengtson and Václav Blažek in the “Journal of Language Relationship (JLR)” No. 6 (August 2011) in which they look at the evidence presented in Časule’s 2003 JIES article. Bengtson and Blažek dismiss the evidence as not satisfactory and argue for including Burushaski in the putative Dene-Caucasian group. The bigger part of the article is taken up by demonstrating the Burushaski-DC relationship and arguing that there is much more evidence for the latter than for a Burushaski – IE relationship. As I know almost nothing about either Burushaski or the languages of the DC macrofamily, I cannot really judge the quality of that evidence. I'll give a short summary of the points Bengtson and Blažek make concerning Časule’s evidence for a Burushaski – IE relationship; judging by what they adduce, the evidence is pretty feeble and Časule’s argumentation is not up to basic comparatistic standards. (All following quotes are from Bengtson & Blažek's article):
Phonology At first glance Časule’s comparison of IE and Burushaski phonology seems impressive. An ample number of examples is cited, and superficially it seems that Časule (henceforth “Č”) has made a good case for a correspondence between IE and Burushaski phonology. However, on closer examination a number of problems appear. (a) Some “Bur” words cited for comparison are actually loanwords from Indo-Aryan or Iranian languages. Thus, dumáṣ “cloud of dust, smoke, water” (p. 31) is clearly borrowed from Old Indic dhūmáḥ “smoke, vapor, mist” (even the accent is the same); púrme “beforehand, before the time” (p. 34) is isolated in the Bur lexicon and looks like a derivative of OI *purima- > Pali purima- "earlier" (CDIAL 8286, cf. Eng. former); badá “sole, step, pace” (p. 40) appears to be from OI padám “step, pace, stride” (CDIAL 7747), and perhaps others. (b) Some comparisons adduced in support of the correspondences are semantically tortuous if not utterly dubious. For example, IE *dheu- “to die, to lose conscience (sic)” ~Bur diú “lynx” (p. 36); IE *h2erĝ-ṇt-om “white (metal), silver” ~Bur hargín “dragon, ogre”, etc. (c) The proposed correspondences are not consistent and do not form a coherent system. For example, IE ĝ, ĝh are said to correspond to Bur g (voiced velar stop) or ġ (voiced uvular fricative) (p. 39), apparently in free variation, but in Bur bérkat “summit, peak, crest; height” (pp. 30, 35) IE ĝ is matched with Bur k (voiceless velar stop), in Bur buqhéni “a type of goat” (p. 31) IE ĝ is matched with Bur qh (aspirated uvular stop or affricate), and in Bur je, já “I” (p. 72) IE ĝh is matched with Bur j [ʒ´ = ]. IE *kw is said to correspond to Bur k (voiceless velar stop) (p. 38), but in Bur –śóġut “the side of the body under the arm, bosom” (p. 30) it is matched with Bur ġ (voiced uvular fricative), while in Bur waq “open the mouth, talk” (p. 38) it is matched with Bur q (voiceless uvular stop). PIE *w becomes Bur w in waq “open the mouth, talk” (p. 38), but b in budóo “rinsing water, water that becomes warm in the sun” (p. 31). For Č the Bur uvulars (q, qh, ġ) are merely variants of the velars and do not form an historical class of their own. (…) (d) Č totally overlooks (or minimizes) many distinctive features of the Burushaski phonological system. These features include (1) the retroflex stops, (2) the phoneme /y./, the uvular consonants, (3) the tripartite sibilant contrast /ṣ ~ ś ~ s/, and the cluster -lt-, and the t- ~ -lt- alternation (corresponding, we think to Dene-Caucasian lateral affricates).
B&B adduce a table of the Bur. consonant system (p. 27), which is indeed more complex than the PIE system, so any hypothesis would need to explain how we get from the simpler PIE system to the more complicated Bur system. This is especially so as Časule seems to derive Bur from Phrygian, a daughter language of PIE – if Časule assumed a Proto-IE-Burushaski (PIB), he could of course postulate the more complicated system for PIB and derive PIE from there. He doesn’t seem to do either. Over the next 16 pages, B&B go into the distinctions shown in (d), remark on how Časule ignores them, and adduce material showing how well the Bur material fits into a DC reconstruction.
Some extracts from the Morphology part of the article:
Nouns In the Burushaski nominal system the case endings, as admitted by Č himself, are the same for both singular and plural. Bur therefore has an agglutinative morphology, not the inflected morphology typical of IE. We find the Bur case endings far more compatible with those of Basque and Caucasian, including the compound case endings found in all three families.
Now, in principle it’s possible that an IE language would develop towards an agglutinative morphology – IIRC, Modern Armenian has tendencies in that direction, but there it can be due to the influence of neighboring Caucasian languages. AFAIK, the Indo-Iranian languages in the Burushaski area don’t show such morphology, so it’s more likely that this morphology is inherited than an areal feature, which would argue against an IE origin. B&B then point out that many Bur nouns are bound forms that can only occur with a possessive prefix – again not an IE feature, but they argue that Yeniseian (a family they include in the DC macrofamily) nouns are similar in this regard and even reconstruct the possessive prefixes 1Sg. (Bur a-, Ket ab-, PDC *aƞa) and 2Sg. (Bur gu-, Ket ug-, PDC *uxGu-). They continue:
This type of construction is totally alien to IE patterns, as is the enormous number of different plural suffixes: about 70, as noted by Č (p. 23). So is the multiple class system of Bur., which is far more similar to class systems in Caucasian and Yeniseian than to gender in PIE.
B&B then show the Bur personal pronouns for 1st and 2nd person; here the only pronouns resembling IE are indeed the 1st person plural pronouns featuring a stem mi- / me-. They write:
Here we see that the Bur system is suppletive, with different sets for direct forms and oblique forms, in both first and second person. Č (p. 72) attempts to connect Bur je, já with PIE *H1eĝ(H)- but he can do so only by violating the sound correspondence discussed above (PIE *ĝ, ĝh = Bur g, ġ)! He further tries to connect Bur un (~ um, uƞ ) (Etymolist note: direct form of the 2nd sg.) with PIE tuHxom, emphatic form of tuHx = tū-, but again by requiring another unprecedented change: t > d > 0!
For good measure, they adduce the pronoun systems of the neighbouring IE languages (Dardic, Indo-Aryan, Iranian), which look very different from Bur.
Interrogative pronouns As stated correctly by Č (p. 74), Bur interrogative pronouns are built on bases containing the labials /m/ and /b/. Č connects the Bur interrogatives with the rare IE interrogation stem *me/o-, attested only in Anatolian, Tocharian, and Celtic.
That actually doesn’t look too bad to me – a feature attested only in these three languages has a good chance of being old.
We must point out, however, that the *mV- interrogative is much more richly attested in DC than in IE, and furthermore the m ~ b alternation is attested in DC, but not in IE:
(Examples from DC follow)
Verb In the verb the Bur variance from IE is just as pronounced as in the noun. The “typological similarity” claimed by Č (p. 75) is only in regard to vaguely similar systems of aspects and tenses, without any material parallels pointing to common genetic origin. The verbal endings (Č, pp. 75-77) are similar only in that both Bur and IE have endings containing n and m, though there are no real correspondences between them. Most striking is the existence of the Bur template verbal morphology with as many as four prefix positions preceding the verb stem.
(Table 14 on p. 48)
It is well known that Proto-IE had few verbal prefixes. The Bur prefixal template is far more compatible with languages such as those of the Yeniseian family, especially the well-documented verbal morphology of Ket, and of the extinct Kott; Basque, Caucasian (especially West Caucasian), and Na-Dene also seem to preserve distinctive features (multiple noun classes, polysynthesis, extensive verbal prefixing of pronominal and valence-changing grammemes) of the postulated Dene-Caucasian proto-language: …
Again, I’m not really able to judge the DC evidence, but even at first glance it looks much more similar to Bur than what Časule seems to adduce for a Burushaski-PIE relationship. At least, anyone who judges the DC evidence as too weak would have even more reason to reject Časule’s hypothesis. Next B&B look at the numerals. They show the Burushaski system, which doesn’t look much like an IE-derived system. Only the number “one” is a possible candidate for an IE link:
Now as to Č’s proposed material correspondences between Bur and IE numerals: the first, comparing PIE H1oi-no-s “one” with Bur hen / hin (class I, II) ~ han (class II, IV) ~ hek / hik (counting form) “one” is almost plausible, except that the form is characteristic of Western IE (Italic, Celtic, Germanic, Balto-Slavic), while forms with different suffixes H1oi-ko-s and H1oi-wo-s gave rise to the Indic and Iranian words for “one” shown above.
This looks indeed plausible, and it isn’t unthinkable that an IE language with Western characteristics would show up in NW India. After all, before the discovery of Tocharian, no IE scholar would have assumed that a centum language would be found in the Tarim Basin. But one numeral looking similar could be chance. And the word for “1” seems to be the only plausible candidate among the numerals:
For Bur *alto “two” Č suggests comparison with IE *H2al- “other” + ordinal suffix *-to-, in spite of the fact that this is not an ordinal but a cardinal number, and that the “suffix” -to- appears nowhere else in the Bur numerals.
B&B argue that “Bur /lt/ is a distinctive cluster that can be traced back to PDC lateral affricates” and compare alto to numerals in various DC languages. Whatever the merit of these comparisons, Č’s proposal is certainly weak.
Next, Č attempts to derive Bur altámbo “8” from PIE *oḱtō(u) “8”, “with a change of ak > al under the influence of the Bur numerals for 2 and 4” (p. 75). In view of the holistic relationship of the Bur words for 2, 2², and 2³, …, it seems unlikely to us that all the other IE numerals would be discarded and only “8” retained, with this odd change.
(Here B&B refer back to an earlier discussion where they showed how the numerals 2, 4, and 8 build on each other in Burushaski.)
Finally Č (p.75) tries to connect Bur hunti “9” with PIE *H1newṇ “9”, “with dissimilation”, presumably to eliminate the first nasal.
B&B argue that the numeral can be etymologized internally in Burushaski as “one from ten” and that the Burushaski numeral system has features that link it with DC languages. In any case, IMO it would be strange if Burushaski had replaced its IE numerals from 3 to 7 and kept 8 and 9; normally, the lower numerals are more stable than the higher ones. In a final chapter B&B look at the lexicon, showing that almost none of the basic lexicon of Burushaski looks plausibly IE and arguing that many items can be linked to DC. As they don’t refer to any etymological proposals by Časule here, it seems that he hasn’t made any in the area of the basic lexicon, although it is in this area that one needs to look for evidence of genetic relationship, as the basic lexicon is normally the most stable. In total, from B&B’s argumentation it appears that Č totally ignores the phonological, morphological, and other systems of Burushaski; that he just picks elements that look IE, and even for those needs to assume many ad-hoc phonological developments and far-fetched semantic developments. Some of these might be acceptable if there otherwise were a solid bedrock of systematic relationships – after all, even in clearly IE languages there are phonological irregularities and strange semantic shifts -, but they are not a sufficient foundation to prove a genetic relationship between Bur and IE.