Indo-European Language Family
Indo-European is a family of languages that first spread throughout Europe and many parts of South Asia, and later to every corner of the globe as a result of colonization. The term Indo-European is essentially geographical: it names the family’s range from its easternmost extension in the Indian subcontinent to its westernmost reach in Europe. The family includes most of the languages of Europe, as well as many languages of Southwest, Central, and South Asia. With over 2.6 billion speakers (about 45% of the world’s population), the Indo-European language family has the largest number of speakers of any language family as well as the widest dispersion around the world.
The cradle of the Indo-Europeans may never be known, but an ongoing scholarly debate about the original homeland of Proto-Indo-European (PIE) may some day shed light on the ancestor of all Indo-European languages as well as on the people who spoke it. There are two schools of thought:
- Some scholars (e.g., Marija Gimbutas) propose that PIE originated in the steppes north of the Black and Caspian Seas (the Kurgan hypothesis). Kurgan is the Russian word of Turkic origin for a type of burial mound over a burial chamber. The Kurgan hypothesis combines archaeology with linguistics to trace the diffusion of kurgans from the steppes into southeastern Europe, providing support for the existence of a Kurgan culture that reflected an early presence of Indo-European people in the steppes and southeastern Europe from the 5th to the 3rd millennium BC.
- Other scholars (e.g., Gamkrelidze and Ivanov) suggest that PIE originated around 7,000 BC in Anatolia, a stretch of land that lies between the Black and Mediterranean seas. It lies across the Aegean Sea to the east of Greece and is thus usually known by its Greek name Anatolia (Asia Minor). Today, Anatolia is the Asian part of modern Turkey.
It would not have been possible to establish the existence of the Indo-European language family if scholars had not compared the systematically recurring resemblances between European languages and Sanskrit, the oldest language of the Indian subcontinent to have left extensive written records. The common origin of European languages and Sanskrit was first proposed by Sir William Jones (1746-1794). Systematic comparisons between these languages by Franz Bopp supported this theory and laid the foundation for postulating that all Indo-European languages descended from a common ancestor, Proto-Indo-European (PIE), thought to have been spoken before 3,000 BC. PIE then split into different branches which, in turn, split into different languages over the subsequent millennia.
Since PIE left no written records, historical linguists construct family trees, an idea pioneered by August Schleicher, on the basis of the comparative method. The comparative method takes features shared among languages and applies systematic procedures to establish their common ancestry. It is not the only method available, but it is the one that has been most widely used. The examples below show how this method works with some Indo-European languages.
Branch | Development from PIE *dekm |
---|---|
Germanic | Proto-Germanic *texun > Old English teon > Modern English ten |
Italic | Proto-Italic *dekem > Latin decem > Modern Italian dieci |
Slavic | Old Church Slavonic desenti > Modern Bulgarian deset |
Indo-Aryan | Sanskrit dáça > Hindi/Urdu das |
Greek | Greek deka |
- proto comes from the Greek word for ‘first’; it marks the earliest, reconstructed stage of a language or language group
- * means the form was reconstructed, not attested.
- > means ‘became’
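One early step of the comparative method, lining up cognates and grouping them by the sounds that correspond, can be sketched in Python. This is a minimal illustration rather than a real reconstruction procedure: the forms are the words for ‘ten’ from the examples above, and the names `COGNATES_TEN` and `initial_correspondence` are hypothetical helpers introduced only for this sketch.

```python
from collections import defaultdict

# Words for 'ten' taken from the examples above (illustrative data only).
COGNATES_TEN = {
    "Latin": "decem",
    "Greek": "deka",
    "Sanskrit": "dáça",
    "Old Church Slavonic": "desenti",
    "Modern English": "ten",
}


def initial_correspondence(cognates):
    """Group languages by the first sound (here: first letter) of each cognate."""
    groups = defaultdict(list)
    for language, form in cognates.items():
        groups[form[0]].append(language)
    return dict(groups)


correspondence = initial_correspondence(COGNATES_TEN)
for sound, languages in correspondence.items():
    print(sound, "-", ", ".join(languages))
```

In this toy data set the Germanic form stands apart with t where the other branches show d. A pattern like this counts as evidence of common ancestry only when it recurs regularly across many cognate sets, which is what the comparative method checks.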
Indo-European languages are classified into 11 major groups, 2 of which are extinct, comprising 449 languages (Ethnologue).
Baltic
This conservative group has preserved many archaic features thought to have been present in PIE. Some scholars think that Baltic languages share a common ancestral language with the Slavic languages. This hypothetical language is called Balto-Slavic.
Language | Number of speakers | Where spoken primarily |
---|---|---|
Latvian | 1.5 million | Latvia |
Lithuanian | 3.1 million | Lithuania |
Celtic
Celtic languages were largely unknown until the modern period. They were once spread over Europe in the pre-Christian era. The oldest records of these languages date back to the 4th century AD.
Language | Number of speakers | Where spoken primarily |
---|---|---|
Breton | 533,000 | France |
Irish | 355,000 | Ireland |
Scottish (Scots Gaelic) | 62,175 | Scotland |
Welsh | 575,000 | Wales |
Germanic
West Germanic | ||
---|---|---|
Language | Number of speakers | Where spoken primarily |
Afrikaans | 6 million | South Africa |
Dutch | 17 million | Netherlands |
English | 309 million | UK, US, Australia, Canada |
German | 95 million | Germany |
Yiddish | 50,000 | Germany, Israel |
North Germanic | ||
---|---|---|
Language | Number of speakers | Where spoken primarily |
Danish | 5.3 million | Denmark |
Icelandic | 240,000 | Iceland |
Norwegian | 4.6 million | Norway |
Swedish | 8.8 million | Sweden |
Romance
Language | Number of speakers | Where spoken primarily |
---|---|---|
Catalan | 6.7 million | Spain |
French | 65 million | France |
Italian | 61.5 million | Italy |
Portuguese | 178 million | Portugal, Brazil |
Romanian | 23.5 million | Romania |
Spanish | 322 million | Spain, Latin America |
Slavic
West Slavic | ||
---|---|---|
Language | Number of speakers | Where spoken primarily |
Czech | 11.5 million | Czech Republic |
Polish | 43 million | Poland |
Slovak | 5 million | Slovakia |
Sorbian | 70,000 to 110,000 | Germany |
East Slavic | ||
---|---|---|
Language | Number of speakers | Where spoken primarily |
Belarusian | 9 million | Belarus |
Russian | 150 million L1 speakers | Russia |
Ukrainian | 37.1 million | Ukraine |
South Slavic | ||
---|---|---|
Language | Number of speakers | Where spoken primarily |
Bosnian | 4 million | Bosnia and Herzegovina |
Croatian | 6.2 million | Croatia |
Macedonian | 1.6 million | Macedonia |
Serbian | 11.1 million | Serbia |
Slovenian | 2 million | Slovenia |
Indo-Iranian
Indo-Aryan (Indic) | ||
---|---|---|
Language | Number of speakers | Where spoken primarily |
Bengali | 100 million 1st language; 211 million 1st & 2nd language speakers | Bangladesh |
Bhojpuri | 26.6 million | India |
Hindi | 180.8 million | India |
Gujarati | 46.1 million | India |
Kashmiri | 4.6 million | India |
Marathi | 68 million | India |
Nepali | 17.2 million | Nepal |
Maithili | 24.8 million | India |
Oriya | 31.7 million | India |
Punjabi | 60.8 million | India |
Romani | 1.5 million | Romania & elsewhere |
Sanskrit | 194,000 2nd language speakers | India & elsewhere |
Sindhi | 21.3 million | Pakistan |
Sinhalese | 13.2 million | Sri Lanka |
Urdu | 60.5 million | Pakistan |
Iranian | ||
---|---|---|
Language | Number of speakers | Where spoken primarily |
Balochi | 1.8 million | Pakistan |
Dari | 7.6 million | Afghanistan |
Farsi (Persian) | 24.3 million | Iran |
Kurdish | 11 million | Iraq & elsewhere |
Pashto | 19 million | Afghanistan & elsewhere |
Tajik | 4.3 million | Tajikistan |
Branch | Language | Number of speakers | Where spoken primarily |
---|---|---|---|
Albanian | Albanian | 5 million | Albania |
Armenian | Armenian | 6.7 million | Armenia |
Hellenic | Greek | 12.3 million | Greece |
Greek is the only surviving language of the Hellenic group.
Tocharian (extinct)
Attested by texts dating to 500-1000 AD that were found in the early 20th century in Chinese Turkestan.
Anatolian (extinct)
Unknown until the 20th century, when it was discovered during excavations in Turkey. Texts written in cuneiform date to the 13th-7th centuries BC.
In addition to these main groups, there are fragmentary records of other Indo-European languages. These records, mostly in the form of inscriptions, do not provide sufficient material for the reconstruction of PIE.
Dialects
Structure
Sound system
There have been numerous attempts to reconstruct the vowels and consonants of PIE, all of which have encountered serious problems because the written records are uneven and differ hugely in age. As a result, the reconstruction of PIE phonology continues to be a matter of scholarly debate and speculation. Among the most notable reconstructions are those by August Schleicher, Karl Brugmann, Winfred Lehmann, Oswald Szemerényi, and Jacob Grimm.
First Germanic Sound Shift (Grimm’s Law)
You probably know of Jacob Grimm as the author of fairy tales. But he was also one of the great linguists of the 19th century. He found evidence for the unity of all the modern Germanic languages in the phenomenon known as the First Germanic Sound Shift (also known as Grimm’s law), which set the Germanic branch apart from the other branches of the Indo-European family. This shift occurred before the 7th century, when records started to be kept. According to Grimm’s law, the sounds that appear as /p, t, k/ in the classical Indo-European languages (Latin, Greek, and Sanskrit) became /f, θ, h/ in Germanic languages, where /θ/ is the th sound of English three. For example, Latin pater corresponds to English father, Latin tres to English three, and Latin cornu to English horn.
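The correspondence described above can be sketched as a small lookup in Python. This is a simplified illustration that assumes the shift of the voiceless stops to f, th, and h, looks only at the first letter of each Latin word, and ignores exceptions and later sound changes. The names `GRIMM_SHIFT` and `shift_initial` are hypothetical and introduced only for this sketch.

```python
# Voiceless stops and the Germanic sounds they correspond to under
# Grimm's law ("th" stands for the sound written th in English "three").
GRIMM_SHIFT = {"p": "f", "t": "th", "k": "h"}


def shift_initial(latin_word):
    """Return the Germanic sound expected for the word's initial consonant."""
    first = latin_word[0].lower()
    # Latin spells the /k/ sound with the letter c.
    if first == "c":
        first = "k"
    # Sounds not covered by the mapping are returned unchanged.
    return GRIMM_SHIFT.get(first, first)


pairs = [("pater", "father"), ("tres", "three"), ("cornu", "horn")]
for latin, english in pairs:
    expected = shift_initial(latin)
    print(latin, "->", expected, "| matches", english + ":", english.startswith(expected))
```

Each of the three pairs matches because the English word begins with the sound predicted from the Latin one. Real cognate sets involve further conditions, so this lookup should be read as a memory aid for the pattern, not as a general rule engine.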
You can easily see the resemblances among four common words across four Indo-European languages.
English | Greek | Latin | Sanskrit |
---|---|---|---|
father | pater | pater | pita |
brother | phrater | frater | bhratar |
foot | poda | pedem | pada |
three | tris | tres | trí |
Centum-Satem division
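The Centum-Satem division describes how the PIE labiovelar, velar, and palatovelar consonants developed in the daughter languages. In satem languages the palatovelars became sibilant-like sounds, while in centum languages they merged with the plain velars.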
- Labiovelar consonants include [kw, gw, xw, ngw] which are pronounced like [k, g, x, ng] but with rounded lips.
- Velars are consonants articulated with the back part of the tongue (the dorsum) against the soft palate (the back part of the roof of the mouth, known also as the velum). They include [k, g, x, ng].
- Palatovelar consonants are articulated with the back part of the tongue against the hard palate. They include [k’, g’, x’, ng’]. For example, [k’] is pronounced as the k in keen.
The terms centum and satem come from the words for ‘one hundred’ in a representative language of each group (Latin centum and Avestan satem). Please note that not all languages fall neatly into these categories.
- Satem languages include Baltic, Slavic, Albanian, Armenian, and Indo-Iranian languages. For example, Sanskrit satam, Lithuanian simtas, Russian sto.
- Centum languages include Romance, Celtic, Germanic, and Greek. For example, Latin centum, Irish cead, English hundred, Greek hekaton.
Stress
It is believed that PIE had a pitch accent system. All words had only one accented syllable which received a high pitch. Stress could fall on any syllable of a word.
Grammar
The unevenness of existing records and the huge chronological gaps among Indo-European languages make the reconstruction of PIE grammar a difficult task. The discoveries of Hittite, Tocharian, and Mycenaean Greek in the 20th century changed the body of data on which the reconstruction of PIE is based, and this in turn has modified existing views of PIE.
Many of the older well-documented languages, such as Sanskrit, Greek, and Latin, have rich morphologies with clearly marked gender and number, as well as elaborately marked case systems for nouns, pronouns, and adjectives. Verbs in these languages also have elaborately marked systems of tense, aspect, mood, and voice, in addition to person, number, and gender. Reconstructed PIE is based on the assumption that it contained all the features found in attested languages. If a given language lacks a particular feature, it is assumed that the feature was lost or that it had merged with other features.
Indo-European languages reflect the rich morphology of PIE to various degrees. For instance, Sanskrit, Greek, Latin, Baltic, Slavic, Celtic, and Armenian have extremely rich morphologies. On the other hand, Germanic, Romance, Albanian, and Tocharian do not possess quite as many finely differentiated morphological features.
Nouns, pronouns and adjectives
- Case: Sanskrit had the most cases (8), followed by Old Church Slavonic, Lithuanian, and Old Armenian (7), Latin (6), and Greek, Old Irish, Albanian, and Germanic (5).
- Gender: The three genders (masculine, feminine, neuter) have survived in a number of Indo-European languages.
- Number: The three numbers (singular, dual, plural) survived in Sanskrit, Greek, and Old Irish. Vestiges of the dual number can be found in many other Indo-European languages.
- Adjective-noun agreement: Adjective-noun agreement has survived in many Indo-European languages.
Verbs
Reconstructed PIE verbs had different sets of endings for tense/aspect, voice, and mood, in addition to person and number:
- Tense and aspect: It is thought that the PIE verb system was aspect-based, although aspect has traditionally been confused with tense. Although tense was not formally marked in PIE, most Indo-European languages define their verbal systems in terms of tense rather than aspect.
- Voice: PIE had two voices: active (e.g., The child broke the glass) and medio-passive, which combined reflexive and passive voices (e.g., The child washed himself and The child was washed by his mother). In addition to the active voice, various Indo-European languages use the middle or the passive voices.
- Mood: It is hypothesized that PIE had four moods: indicative, optative, subjunctive, and imperative. Most of these moods exist in all Indo-European languages.
- Person and number: PIE verbs were marked for person (1st, 2nd, 3rd) and number (singular, dual, plural).
Word order
Less is known about the syntax of PIE than about its morphology. What is known about PIE word order, therefore, is a subject of conjecture and debate. It is thought likely that word order in PIE sentences was Subject-Object-Verb. This word order is found in Latin, Hittite, Vedic Sanskrit, Tocharian, and to some extent in Greek.
Vocabulary
The comparative method enables linguists to reconstruct a basic PIE vocabulary referring to many common elements of the speakers’ culture. This basic vocabulary is not uniformly attested across all Indo-European languages, which suggests that some words may have developed later or were borrowed from other languages. Among the words that are reliably reconstructed are words for day, night, the seasons, celestial bodies (sun, moon, stars), precipitation (rain, snow), animals (sheep, horse, pig, bear, dog, wolf, eagle), kinship terms (father, mother, brother, sister, son, daughter), and tools (axe, yoke, arrow).
Writing
The written records of the various Indo-European languages begin at very different dates. The table below shows when the first written records appeared, what writing system was used, and which writing systems the languages use today.
Branches | Earliest written records | Earliest writing system | Current writing system(s) |
---|---|---|---|
Armenian | 500 AD | Armenian alphabet | Armenian alphabet |
Albanian | 15th century AD | Greek alphabet | Modified Latin alphabet |
Greek | 1,400 BC | Linear B | Greek alphabet |
Celtic | 4th century AD | Ogham alphabet | Modified Latin alphabet |
Baltic | 16th century AD | Modified Latin alphabet | Modified Latin alphabet |
Romance | 6th century BC | Latin alphabet, adapted from Etruscan | Modified Latin alphabet |
Germanic | 3rd century AD | runic Futhark | Modified Latin alphabet |
Slavic | 9th century AD | Old Church Slavonic alphabet | Cyrillic and Latin alphabets |
Indo-Aryan | 3rd century BC | Brāhmī script | Bengali, Devanāgarī, Gujarati, Oriya, Gurmukhi, Sinhala, Kaithi, modified Perso-Arabic |
Iranian | 9th century AD | Perso-Arabic script | Modified Perso-Arabic, Arabic, modified Cyrillic, modified Latin |
Tocharian | 500-1,000 AD | Brāhmī script | |
Difficulty
Indo-European languages range from Category I to Category II in terms of difficulty for speakers of English.