
Anthropology: Origin of the World’s Languages

Language has been a cornerstone of cultural identity in most civilizations throughout human history. This article gives an overview of the many languages spoken from antiquity to the present day, so that the reader has a general outline of language and how it has evolved. It also aims to show how languages might be traced to a common ancestral, or root, language.

New insights into origins of the world’s languages

Linguists from the Linguistic Society of America have long thought that languages from English to Greek to Hindi, known as the “Indo-European languages”, descend from a common ancestral language spoken thousands of years ago.

Linguists from the University of California, Berkeley, used data from over 150 languages and theorized that this ancestral language originated about 5,500 to 6,500 years ago on the Pontic-Caspian steppe, which stretches from Moldova and Ukraine to Russia and western Kazakhstan.

Significance in anthropology. Indo-European expansions and reasons for cultural diversity.
Indo-European expansions, based on Anthony (2007), Nordqvist & Heyd (2020)

Linguists were even able to classify the Indo-European languages into a tree.

Anthropology: Indo-European Phylogenetic Tree.
Credit: Mandrak

An important article in linguistics, “Ancestry-constrained phylogenetic analysis supports the Indo-European steppe hypothesis” by Will Chang et al., proposes that Indo-European languages first spread with cultural developments in animal breeding and farming around 4500–3500 BCE. The other common theory is that they diffused earlier, around 7500–6000 BCE, in Anatolia in modern-day Turkey.

Chang et al. examined over 200 sets of words from past and present Indo-European languages and used statistical modeling to determine how these words changed over time. They found that the rate of change suggests the languages diverged around 6,500 years ago, which supports the steppe hypothesis.
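To make the link between a rate of change and a divergence date concrete, here is a minimal Python sketch. It does not reproduce the Bayesian phylogenetic analysis of Chang et al.; it swaps in the much older and cruder glottochronology formula, t = ln(c) / (2 ln(r)), where c is the fraction of shared cognates between two languages and r is the fraction of core vocabulary a language is assumed to retain per thousand years. The function name and the numbers in the usage line are illustrative assumptions, not values from the study.

```python
import math


def divergence_time_millennia(shared_fraction, retention_rate):
    """Estimate time since two languages split, in thousands of years.

    Uses the classic glottochronology formula t = ln(c) / (2 * ln(r)).
    Both arguments are assumptions supplied by the caller:
      shared_fraction: share of a word list that is still cognate (0 < c <= 1)
      retention_rate:  share of the list retained per millennium (0 < r < 1)
    """
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0 < retention_rate < 1:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(shared_fraction) / (2 * math.log(retention_rate))


if __name__ == "__main__":
    # Illustrative numbers only: 64% shared cognates with an assumed 80%
    # retention per millennium gives roughly one millennium of separation.
    print(divergence_time_millennia(0.64, 0.8))
```

The factor of 2 reflects that both languages change independently after the split. Modern studies prefer model-based methods because a single constant retention rate is a strong assumption.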

With further advances in computational phylogenetic methods, a more recent article, “Indo-European phylogenetics with R”, used R for statistical modeling. R is a statistical programming language built on the S language (Wickham 2014). R offers several advantages, the foremost of which is that it is free, general-purpose software. It has over 4,000 libraries, including a wide array of packages for phylogenetic analysis. The article describes the strengths and weaknesses of new computational methods for inferring the phylogenetic tree.
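As a rough idea of the kind of input such methods start from, the sketch below (in Python rather than R, and without any phylogenetics package) codes each language as a list of cognate-class labels, one per meaning, and computes pairwise lexical distances. The language names, labels, and function names are hypothetical toy values chosen for illustration; they are not data from the article.

```python
from itertools import combinations

# Hypothetical toy data: one cognate-class label per meaning slot.
# Two languages share a label when their words for that meaning are cognate.
cognate_classes = {
    "lang_a": ["c1", "c1", "c1", "c1"],
    "lang_b": ["c1", "c1", "c1", "c2"],
    "lang_c": ["c1", "c2", "c2", "c3"],
}


def lexical_distance(x, y):
    """Fraction of meaning slots where two languages are not cognate."""
    if len(x) != len(y):
        raise ValueError("both languages must cover the same meanings")
    mismatches = sum(1 for a, b in zip(x, y) if a != b)
    return mismatches / len(x)


def pairwise_distances(data):
    """Distance for every unordered pair of languages, keyed by name pair."""
    return {
        (p, q): lexical_distance(data[p], data[q])
        for p, q in combinations(sorted(data), 2)
    }


def closest_pair(data):
    """The pair a distance-based tree builder would join first."""
    distances = pairwise_distances(data)
    return min(distances, key=distances.get)


if __name__ == "__main__":
    for pair, dist in pairwise_distances(cognate_classes).items():
        print(pair, dist)
    print("closest:", closest_pair(cognate_classes))
```

Distance-based tree building repeats the "join the closest pair" step until one tree remains. The Bayesian methods discussed in the article work with the cognate data more directly and also estimate dates.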

Significance in anthropology. Indo-European Linguistics.
Nuclear IE star phylogeny
Citation: Indo-European Linguistics 8, 1 (2020) ; 10.1163/22125892-20201000

Furthermore, linguists distinguish between “dead languages” and “living languages”. A dead language is one that is no longer the native language of any community, even if it is still in use. Latin is a good example.

What is the oldest dead language on Earth?

The oldest dead language that we know of is Sumerian, dated to about 3500 BC on the evidence of the Kish tablet.

Limestone tablet engraved with pictographic writing. It comes from the Mesopotamian city of Kish (Iraq) and is dated to 3500 BC. It is drawn at approximately real size. It is probably the earliest known evidence of writing, and contains pictographs of heads, feet, hands, numbers and threshing-boards. Department of Antiquities, Ashmolean Museum, Oxford (United Kingdom).

“…The Kish tablet is inscribed with proto-cuneiform signs. It has been dated to ca. 3500 BC, although some scholars believe it may be from somewhat later – Uruk IV period (ca. 3350–3200 BC). Several thousands of proto-cuneiform documents dating to Uruk IV and III periods (ca. 3350–3000 BC) have been found in Uruk. It is considered the world’s oldest known written document.”

The Kish tablet is now in the collection of the Ashmolean Museum. Sumerian is often considered to be the world’s first written language.

Other very old languages that are now extinct are:

  • Hurrian – oldest proof of written Hurrian dates back to the 21st century BC;
  • Palaic – attested in cuneiform tablets in Bronze Age Hattusa – circa the 16th century BC;
  • Egyptian – its earliest known complete written sentence has been dated to about 2690 BC;
  • Akkadian – first attested texts from around the mid-3rd-millennium BC;
  • Elamite – the earliest Elamite writings use a pictographic script and date from the middle of the 3rd millennium BC;
  • Hittite – the oldest known text in the Hittite language was written by Anitta, a king who reigned in the 17th century BC;
  • Mycenaean Greek – the most ancient attested form of the Greek language with the earliest writings dating back to 1450 BC.

10 oldest living languages in the world

Sanskrit (ca. 3500 years old)

First attested: 2nd millennium BC

Spoken in: India

Current number of speakers: 5 million

Although many believe Sanskrit to be an extinct language, 24,800 people have registered Sanskrit as their mother tongue.

Greek (ca. 3400 years old)

First attested: 1450 BC.

Spoken in: Greece

Current number of speakers: 13 million

Greek was also mentioned in the “extinct languages” category because Mycenaean Greek is the precursor of Modern Greek. However, since its roots lie in Mycenaean Greek, Greek is indeed one of the oldest living languages in the world, at roughly 3,400 years old.

The Greek language holds an important place in history due to its rich literature, which includes works like the Iliad and the Odyssey. Furthermore, many philosophical dialogues and writings, such as the works of Aristotle and Plato, were written in Greek.

Coptic Egyptian (ca. 2200 years old)

First attested: 2nd century BC

Spoken in: Egypt

Current number of speakers: unknown

Sometime in the 2nd century BC, Egyptian began to be written in the Coptic alphabet (an adaptation of the Greek alphabet).

Hebrew (ca. 3000 years old)

First attested: 10th century BCE

Spoken in: Israel

Current number of speakers: 9.3 million

The earliest known precursor to Hebrew is the Khirbet Qeiyafa inscription in Ancient Hebrew discovered in 2007, near the Israeli city of Beit Shemesh, 30 km from Jerusalem.

Hebrew ceased to be an everyday spoken language somewhere between 200 and 400 AD. Then, it continued to be used throughout the medieval period as the language of Jewish liturgy, rabbinic literature and poetry. It was revived as a spoken and literary language with the rise of Zionism, becoming the main language of the Jewish community and subsequently of the State of Israel.

Chinese (ca. 3200 years old)

First attested: 1250 BC

Spoken in: China

Current number of speakers: 1.3 billion

Old Chinese is the oldest attested stage of Chinese and the ancestor of all modern varieties of Chinese. The earliest examples of Chinese are divinatory inscriptions on oracle bones from around 1250 BC, during the late Shang dynasty.

Aramaic (ca. 3100 years old)

First attested: 11th century BC

Spoken in: Middle East and Western Asia

Current number of speakers: 2 million

During its approximately 3,100 years of history, Aramaic has served as a language of divine worship and religious study, administration of empires and as the mother tongue of a number of Semitic people from the Near East.

Historically, Aramaic was the language of the Arameans, the Semitic-speaking people from the region between the northern Levant and the northern Tigris valley.

Arabic (ca. 2800 years old)

First attested: early 1st millennium BC

Spoken in: there are 25 countries that have Arabic as an official or co-official language

Current number of speakers: 335 million

Old Arabic is the ancestor of the Arabic language, and its earliest inscription is believed to be a prayer to the three gods of the Transjordanian Canaanite kingdoms, dated to the early 1st millennium BC. Arabic has also influenced other languages, such as the Mediterranean Lingua Franca, or Sabir.

Farsi (ca. 2500 years old)

First attested: 522 – 486 BC

Spoken in: Iran, Tajikistan, Uzbekistan, Iraq, Russia, Azerbaijan and Afghanistan

Current number of speakers: 65 million

The ancestor of Farsi or Persian is Old Persian, a language that is first attested in the inscriptions of Darius I who ruled between 522 and 486 BC.

Examples of Old Persian have been found in what is now Iran, Romania, Armenia, Bahrain, Iraq, Turkey and Egypt. However, the most important attestation by far is the Behistun Inscription which is a multilingual inscription that was crucial to the decipherment of cuneiform script because it includes three versions of the same text, written in Old Persian, Elamite, and Babylonian (a variety of Akkadian).

Tamil (over 2500 years old) – oldest living language in India

First attested: widely debated; proposals range between 5320 BC and the 8th century CE

Spoken in: India

Current number of speakers: 83 million

The earliest Tamil writing is attested in inscriptions and potsherds from the 5th century BC. However, with the discovery of Tolkāppiyam, the most ancient Tamil grammar text and the oldest surviving work of Tamil literature, scholars began to debate the true age of Tamil. The author of Tolkāppiyam often mentions “they say so” (or something similar) indicating a rich grammar and literature tradition even before him. Naturally, linguists began to wonder whether we should be dating the Tamil language at least a couple of thousand years before Tolkāppiyam.

Unfortunately, at the moment there is no archaeological evidence to support this claim, so experts stick to the original findings.

Irish Gaelic (1500 years old)

First attested: 4th century AD

Spoken in: Ireland

Current number of speakers: 1.2 million users

The earliest Irish Gaelic writings date to the 4th century AD, in the form of linear Ogham inscriptions, in a stage of the language known as Primitive Irish. Primitive Irish then transitioned into Old Irish through the 5th century. During this time, the Irish language absorbed some Latin words, and by the 10th century it had evolved once again into Middle Irish, which was spoken throughout Ireland and in Scotland and the Isle of Man.

Starting in the 12th century, Middle Irish began to develop into Modern Irish in Ireland, Scottish Gaelic in Scotland, and the Manx language in the Isle of Man.

Newest Languages in the World

9. Afrikaans (340 years)

Afrikaans is considered to be the newest national language. When South Africa became independent in May 1910, Afrikaans, along with ten other languages, became one of South Africa’s national languages.

Afrikaans began when the Dutch (the Boers) first settled in South Africa. Over time, the language evolved separately from European Dutch. Afrikaans began to drop letters off the end of Dutch words and to simplify the words and their spellings.

Afrikaans became a separate language from Dutch around 340 years ago (roughly 1680) after the Dutch first settled South Africa in 1600.

Today, Afrikaans is spoken by 17.5 million people. Afrikaans currently has 7.2 million native speakers and 10.3 million people speak Afrikaans as a second language. Afrikaans is considered to be one of the easiest languages.

8. Esperanto (133 years)

Esperanto is perhaps the most famous language on this list. Esperanto was designed to be the world’s primary language of commerce.

In 1887, Polish linguist L. L. Zamenhof devised a language that would take parts of many major European languages. This language would borrow both grammar and vocabulary from Romance, Germanic and Slavic languages.

Zamenhof designed the language to be as easy to understand for the masses as possible. He did this so that everyone would prefer to learn this language, over French, Spanish and English that were used at the time.

Today, Esperanto is spoken by 2 million people. Whilst no internationally recognized country has Esperanto as its official language, Esperanto does have its own native speakers.

7. Lingala (120 years)

Lingala is perhaps one of the newest languages on the planet. In 1900, Lingala didn’t exist. Nobody could speak Lingala, nobody recognized it as a language and no one wanted to learn it.

Before the invention of the Congo Free State, there was a group of Bantu tribes around the Congo River. These tribes needed to band together in order to fight the Belgians.

These tribes combined their languages, and named their language after themselves in their language. They named themselves ‘river people’ as what defined them was the river they depended on for survival.

As such, they called themselves the ‘Bangala’. To the Belgians and other foreign explorers, this sounded a lot like “Langala” and then “Lingala”. Thus, they became known as the Lingala.

Lingala is spoken by 40 million people: 15 million native speakers and 25 million non-native speakers. Most of these people are in the Democratic Republic of the Congo and the Republic of the Congo.

There is also a minority of Lingala speakers in Angola and in certain parts of the Central African Republic.

6. Ido (113 years)

Following in the steps of Esperanto, Ido too is a constructed language. Ido is derived from Reformed Esperanto. Just as with Esperanto, Ido is one of the newest languages on the planet.

Reformed Esperanto came into existence in 1894, after L. L. Zamenhof (the founder of Esperanto) heard the complaints about various parts of Esperanto. As a result, he developed Reformed Esperanto which would address them.

Zamenhof soon began to teach this variant of Esperanto just as he had the previous variant before.

Over time, Reformed Esperanto began to develop away from the Standard Esperanto. This eventually developed into its own language, which was renamed as Ido.

Today, Ido has around 500 speakers. Currently, none of these are native speakers. There are several websites and online forums that are solely in Ido. They’re usually used to help people practice their Ido skills.

5. Albanian (108 years)

Albanian is perhaps a surprise on this list. Albanian isn’t one of the newest languages in the way that Ido or Esperanto is. Albanian has been spoken by ethnic Albanians for centuries.

This naturally raises the question: how is Albanian one of the newest languages? Albanian only became a recognized language in 1912, when Albania became independent.

Although it had been officially called a minority language when Albania was a part of the Ottoman Empire, there were many strings attached to it. Albanian was often considered to be a dialect of Serbian (or even German!) under Ottoman rule.

However, the unsuccessful uprisings in 1910 and 1911 forced the Ottomans to recognize the language as official and separate from Serbian and German (despite the fact that neither language is similar to Albanian!).

Currently, 7.6 million people speak Albanian, mostly in Albania and Kosovo. However, small groups of Albanian speakers are also present in the UK, US, Serbia, Montenegro and Romania.

4. Sona (85 years)

Sona too is perhaps one of the newest languages on the planet. Sona was constructed to be what Esperanto couldn’t be. Sona was designed to be the language that the whole world used.

Sona was designed by Kenneth Searight in 1935. Searight constructed a language that he believed could become the international auxiliary language.

His book, which he released in 1935, detailed all of the words and grammar of Sona. Sona is constructed based on English, Arabic, Turkish, Chinese and Japanese.

Sona was constructed to counteract the Eurocentrism of Esperanto and Ido, which were both made up of solely European languages.

Sona is currently spoken by roughly 100 people. Just as with other constructed languages, no internationally recognized country uses Sona as an official language.

Today, most Sona speakers are spread throughout the world, and use the internet to communicate in Sona.

3. Israeli Hebrew (72 years)

Israeli Hebrew is very similar to Albanian in terms of its age. Hebrew is in fact one of the oldest languages in the world; however, the ancient language is very different from the Hebrew that is spoken today.

The Hebrew spoken 2,000 years ago is almost unintelligible to those who speak the Hebrew of today. Hebrew is famous for having gone extinct several times, but it has always been revived.

With these revivals has often come a remake in the language. Israeli Hebrew is the most recent revival of the Hebrew language, and has greatly modernized the language.

Biblical Hebrew had only 8,000 words in the entire language; Israeli Hebrew added almost 12,000 more. Israeli Hebrew borrowed many of these words from German, French and English.

Today, Israeli Hebrew is the most prevalent form of Hebrew, spoken by 7 million people.

2. Guniyandi (38 years)

Guniyandi, sometimes written as Gooniyandi, is an Australian aboriginal language. Unlike many aboriginal languages, Guniyandi is actually one of the youngest and newest languages on the planet.

It was developed from Bunuba (another aboriginal language) but with strong influences from Kriol (another aboriginal language). This language was meant to become the language that the Bunuba, Kriol and other local tribes could use.

Originally, this was meant to be a sort of lingua franca: a language the tribes used only for trade and the like. However, it has since grown into a language that will replace their native languages, just not in the way that English is replacing so many aboriginal and other native languages.

Guniyandi is currently spoken by only 100 people in the north of Western Australia, all of whom are under the age of 38. However, due to its age, it is already endangered and runs the risk of going extinct.

1. Light Warlpiri (35 years)

Light Warlpiri is an attempt to make one of the oldest languages into one of the newest languages!

In the late 1980s and early 1990s, linguists began to try to combine the Traditional Warlpiri and Kriol languages, which were both at risk of going extinct.

As a result, linguists began to teach Warlpiri and Kriol children this hybrid language. However, the linguists began to realize that a late 20th and early 21st century society needed certain vocabulary in order to work.

This vocabulary existed in neither the Warlpiri nor the Kriol language. As a result, Light Warlpiri needed to borrow vocabulary from Standard Australian English.

Light Warlpiri is spoken by 350 people in the Northern Territory (central and central northern regions of Australia). The oldest native speaker of the language is only 34 years old as of the time of writing!

In fact, most of the people who speak Light Warlpiri are under the age of 40.

The top most spoken languages in the world

1. English (1,132 million speakers)

Language family: Germanic, a sub-family of Indo-European

Related to: German, Dutch, Frisian

Fun fact: The English word “goodbye” was originally a contraction of “God be with ye”.

With over 1,130 million speakers, English is the most spoken language in the world.

It’s also the official language of the sky – all pilots have to speak and identify themselves in English.

2. Mandarin Chinese (1,117 million speakers)

Language family: Sino-Tibetan

Related to: Cantonese, Tibetan, Burmese

Fun fact: Research suggests that you’ll only need around 2,500 characters to be able to read almost 98 percent of everyday written Chinese.

In terms of native speakers alone, Mandarin Chinese is by far the most spoken language in the world; by total speakers it ranks second.

It’s an official language of mainland China, Taiwan and Singapore and one of the six official languages of the United Nations. So it’s not surprising that it has more than 1.1 billion speakers worldwide.

Mandarin is a tonal language, which means that the meaning of a word changes based on the pitch with which it is pronounced.

With a set of about 50,000 characters, it is probably one of the most complex languages to learn.

3. Hindi (615 million speakers)

Language family: Indo-Aryan, a sub-family of Indo-European

Related to: Bengali, Punjabi, Marathi, Kashmiri, Nepali

There are about 615 million Hindi speakers, which makes it the third most spoken language in the world. It’s an official language of India, and is also spoken in countries such as Nepal, Fiji, Mauritius and Guyana.

Hindi is highly influenced by Sanskrit and named after the Persian word hind, which means – quite literally – “Land of the Indus river”.

4. Spanish (534 million speakers)

Language family: Romance, a sub-family of Indo-European

Related to: French, Italian, Portuguese, Romanian

Twenty-two countries across four continents have Spanish as their official language or one of their official languages, and it’s already the second most studied language in the world.

5. French (280 million speakers)

Language family: Romance

Related to: Spanish, Italian, Portuguese, Romanian

Spoken across different parts of the world – think everywhere from France and parts of Canada to a handful of African countries, including Senegal and Madagascar – the French language has spread its roots far and wide.

6. Arabic (274 million speakers)

Language family: Semitic, a sub-family of Afro-Asiatic

Related to: Hebrew, Amharic, Aramaic

With 295 million native speakers, Arabic is the sixth most spoken language in the world, and the only one in our top twelve that is written from right to left.

7. Bangla/Bengali (265 million speakers)

Language family: Indo-Aryan, a sub-family of Indo-European

Related to: Hindi, Punjabi, Marathi, Kashmiri, Nepali

Bengali, known to many English speakers around the world as Bangla, is mostly spoken in Bangladesh and India and is considered by some to be the second most beautiful language after French.

With around 205 million native speakers, it’s the seventh most spoken language in the world.

The Bengali alphabet is particularly interesting.

Every consonant has a vowel sound built in, which is quite unusual for Westerners.

8. Russian (258 million speakers)

Language family: East Slavic, a sub-family of Indo-European

Related to: Ukrainian, Belarusian

Russian, the eighth most spoken language in the world, is also one of the most widely spread, with around 155 million native speakers living across the world.

While Russian grammar is reputed to be a little tricky, Russian has only about 200,000 words (English has roughly one million), which is why most of them have more than one meaning.

9. Portuguese (234 million speakers)

Language family: Romance, a sub-branch of Indo-European

Related to: Spanish, French, Italian, Romanian

Portuguese is rooted in the region of Medieval Galicia (which was partly in the north of Portugal and partly in the northwest of Spain), but only five percent of the 215 million native Portuguese speakers actually live in Portugal.

It’s the official language of Brazil, and also has the sole official status in: Angola, Mozambique, Guinea-Bissau, East Timor, Equatorial Guinea, Macau, Cape Verde, and São Tomé and Príncipe.

10. Indonesian (199 million speakers)

Language family: Austronesian

Related to: Malay, Javanese, Sundanese, Madurese etc.

A standardised variation of Malay, an Austronesian language that’s the official language of Malaysia, Indonesian is a great example of a widely spoken language that encompasses a number of distinct dialects across Indonesia.

11. Urdu (170 million speakers)  

Language family: Indo-Aryan

Related to: Hindi, Bengali, Marathi, Kashmiri, Nepali

12. German (132 million speakers)

Language family: West Germanic, a sub-family of Indo-European

Related to: English, Frisian, Dutch

Fun fact: German is known for its seemingly endless sentences.

Often referred to as the language of writers and thinkers, German has just over 100 million native – and just under 32 million non-native – speakers worldwide, and is the most spoken language in the European Union.

It’s an official language of Germany, Austria, Liechtenstein, Switzerland and Luxembourg.
