Languages and Scripts of Central Asia from Mogao Library Cave and Xinjiang 


Above is an illustrated manuscript from the 9th to 10th century, containing the Sutra of the Buddhas' Names. This document is part of the Aurel Stein collection at the British Library.

This article is an edited version of some work I did at the department of languages and literatures at the University of Gothenburg in 2018. I have changed the text a bit, made it less academic, improved the language, added some sections and corrected some facts. The original title was The Languages of Central Asia and the Dunhuang Manuscripts. This article mainly covers manuscripts found in a cave known as the Library Cave, or Mogao Cave 17. The cave complex at Dunhuang is sometimes called the Caves of the Thousand Buddhas. The Dunhuang manuscripts were discovered by a poor monk in 1900, after they had been hidden there for nearly 900 years. In this version of the article I have removed most of the apparatus of notes for improved readability. I also include a few inscriptions from other places, e.g. ancient Persia, because I use them to explain certain points.

I use the name and concept "Silk Road", but the ancient overland trade route was more of a "paper road", a way of transmitting ideas. The "silk" in the name comes from the fact that silk was used as a currency: a certain amount of silk was used to define the prices of other goods. And "Road" in the singular is completely incorrect, since it was actually a network of roads, trails and routes. Furthermore, there was maritime trade as well, which stretched from East Africa and Arabia to the rivers and ports of China and South East Asia.

Peter Frankopan's excellent book The Silk Roads explains the immense diversity of the concept and gives the "Silk Road" a broader meaning and explanation. But Frankopan also uses the name as a framing thesis for the history of the world. Interestingly, he uses Persia as the center of the world, and the ancient Silk Road was basically managed by Sogdians. The Sogdians were in fact Persians living in the east at the time.

Mogao cave no. 16 with book scrolls from cave no. 17. Cave no. 17 was sealed sometime around 1035-1054 CE and opened in 1900.

Some of the main paths and sites of the Silk Road in Northwestern China. The border city of Dunhuang is marked with a red circle. 


In this article I introduce the linguistic diversity of the manuscripts of Dunhuang and the Tarim basin in Xinjiang in particular, but I also highlight the historical multicultural nature of the Silk Road in Northwest China and Central Asia in general. The period I cover here is basically Han, Tang and early Song (Chinese dynasties). In other words, we are talking about the first millennium of Christian chronology. To sort out some of the terminology I have used my limited knowledge of Sanskrit and Arabic, but my knowledge of Chinese is poor.

Central Asia and the Tarim Basin have been explored by western scholars since the late 19th century. Most famous are Aurel Stein (1862-1943) and Paul Pelliot (1878-1945), but there were also early Japanese, French, Russian, German and Swedish expeditions. The Swedish explorer Sven Hedin (1865-1952) was there from 1927 to 1935, and the material he collected is still in Stockholm, where it is known as the Sven Hedin Collection. Political circumstances in China soon halted almost all further investigations until the 1980s and 1990s. Civil war, Japanese occupation, the Second World War, the Communist revolution and the Cultural Revolution were the most important factors that hindered academic activity. Some later high-quality work on the area has been done by Susan Whitfield and Valerie Hansen, among many others.

For a more thorough background on Dunhuang, the Mogao Caves or the discovery of the Library Cave, check out English Wikipedia or read Valerie Hansen's books about it. The IDP or UNESCO websites are other options. You can also visit Dunhuang on your computer and take a virtual 3D stroll at the following address:

The collection of manuscripts discovered in the Library Cave in Dunhuang in 1900 is probably the single most important discovery of literature in Chinese archaeology so far (rivalled only by the oracle bones). The manuscripts give us an insight into the ethnic diversity of Northwest China, from the Han dynasty to around the year 1000 CE. The most famous document is the complete Diamond Sutra, which is the oldest known dated, printed and complete book in the world, as it is dated 11th of May, 868 CE.

The Diamond Sutra in Sanskrit is written वज्रच्छेदिकाप्रज्ञापारमितासूत्र, Vajracchedikā Prajñāpāramitā Sūtra. In English the correct translation from Sanskrit would be "The Diamond of Perfect Wisdom", alternatively "The Thunderbolt of Perfect Wisdom". But a "vajra" is also a kind of weapon used by the god Indra in Hinduism, and the word refers to something very hard. It is a complex word. The word "sutra" has the meaning "thread", because palm-leaf books had a thread that held the leaves together, but we use "sutra" as a name for "book" in modern languages.

A large part of the huge collection of manuscripts is scattered around the world, but thanks to the International Dunhuang Project (IDP), a large research project involving Chinese, British, French, Russian, American, Korean, German and Japanese scholars, they are now coming together again in digital form. So far, about 150,000 documents have been digitized. The project is headed by the British Library but has branches in seven more countries. There are large collections of manuscripts and artworks in Berlin, St Petersburg, Beijing, Kyoto and other places. Among the manuscripts, more than 15 different languages and scripts are represented (IDP in general covers more than 30 languages). Chinese is by far the most common language and script. The great majority of the texts from the Library Cave are Buddhist, but there are also Nestorian Christian, Manichean, Zoroastrian (two), Shamanic and secular texts. Only one Jewish-Hebrew text was found among the Mogao material. The Hebrew text might have been a talisman and was not necessarily brought by a Jew.

The physical nature of the material

The artefacts found in Cave 17 at Mogao, Dunhuang, are made of wood, bark, textile, leaves and paper. The textiles are either silk or hemp. The material comes in various techniques and bindings, like scrolls, concertina, butterfly, whirlwind and notebook. The reason for the relatively good condition of the material is the exceptionally dry climate of the Tarim area. Other prominent sites of archaeological discoveries in Xinjiang are, for example, the Qizil caves, Niya, Khotan and Kucha. In humid areas, like India or South East Asia, almost nothing except stone and metal inscriptions has survived. In India and Southeast Asia palm leaves were used as writing material, and most of them have deteriorated, but the ones brought to the Tarim have survived. The main bulk of the manuscripts of the Tarim and the Library Cave are handwritten, but there are also woodblock-printed books like the Diamond Sutra. The printers simply glued a sheet of paper with text to a piece of wood and then carved around the letters. In this way they created a negative that could be used for printing. But Dunhuang was not a printing centre like Sichuan was at the time.

The Diamond Sutra scroll

Above: The last picture shows a palm-leaf book, called lontar in many countries. (This one is not from Dunhuang.) The name "sutra" comes from the thread that holds the palm leaves together. In the Library Cave documents of this kind survived due to the dry climate, but almost everywhere else they have vanished. From Cambodia and Indonesia to northern India they have all rotted and deteriorated. Only inscriptions made in stone and metal still survive in humid places. Palm-leaf books are made from prepared leaves of the Borassus or Corypha palm trees.

Above: The famous Bower manuscript (400 CE) with Sanskrit and Prakrit written in late Brahmi script / early Gupta script on birch bark, found near Kucha, Tarim Basin. The Bower manuscript covers the field of ayurvedic medicine and other topics. This is one of the oldest known surviving Sanskrit documents in the world and it is actually a copy of a much older Indian text. The Bower manuscript is today at the Bodleian Library in Oxford, England. The oldest surviving Sanskrit document ever found comes from a cave in Afghanistan and was discovered in 1990.

Many completely unique documents have been found at Dunhuang and other places around the Tarim. Above is the oldest known star chart, or sky atlas, in the world. The star chart was drawn around 700 CE (Tang dynasty era). Among the manuscripts from Mogao cave 17 at Dunhuang we also find Confucian classics, Jewish prayers, recreational games, dictionaries and a lot more.

Above is a bilingual Chinese-Tibetan manuscript. To be able to read it in Tibetan you have to turn it 90 degrees. Tibetan script is an abugida (a kind of writing system) that is read from left to right. Chinese was written from top to bottom. According to IDP, this particular document was probably a working paper used to make preparations for a new Chinese translation of a Buddhist text.

Languages and scripts used in the Dunhuang material (other than Chinese)

Brahmi script: This is the mother of most South and Southeast Asian script systems, which are called abugidas. There is no consensus among scholars regarding the origins of Brahmi script, except for its geographic provenance, which was most likely northern India and the Ganges valley. Some favour the idea that it derives from the Indus valley script, i.e. an exclusively Indian origin. Others argue that it was developed through contacts with Mesopotamia. The most famous texts written in Brahmi are the Ashoka edicts. For more on Brahmi, google for example "Ashoka pillars" and rock edicts, or James Prinsep (1799-1840).

Above: Sanskrit in Brahmi script. It is actually the Buddhist Lotus sutra in what is called South Turkestan Brahmi script.

A Bactrian-Greek bilingual coin (not from Xinjiang) featuring the Bactrian king Agathokles (reigned 190-180 BCE). These coins with both Greek and Brahmi script helped in the decipherment of the Brahmi texts found on rocks and pillars left behind by the legendary Indian emperor Ashoka (304-232 BCE). Check out James Prinsep to learn about the decipherment of Brahmi.

Sanskrit: The classical language of India, usually written in Nagari script or its later form, Devanagari. The Sanskrit language is one of the oldest languages still in use and it is the liturgical language of Hinduism. Sanskrit, together with Hebrew, Greek and classical Chinese, is one of the four truly old classical languages still in use. Sanskrit is also the language of the early Mahayana Buddhist texts, from which they were translated into Chinese. In one house in Niya, a document written in hybrid Sanskrit-Gandharan has been found, containing fragments from the great Indian epic Mahabharata. In Kucha, one of the city states on the northern silk route in Xinjiang, there were in fact musical assemblies that performed singing in Sanskrit. Sanskrit may of course be written in any script, but traditionally scripts derived from Brahmi have been used: at first commonly Brahmi, later Nagari, and today Devanagari. Script systems derived from Brahmi include Khmer, Thai, Lao, Burmese, Javanese, Balinese and most Indian scripts.
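The idea of an abugida can be made concrete with a small sketch. It uses Python's standard `unicodedata` module to list the signs in the Devanagari spelling of "sūtra". Consonant letters come out as base letters, while the vowel sign and the vowel-cancelling virama come out as dependent marks attached to them. The function name `classify_signs` is my own, chosen only for this illustration.

```python
import unicodedata

def classify_signs(word):
    """Return (character, Unicode name, general category) for each code point."""
    return [(ch, unicodedata.name(ch), unicodedata.category(ch)) for ch in word]

# "sutra" in Devanagari, a modern descendant of Brahmi, given as explicit
# code points: SA, vowel sign UU, TA, virama, RA.
SUTRA = "\u0938\u0942\u0924\u094D\u0930"

for ch, name, cat in classify_signs(SUTRA):
    # Categories starting with "M" are marks that depend on a base letter.
    kind = "dependent mark" if cat.startswith("M") else "base letter"
    print(f"U+{ord(ch):04X}  {name:<28} {cat}  {kind}")
```

In an abjad such as Sogdian or Pahlavi script, by contrast, short vowels are mostly left unwritten, so no such systematic vowel marks would appear.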

Above: The Spitzer manuscript. Sanskrit in Brahmi script. This fragment, together with around 1000 more in this condition, was found in a cave in Kizil, Xinjiang. It was produced in the 2nd century and contains parts from the Mahabharata, the Arthashastra and the Manusmriti, all three of them classical Indian literature. The Mahabharata, "Great story of India", is the core of Indian literature. The Arthashastra is a book of encyclopaedic nature about politics and many other topics. The Manusmriti is a text concerning Hindu religious law.

Sogdian: This is an East Iranian language and the ancient capital of the Sogdian people was Samarkand in the Zeravshan valley. Sogdian is a Middle Iranian language together with Bactrian, Khotanese-Saka and Parthian. Sogdian played an important role on the Silk Road network and was the main language of trade, the lingua franca. A descendant of Sogdian is still spoken in the Yagnob valley in Tajikistan. Sogdian script is related to Aramaic script and, as in most Semitic languages, the orthographic system is an abjad, written from right to left. Sogdian is not a Semitic language, though. Sogdian was in use between 100 and 1200 CE. Sogdian script was adopted by the Uygurs in the 8th century. The Uygurs in their turn passed it on to the Mongols. When Sogdian script developed into Uygur and Mongolian script it came to be written from the top down, as Mongolian is written today.

Sogdian document. The earliest surviving page from the Avesta, the Zoroastrian "bible". 10th century CE.

Khotanese Saka: More than a hundred Khotanese documents were found in the Library Cave at Dunhuang. Saka is known in the west as Scythian, an Eastern Iranian language. The Saka homeland that is of interest in this context was the Khotan and Tumshuq kingdoms of the Tarim basin. But there was constant demographic and linguistic shift at this time.

The Khotanese animal zodiac written in Brahmi script

Karoshthi script: Karoshthi (also spelled Kharoshthi) was used in Gandhara, Bactria, Sogdia and Kushan, among other places along the Silk Road. It is an abugida but shows obvious inspiration from Aramaic script, the precursor of Arabic script, which is an abjad. Karoshthi was written from right to left, but later from left to right. It was in use from approximately the 3rd century BCE to the 3rd century CE. There are some places that used it until the 7th century CE (Khotan and Niya). Karoshthi was developed in Takshashila/Taxila, in what is present-day Pakistan, around 400 BCE. At that time Taxila had a university and was a town of great importance. It was visited by both Emperor Ashoka and Alexander the Great. During the Chinese Han period, one of the rulers of the southern route in Xinjiang issued a coin inscribed in both Karoshthi and Chinese. Karoshthi as the "lingua franca script" of the western part of Central Asia (Gandhara to Niya) was replaced by Brahmi around 400 CE.

Karoshthi script found at Yingpan, Tarim. Probably from around the 2nd century.

Gandhara was a Greco-Buddhist kingdom located in the area that is today Afghanistan and Pakistan. Gandhari was a Middle Indic language, sometimes classified as a "Prakrit", i.e. a vernacular. According to Hansen, refugees from Gandhara taught the inhabitants of Niya (Tarim, Xinjiang) to read and write in the 1st and 2nd centuries CE. The script was then Karoshthi. Kushan, not to be confused with Kucha in Xinjiang, was a tribal federation that dominated Northwest India during the first two centuries CE.

A Bactrian coin using both Greek and Karoshthi script. (Not from Xinjiang)

Bilingual Sino-Karoshthi coin found at Khotan, Tarim/Taklamakan.

Pahlavi Script

Pahlavi, or a version of Parthian script, developed from Aramaic and is categorized as a Middle Persian script (and language). Documents found in Turfan have a similar form of script that is named Psalter Pahlavi.

The development of the different Persian and Iranian scripts and dialects is rather complicated. Often the different dynasties are used to name them. The first well-known dynasty was that of the Achaemenids (550 BCE - 330 BCE), then came the Parthians, also known as the Arsacid empire (247 BCE - 224 CE). The last Persian empire before Islam was the Sassanian (224 CE - 651 CE). Finally, in 651, Iran/Persia was conquered by the Muslim Arabs. But we have to remember, as I already mentioned, that there were Sogdians, Khotanese and Scythians among many other Iranian-speaking peoples. Today we have for instance Dari, Tajik, Pashto, Balochi, Kurdish and Ossetian among the Iranian languages.

One way to describe the development of the Persian language is to talk about Old Persian (Achaemenid, in cuneiform script), Middle Persian (Pahlavi script) and New Persian (Arabic script). Often we come across the term "Avestan alphabet", which developed during the Sassanian era. Avestan developed into what is known as Pazend, also an alphabetic script. Avestan is based on Pahlavi script but it has more vowel signs; it is not an abjad that leaves vowels imprecise but a full alphabet.

A Psalter-Pahlavi document. It is a Christian text translated to Middle Persian from an earlier Aramaic text. It is dated to 6th-7th centuries. It is today a part of the Turfan collection of the Berlin-Brandenburg Academy of Sciences. It was discovered by an archaeological expedition to Central Asia at the beginning of the 20th century.

The scripts used for the different Persian languages have changed over time, and that is a complicated story too.

In Old-Persian the word "land, region" is written būmi (𐏏) with cuneiform script and būm 𐭡𐭥𐭬 in Middle-Persian with Pahlavi script. In modern Persian with Arabic script it is written būm بوم 
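The three spellings above can also be inspected programmatically. The following is a minimal sketch using Python's standard `unicodedata` module: it looks up the Unicode name of each sign, which begins with the script's name, and its default writing direction. The sample strings are copied from the sentence above, and the helper name `describe` and the `DIRECTION` table are my own, introduced only for this illustration.

```python
import unicodedata

# Map Unicode bidirectional classes to a plain description.
DIRECTION = {
    "L": "left-to-right",
    "R": "right-to-left",
    "AL": "right-to-left (Arabic)",
}

def describe(text):
    """Return (code point, Unicode name, direction) for each non-space sign."""
    rows = []
    for ch in text:
        if ch.isspace():
            continue
        bidi = unicodedata.bidirectional(ch)
        rows.append((
            f"U+{ord(ch):04X}",
            unicodedata.name(ch, "<unnamed>"),
            DIRECTION.get(bidi, bidi),
        ))
    return rows

# The three spellings of "land, region" quoted in the text.
samples = {
    "Old Persian (cuneiform)": "𐏏",
    "Middle Persian (Pahlavi)": "𐭡𐭥𐭬",
    "New Persian (Arabic script)": "بوم",
}

for label, word in samples.items():
    print(label)
    for code, name, direction in describe(word):
        print(f"  {code}  {name}  [{direction}]")
```

The output makes the shift visible: a single left-to-right cuneiform sign is replaced by right-to-left letter sequences in the two Aramaic-derived scripts.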

Above I have illustrated the development of written Persian, but it has very little to do with Dunhuang, China and Xinjiang. First is cuneiform script from the Behistun inscription (c. 500 BCE). Second is a Parthian inscription in Pahlavi script from Ka'ba-ye Zartosht (the Zoroastrian cube), dated to the reign of Shapur I (240 - 270 CE). Last is Avestan script, which was used for Zoroastrian religious purposes. This Avestan document is at the Bodleian Library, Oxford, England. Modern Persian in Arabic script is not presented here.

Tibetan language and script: Tibetan is one of the members of the Tibeto-Burman branch of the Sino-Tibetan family. The script is of Brahmic origin and is therefore an abugida. Tibetan script comes in two styles, one headed and one headless. The latter also comes in a very simple variant used for teaching children. In the case of Tibetan, the oldest surviving manuscripts are those found in Dunhuang and the Tarim.

The Tibetan version of The Masters and Disciples of the Laṅka School, a Chan (Zen) Buddhist document, early 9th century, Dunhuang.

Agnean and Kuchean: (Tocharian A and B) These are two extinct Indo-European (IE) languages used in Central Asia until the 8th century, when their speakers were finally assimilated by the incoming Uygurs. But they were in decline much earlier. The terminology dealing with these languages is disputed and the name "Tocharian" is a misnomer. Kuchean was closer to Germanic and Celtic than, for example, to Persian and other IE languages of the east. The script is also an abugida (an alphasyllabary). Agnean and Kuchean are also interesting because their discovery changed the academic field of Proto-Indo-European (PIE). Kumarajiva (344-412 CE), a Buddhist and a famous translator, was a Kuchean who was appointed manager of a translation bureau in the Chinese capital Chang'an in 401 CE. Agnean became a liturgical language at the end and was used only in monasteries. Kuchean, as a living language, lasted longer. The Kuchean people are sometimes called the Yuèzhī 月氏 in Chinese sources. According to the dictionary provided by the University of Texas there are many loanwords from Sanskrit in Kuchean, but other words are amazingly European, almost understandable for a Scandinavian.

"Tocharian language", as I already mentioned, is not correct terminology. The Tocharian people, as described in ancient Greek sources, lived in Bactria and spoke a Persian language. Hansen clarifies this in her book "Silk Road". She uses the terminology Agnean and Kuchean, which is much more adequate. On the walls of Kizil cave no. 110 and Kumtura cave no. 34, it is written in Chinese that Agnean died out and that Kuchean also ceased to be used after the year 800 CE. The name "Tocharian" is left over from an old academic mistake made by the German scholar Friedrich W. K. Müller (1863-1930), and unfortunately the name stuck. A Swedish expert on Kuchean and Agnean is the linguist Gerd Carling at Lund University. See also, for example, the Tarim mummies and other exciting topics connected to the Indo-Europeans of Xinjiang.

Chinese Indo-Europeans with blue eyes and blond hair, a fresco from the Kizil caves.

For more on the field of Proto-Indo-European (PIE), see for example the Kurgan or Anatolian hypotheses and the Satem-Centum split. Marija Gimbutas (1921-1994) and Sir William Jones (1746-1794) are other possible entries to get an introduction to the science of PIE.

Susan Whitfield suggests that Yuezhi refers to Scythians, and not Kucheans. Encyclopaedia Britannica calls the Yuezhi "Indo-Scyth" and the Kucheans are named "Dayuan", but Dayuan is probably wrong too, because it is a name for the Sogdians of Samarkand. Takata Tokio, though, uses the name Tocharians for the Yuezhi. There is obviously no consensus on the nature of the Yuezhi, and it may have been a Chinese generic term for Indo-European/Indo-Aryan peoples living in Central Asia. Valerie Hansen is of the opinion that the Yuezhi were the inhabitants of the Indian-Central Asian tribal confederation called Kushan. Guishuang (貴霜) is the name of that state in the Chinese sources. Kenneth Chen also gives the Scythians the name Yuezhi. In addition to that comes the distinction between the "Greater and Lesser Yuezhi". Maybe "Lesser Yuezhi" refers to the Kucheans?

Kuchean on wood written in North Turkestan Brahmi script, dated to 5th-8th century, now in a Japanese collection.

Kuchean written in Brahmi script.

Some experts call the Agneans and Kucheans a Northern European tribe and point out that they were blond and tall and that their textiles show similarities to Celtic tartan. Have a look at the Tarim mummy below and some of the Tarim textiles and judge for yourself. So it is not only the language that is grammatically and phonetically related to Northern Europe, but apparently also culture and physical appearance.

Tarim mummy and reconstruction

Tangut: The Tangut language belongs to the Tibeto-Burman group. The Tanguts created an independent state east of Dunhuang, in the Ningxia-Gansu area, in 1038 CE (after the Library Cave was sealed). The Tangut state was annihilated by the armies of Genghis Khan in 1227 CE. The Tanguts developed a script of their own, modelled on Chinese characters. It is sometimes called "the most complicated script system in the world".

Illustrated block-printed Tangut document from the Library Cave. This particular document is the frontispiece of a version of a Buddhist sutra and is located in St Petersburg, Russia.

Old Uyghur and Manichean script: Uygur is a Turkic language, traditionally classified as Altaic. The Uygurs moved to the Tarim area around the 9th century and about 18 million Uygurs still live in China, most of them in Xinjiang. The script, which is written vertically, shows influence from Aramaic and Sogdian. Both Sogdian and Parthian have been written with Manichean script. The script eventually developed into present-day Mongolian script. Manchu, a Tungusic language, has been written with this script as well. The Manchu language is endangered and it is often assigned to the Altaic group. But the so-called Altaic language family is contested and debated; it is not recognized by all prominent modern linguists. The Turkic languages and Mongolian belong to the Altaic family, but the debate is about other languages and their possible affiliation with the family.

The term Manichean script is a bit awkward though and many scholars call it "Old-Uygur script". Manichean documents were found at Turfan, written in many languages, and in Dunhuang, also some written in Chinese. These documents are unique, because they multiplied our knowledge about this little-known religion and the followers of the prophet Mani.

Some Turkic tribes had contact with the Tarim area much earlier, and Turkic tribes were present on the northern route of the Tarim as early as 552 CE. In the 9th century Turkic immigration to the Tarim basin accelerated due to the Kyrgyz conquest of their original area. Today's official Chinese figure for the number of Uygurs is only 8 million.

Manichaean Uygur document

Fragment from an 8th-9th-century Manichaean book picturing Manichean priests. Found at Gaochang, Tarim Basin.

Old Turkic and Orkhon Script: (Kök Turkic / Göktürk) Orkhon script is also known as Turkic runes and takes its name from the Orkhon river and valley in Mongolia, where some large stones with inscriptions have been found. Orkhon script is the earliest evidence of written Turkic known to us. The Turkic languages are placed in the Altaic family along with Mongolian, Manchu and other Tungusic languages (Jurchen).

Irk Bitig, a book from Mogao, written in Old Turkic with Orkhon script. Probably written sometime between 810 and 942 CE. The Chinese text was added in the later part of that period. Irk Bitig is about divination and omens and belongs to the religion of Tengrism.

Manichean document written in Orkhon/Gökturk runic script.

Hebrew: Hebrew does not need any presentation here because it is outside the range of the topic. However, it is very interesting that Hebrew texts have been found in China from this time. The explanation is that Jews were persecuted in Persia proper for a while and fled towards Sogdiana. In that way their presence in Samarkand and other places on the Silk Road increased.

Judaeo-Persian language written with Hebrew script. Ink on paper, 8th century CE, Dandan-Uiliq ruins, Tarim. Now at British Library.


Because of the situation and location of Dunhuang there was a lot of language mixing during the period that I cover here. One example is the Uygurized Sogdians on the one hand and, on the other, Uygurs who mixed their language with Sogdian. Among the documents, you can find a text in Chinese on one side and the same text in Tibetan on the other side. Another example is texts written in one language but using the script of another language. According to Imre Galambos there is Chinese written phonetically in Tibetan script, and Tibetan written in Chinese script, probably by bilingual persons mastering both Tibetan and Chinese. There are also some Uygur documents with Chinese characters interspersed in the text. In addition to this, some glossaries and phrasebooks were discovered, further proof of the multilingual situation of Dunhuang and the Silk Road. There is also an example of a bilingual text containing Khotanese and Sanskrit. There were probably also opportunities to study Sanskrit in Dunhuang, with Indian teachers, during the Song era.

In circumstances like these, it is no surprise that the Tarim fostered a person like Kumarajiva. He was Kuchean by birth but a translator from Sanskrit to Chinese, although he only became fluent in Chinese later, when he had already left Xinjiang and Gansu in the northwest to work in Central China, voluntarily or not. Kumarajiva was probably a polyglot, mastering at least five languages and several scripts. He was born in Kucha; his father was of Indian-Kashmiri origin and his mother was Kuchean. He studied in Kashmir when he was young.

Among the manuscripts found in Turfan there are several Manichean documents written in different Iranian languages, and in Dunhuang some Iranian Manichean hymns written in Chinese characters. In the case of the hymns, even if you can read Chinese and follow the sounds, you would not understand them. They were written so that Chinese people could worship with the Manicheans. These documents reveal not only reciprocal linguistic influence but also religious influence, because they mention Buddhism in a Manichean context.

In her lovely book "Life along the Silk Road", Susan Whitfield tells several stories of people who live along the Silk Road, who all have different ethnicities and all speak two or more languages. Her stories paint a picture of a very multi-ethnic and contested place with a lot of military activity, and hence also many soldiers who come from places far away.

A Nestorian pharmacological text written in New Persian with Syriac script, a development of the older Aramaic script. 

Possible extent of the Kushan tribal confederation (green).

Buddhism and linguistic influence

The first Buddhists in China were, most likely, merchants who arrived at the beginning of the Christian era, according to Valerie Hansen. The thesis of merchants as the forerunners of Indianization is also advocated by Kenneth Hall. Regarding the early contact between India and China there are two rival schools: one that argues that Buddhism came by the maritime route, and another that is convinced that Buddhism came via the land route. In my opinion it is possible that Indian influences came both ways at the same time, but evidence for the land route is more abundant.* In the case of the land route, the Yuezhi people, among others, carried the faith via the Silk Road from the Kushan state in Bactria. The most important school was the Sarvastivada, one of the 17 extinct old schools. This is also the time when the Buddha is transformed from a human into a transcendent state, or a divine principle, if you like. This is the very beginning of Mahayana Buddhism outside India, where it later died out. Regardless of which way Buddhism entered China, it has been the most important source of foreign cultural and linguistic influence in China in historical times.

According to another source, the first translation of a Buddhist text into Chinese occurred in 148 CE. Valerie Hansen also states that many translations were made in Luoyang from 148 CE by An Shigao, a Parthian Buddhist missionary. Some of his Sanskrit to Chinese translations still survive and are present in the Buddhist canon of today. The question of the transmission of Buddhism to China is a very complicated subject and demands a much more extensive text than this.

Conclusion and possible further research

If Baghdad** was the cradle of translation in the Greco-Semitic-Persian world during the 8th to 11th centuries, Dunhuang was the Indo-Persian-Tibetan-Sinitic-Turkic counterpart, from the 5th to the early 11th centuries. If we believe Takata Tokio, more than 1000 civilians were involved in translating Buddhist texts in Dunhuang during a period of Tibetan rule (815-841 CE). It seems that the Silk Road and the Tarim basin are the real hotspot of archaeology, linguistics and new historical theories these days. This is partly because both Russia and China have, over the past two decades, opened up to academic cooperation with the rest of the world. New excavations and findings are about to rewrite some of the history as we know it. But new data and information are also available due to improved DNA technology.

The situation in Central Asia was one of linguistic change, not linguistic continuity. Migrations constantly changed the situation, and extensive intermingling and reciprocal influence (sprachbund) was probably standard. Therefore, it is hard to say who spoke what and when. Especially when it comes to all the Indo-Aryan and Persian languages, it must have been hard for both Greeks and Chinese to distinguish between the different dialects, particularly if they changed and intermixed. Therefore, names of tribes and peoples like Yuezhi, Scythians, Sakas, Sogdians, Parthians, Tokaroi etc. must be taken with a pinch of salt. What we can say for sure is that many spoke eastern Satem Indo-European languages of Indian and Iranian stock. The Kucheans and Agneans spoke a distinctively different Indo-European Centum language. In the not very distant future, I hope, the scholarly world will unite around a standardized terminology. IDP is a great leap forward, I think.

It is easier to distinguish the Chinese, Turkic and Tibeto-Burman languages than their Persian-speaking counterparts. Perhaps that is why many sources speak about "Prakrit"? Prakrit (a term for the languages spoken in Northern India) has the meaning "vernacular", and the Prakrits were probably constantly changing dialects. Even the Buddha himself is said to have spoken Prakrit. His family belonged to the Sakyas, hence his name Sakyamuni, the wise man from the Sakyas. The Sakyas are said to have been Scythians. Did the Buddha speak a kind of Persian? That is something I have never heard of, and Pali, the language of the Theravada canon, is very far from Persian, I believe.*** Pali is thought to have been close to the Buddha's own language, and if it is close to any language, it is Sanskrit.

Although more than 20 academic institutions are involved in research on the Tarim manuscripts, there is much more to do. It is likely that hundreds of essays, academic papers and PhD projects will come out of this gold mine of historical linguistic material. As Susan Whitfield, former director of the IDP, puts it, "decades will elapse before all its secrets are uncovered". Another exciting prospect is that many more manuscripts may be discovered in the dry conditions of the Tarim, Taklamakan and Gobi areas. Moreover, we can put the Tocharians of the Tarim where they belong, in the trash bin. The terminology is wrong and obsolete.

* For more about the maritime route, see for example The Indianized States of Southeast Asia by George Coedès, A History of Early Southeast Asia by Kenneth Hall, or Land Road or Sea Route? by Rong Xinjiang.

** I consider it common knowledge that Baghdad was a multilingual community before 1258, with Aramaic, Turkic, Arabic, Persian and Greek as the main languages. It was also the place where many of the classical Greek dramas and philosophical books were preserved, so that they could be retranslated into European languages during the Renaissance. Especially famous was the House of Wisdom (Bayt al-Hikma, بيت الحكمة) during the years 786-833CE. (See for example Lapidus, Ira M. A History of Islamic Societies.)

*** I hope that Buddha as a Persian is a misunderstanding on my part, but Susan Whitfield mentions that the Shakas (Scythians) were a tribe in Nepal at the time of Buddha (Whitfield 1999:17). (Siddhartha Gautama is also known as Shakyamuni/Sakyamuni; Skt. शाक्यमुनि, śākyamuni in IAST transliteration.) It is probably a misinterpretation of śaka शक, a particular "white skin people" that lived and dominated in North India from 100BCE to around 200CE. This issue needs more research.



Chen, Kenneth. Buddhism in China, a Historical Survey. Princeton University Press, 1972.

Frankopan, Peter. The Silk Roads. Bloomsbury Publishing, 2015.

Hall, Kenneth R. A History of Early Southeast Asia: Maritime Trade and Societal Development, 100-1500. Rowman & Littlefield Publishers, 2011.

Hansen, Valerie. The Open Empire, A History of China to 1800. Second edition, W W Norton & Co, 2015.

Hansen, Valerie. The Silk Road: A New History. Oxford University Press, 2012.

Lapidus, Ira M. A History of Islamic Societies, 2nd edition, Cambridge University Press, 2002.

Whitfield, Susan. Life Along the Silk Road, University of California Press, 1999. 

Williams, Paul. Mahāyāna Buddhism: The Doctrinal Foundations. 2008.

Articles and Internet Resources

Linguistic Research Center, University of Texas at Austin (Tocharian) and, retrieved 2018-02-13/14 retrieved 2018-01-20, retrieved 2018-01-17

Galambos, Imre. Non-Chinese Influences in Medieval Chinese Manuscript Culture, 2012, retrieved 2018-01-17, retrieved 2018-01-17, retrieved 2018-01-17

Takata, Tokio 高田時雄. 2000. "Multilingualism at Tun-huang". Acta Asiatica, Bulletin of the Institute of Eastern Culture 78: 49-70, retrieved 2018-01-17

van Schaik, Sam. The Origin of the Headless Script (dbu med) in Tibet, 2012, retrieved 2018-01-17

Rong Xinjiang. Land Road or Sea Route? Commentary on the Study of the Paths of Transmission and Areas in which Buddhism Was Disseminated during the Han Period, 2004, retrieved 2018-01-19

Galambos, Imre. Composite Manuscripts in Medieval China, 2016, retrieved 2018-01-20 (IDP retrieved 2018-02-14)

Walter, Mariko Namba. Tokharian Buddhism in Kucha: Buddhism of Indo-European Centum Speakers in Chinese Turkestan before the 10th Century C.E. Sino-Platonic Papers, no. 85, October 1998.

Lecture by Colin Renfrew at, Retrieved 2018-02-06

Oxford Research Encyclopaedia, Linguistics at:, retrieved 2018-02-21 (And many more pages from IDP), retrieved 2020-07-13