Colourful houses on a mountainside in Kargil, Ladakh, one of the data collection sites, May 2018. Photo: Henrik Liljegren


The Hindu Kush (northeastern Afghanistan, northern Pakistan, and Indian Kashmir) is a distinctive region with large elevation differences and some of the world's highest mountain peaks. The languages spoken there belong to six linguistic phyla (or families): Indo-Aryan, Iranian, Nuristani, Sino-Tibetan, Turkic and the language isolate Burushaski.

Project about language contact and relatedness 

The research project Language contact and relatedness in the Hindukush region systematically compared the languages spoken in the region to find out how similar or different they are in their structures (grammar, sound systems, etc.). The focus was on whether there is evidence that the languages have gradually become more similar through contact between geographically neighbouring speaker groups while diverging from their closest linguistic relatives outside the region, or whether, for example, the physical environment has instead favoured isolation and conservation, or even the development of unusual linguistic properties. Henrik Liljegren was the principal investigator for the project.

The following conclusions can be drawn from the completed project:

  1. There is a clear link between geography and language structure in the Hindu Kush, and it often cuts across family boundaries. Contact between adjacent communities has made their languages more similar to each other. This is particularly clear at the local level. For example, there is an area in the western Hindu Kush where many characteristics are shared across language boundaries. It clearly overlaps with an area that remained relatively isolated and whose population converted to Islam as recently as 150-200 years ago.
  2. The various language domains show partly different contact patterns. The features that especially characterize the languages of the Hindu Kush, regardless of relatedness, mainly have to do with phonology and lexical organization. In terms of word order and sentence structure, these languages often belong to larger areal constellations: in these respects they resemble the languages of South Asia in general or the languages of large parts of Eurasia.
  3. The Hindu Kush and the entire contiguous Himalayan highlands probably formed a multilingual reservoir in prehistoric times, with representatives of several now extinct language families, of which the language isolate Burushaski is the only contemporary remnant. This diversity has gradually diminished, first through Indo-European expansion, beginning about 4,000 years ago, and then through long periods of cultural and political influence from the surrounding lowland cultures.

In addition to direct research results, the interaction with native speakers, several of whom are language activists, has encouraged and contributed to the documentation of low-resource and endangered languages in the region.

Henrik Liljegren in Fayzabad, northeast Afghanistan, one of the data collection sites. Photo by Sani Marzban

More about the project

Language contact and relatedness in the Hindukush region, with Henrik Liljegren as its principal investigator, was carried out in the period 2015–2020. The project is now completed, and a final report has been submitted to the project funder, Vetenskapsrådet, the Swedish Research Council (421-2014-631).

How the study was conducted

A total of 79 speakers representing 59 languages were recruited to participate in the study. In collaboration with three institutions in the region, interactive workshops of 4-5 days each were arranged, with speakers of 5-10 languages at a time. Audio and video recordings were made of wordlists, one longer questionnaire, a text translated from a major language (Urdu, Dari, Pashto), and a couple of experimental/interactive elicitation sessions. The material was transcribed and processed in order to categorize and analyse the languages based on 80 structural properties within five domains: phonology (sound systems), lexical organization, word order, grammatical categories and sentence structure. A comparative basic wordlist was also established for the purpose of confirming or revising previously proposed classifications.
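To make the idea of comparing languages on coded structural properties concrete, here is a minimal sketch in Python. It is not the project's actual analysis: the language labels, feature names and values are hypothetical placeholders, and the measure (the share of matching values among the features coded for both languages) is just one simple assumption about how such a comparison could be set up.

```python
from itertools import combinations

# Hypothetical toy data: labels, feature names and values are placeholders,
# not project data. None marks a feature that could not be coded.
FEATURES = {
    "Language A": {"phonology_1": True, "word_order_1": True, "lexicon_1": True},
    "Language B": {"phonology_1": True, "word_order_1": True, "lexicon_1": None},
    "Language C": {"phonology_1": False, "word_order_1": True, "lexicon_1": False},
}


def similarity(a, b):
    """Share of matching values among features coded for both languages.

    Returns None when the two languages have no coded feature in common.
    """
    shared = [f for f in a if f in b and a[f] is not None and b[f] is not None]
    if not shared:
        return None
    return sum(a[f] == b[f] for f in shared) / len(shared)


def pairwise_similarity(table):
    """Similarity for every unordered pair of languages, keyed by sorted names."""
    return {
        (x, y): similarity(table[x], table[y])
        for x, y in combinations(sorted(table), 2)
    }


if __name__ == "__main__":
    for pair, score in pairwise_similarity(FEATURES).items():
        print(pair, score)
```

A table of this kind, computed per domain (phonology, word order, and so on), is one way to see whether neighbouring languages from different families pattern together more than related languages that are far apart.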

Online database 

One tangible outcome of the project is the online database Hindu Kush Areal Typology. It was established to make processed project data and analysis available in the form of wordlists with linked audio files and descriptions of 80 structural linguistic features, with their distributions displayed in tables and interactive maps. The design, which allows for regular instalments in the future, is a collaborative effort carried out with the Max Planck Institute for the Science of Human History in Jena, within the Cross-Linguistic Linked Data (CLLD) framework.

The database is openly available at hindukush.clld.org

Map of the Hindu Kush-Karakorum, the target area of the project.