Irina Samarina

Institute of Linguistics, Russian Academy of Sciences
Research Center for Language and Ethnic Relations

On the experience of creating the database “Languages of Russia: sociolinguistic portrait”

The database “Languages of Russia: sociolinguistic portrait” [1] contains information on the functioning of 86 languages of the Russian Federation, including endangered languages of Siberia. This paper considers how the database was populated, the software behind it, and the possibilities and prospects for its development.

The sources of sociolinguistic data were:

  1. publications (mainly from the last 20 years);
  2. field sociolinguistic surveys;
  3. data provided by researchers who study the languages;
  4. information published on the websites of the subjects of the Russian Federation.

The content of the system is based on the collective work “Written languages of the world. Languages of the Russian Federation. Sociolinguistic encyclopedia”, which was carried out within the framework of the international project “Written languages of the world: a survey of the degree and modes of use” (WLW 1988-1998) [2] by the Research Center on Language and Ethnic Relations of the Institute of Linguistics in collaboration with researchers from the republics and regions of the Russian Federation [3] (Mikhalchenko, 2000; Mikhalchenko, 2003) [4]. Each language description consists of answers to the questions of “The charts of language description” (Mikhalchenko, 2003: XL-L). This uniform structure made the materials comparable and allowed them to be converted into a database.

The database offers broad functionality and serves as a tool for:

  1. conducting research that requires large volumes of information,
  2. creating a functional classification of the languages,
  3. making decisions on urgent practical issues of language policy in the Russian Federation, etc.

The database includes 20 main information blocks, among them: names of ethnic groups and their languages; statistical and geographical data; brief linguistic information; data on the writing system and orthography; the official status of the language; data on the history of language development and a bibliography; information on the use of the language in religious practice, periodicals, education, mass media, governmental and administrative institutions, the court system and the legislative sphere; and data on the sources of the information. The total number of rubricator items amounts to several hundred.
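As an illustration only, the relation between languages, rubricator items and the stored answers could be modelled roughly as follows. This is a minimal sketch, not the project's actual schema: the table and column names, the sample rows and the sub-item “Primary school” are all hypothetical, and SQLite is used here merely to keep the example self-contained.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE language (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE rubric (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES rubric(id),  -- NULL marks a main information block
    title     TEXT NOT NULL
);
CREATE TABLE entry (
    language_id INTEGER NOT NULL REFERENCES language(id),
    rubric_id   INTEGER NOT NULL REFERENCES rubric(id),
    content     TEXT NOT NULL,
    PRIMARY KEY (language_id, rubric_id)
);
"""


def build_demo_db():
    """Create an in-memory database filled with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO language (id, name) VALUES (1, 'Language A')")
    conn.executemany(
        "INSERT INTO rubric (id, parent_id, title) VALUES (?, ?, ?)",
        [
            (1, None, "Writing system and orthography"),
            (2, None, "Use in education"),
            (3, 2, "Primary school"),  # hypothetical sub-item of block 2
        ],
    )
    conn.execute(
        "INSERT INTO entry (language_id, rubric_id, content) "
        "VALUES (1, 3, 'placeholder answer')"
    )
    conn.commit()
    return conn


def main_blocks(conn):
    """Return the titles of the top-level information blocks."""
    rows = conn.execute(
        "SELECT title FROM rubric WHERE parent_id IS NULL ORDER BY id"
    )
    return [title for (title,) in rows]


def sub_items(conn, block_id):
    """Return the titles of the rubricator items directly under one block."""
    rows = conn.execute(
        "SELECT title FROM rubric WHERE parent_id = ? ORDER BY id", (block_id,)
    )
    return [title for (title,) in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(main_blocks(demo))
    print(sub_items(demo, 2))
```

Storing the rubricator as a self-referencing table is one possible way to hold a few main blocks together with several hundred nested items without fixing the depth of the hierarchy in advance.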

The software was developed with Visual Studio .NET 2003, with SQL Server 2000 serving as the DBMS. The installation package includes a file in Access format obtained by export from SQL Server.

The programme's main window consists of a navigation panel, a rubricator and a data panel. The user interface is “friendly”: its elements resemble those of widely used programmes.

The navigation panel allows the user to specify the main criteria of a query on a language or a group of languages, to move between the query results, to run a full-text search of the database, and to save and print the results. The selection query is built with the help of the rubricator panel. Its items are a modified version of “The charts of language description” (Mikhalchenko 2003: XL-L). The query results are shown in the data panel. Depending on the selected rubricator item, the data are presented as a table or as an HTML document.
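The query flow described above (choose a rubricator item, optionally restrict the set of languages, then display the result as a table or as HTML) can be sketched as follows. This is a simplified illustration under stated assumptions, not the programme's actual code: the single `entry` table, the `kind` column that decides the presentation, the function names and all sample rows are hypothetical, and a plain `LIKE` search stands in for whatever full-text mechanism the real system uses.

```python
import sqlite3
from html import escape


def make_db():
    """Build an in-memory database with placeholder rows (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entry (
            language TEXT NOT NULL,
            rubric   TEXT NOT NULL,
            kind     TEXT NOT NULL CHECK (kind IN ('table', 'html')),
            content  TEXT NOT NULL
        );
    """)
    conn.executemany("INSERT INTO entry VALUES (?, ?, ?, ?)", [
        ("Language A", "Official status", "table", "placeholder status"),
        ("Language B", "Official status", "table", "another placeholder"),
        ("Language A", "History of language development", "html",
         "placeholder <history> text"),
    ])
    return conn


def select_by_rubric(conn, rubric, languages=None):
    """Selection query: one rubricator item, optionally limited to some languages."""
    sql = "SELECT language, kind, content FROM entry WHERE rubric = ?"
    params = [rubric]
    if languages:
        marks = ", ".join("?" for _ in languages)
        sql += f" AND language IN ({marks})"
        params.extend(languages)
    sql += " ORDER BY language"
    return conn.execute(sql, params).fetchall()


def full_text_search(conn, phrase):
    """Naive substring search over all stored text (LIKE wildcards are escaped)."""
    safe = phrase.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return conn.execute(
        "SELECT language, rubric FROM entry "
        "WHERE content LIKE ? ESCAPE '\\' ORDER BY language, rubric",
        ("%" + safe + "%",),
    ).fetchall()


def render(rows):
    """Return an HTML string for 'html' rubrics, else a list of table rows."""
    if not rows:
        return []
    if rows[0][1] == "html":
        return "".join(
            f"<h2>{escape(lang)}</h2><p>{escape(text)}</p>"
            for lang, _, text in rows
        )
    return [(lang, text) for lang, _, text in rows]


if __name__ == "__main__":
    db = make_db()
    print(render(select_by_rubric(db, "Official status")))
    print(render(select_by_rubric(db, "History of language development")))
    print(full_text_search(db, "history"))
```

Keeping the presentation type next to the rubric, rather than deciding it in the interface code, is one way to let the data panel switch between table and HTML views without special cases for individual rubricator items.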

At present the database is being modified and supplemented with: a) the results of the 2002 census; b) new materials on the functioning of the languages of indigenous ethnic groups and diasporas; c) maps and graphical data; d) data from field sociolinguistic surveys; e) audio and video materials; f) text samples [5].


  1. Anoshkina et al. 1995 – Anoshkina Zh.G., Kazakevich O.A. The sociolinguistic database ‘Languages of minority ethnoses of Russia (YAMAL)’ // The methods of sociolinguistic research. M., 1995. pp. 7-26.
  2. Mikhalchenko 2000 – Written languages of the world. Languages of the Russian Federation. Sociolinguistic encyclopedia. Book 1 / executive editor V.Yu. Mikhalchenko. M.: Academia, 2000. 656 p.
  3. Mikhalchenko 2003 – Written languages of the world. Languages of the Russian Federation. Sociolinguistic encyclopedia. Book 2 / executive editor V.Yu. Mikhalchenko. M.: Academia, 2003. 848 p.
  4. WLW 1988-1998 – “The written languages of the world: a survey of the degree and modes of use”. 1988-1998. Vol. I-V.


  1. Grant from RGNF № 01-04-12025: “Languages of Russia: sociolinguistic portrait” database. A demo version can be found on The database “Languages of minority ethnoses of Russia (YAMAL)” (Grant RGNF № 96-04-12679, “The creation of the database ‘Languages of minority ethnoses of Russia (YAMAL)’”) was incorporated into the above-mentioned database; it contains information on 54 languages of indigenous minority ethnoses of the Russian Federation (Anoshkina et al. 1995: 7-26). The Research Center on Language and Ethnic Relations collaborated with Tokyo University on the English version of this database, which was included in the world database “Endangered languages of the world”, created with the support of the European Council and UNESCO.
  2. The initiator of this project, which studies the functioning of the written languages of the world in different communicative spheres, was the International Center for Research on Language Planning at Laval University (Quebec, Canada), headed by Grant D. McConnell.
  3. Grant RGNF № 93-06-11039 “Written languages of Russia: sociolinguistic portrait”.
  4. Grant RGNF № 99-04-16158d and № 02-04-16059d.
  5. Grant RGNF № 96-04-12679 “Languages of minority ethnoses of Russia (YAMAL)”. Grant RGNF № 05-04-12425 б, database “Languages of Russia: the dynamics of functioning”.
© IEA RAS, 2005
This website was created with support from UNESCO Moscow Office