Friday, June 26, 2009

PL 37/09: Multilingual LATINA

Filed under: LATINA — plinius @ 7:18 am

Yesterday – at LATINA – we explored the possibilities of Wikipedia as a tool for teachers and students.

Wikipedia is not just a web-based encyclopedia – it is a global network of interrelated user-created encyclopedias in more than two hundred languages. Including Latin. The Wikimedia Foundation also supervises a number of other projects based on free and open access to user-generated content.

In the LATINA Summer class 2009 nobody had English as their mother tongue. We use English as our common language (lingua franca), but speak other languages at home. The actual languages present in the class were Chinese, Luganda, Nepali, Norwegian, Polish, Swahili and Ukrainian.

Wikipedia has editions in all these languages – but their actual size (as of June 2009) varies a lot:

  • Polish – 1.1 million articles
  • Chinese – 840 thousand
  • Ukrainian – 420 thousand
  • Norwegian – 220 thousand
  • Swahili – 12 thousand
  • Nepali – 7,100
  • Luganda – 700

Wikipedia is a community as well as an encyclopedia. Even a small Wikipedia can become a useful learning resource if a group of people get together and decide to build it. About twenty people are needed, the Wikimedia Foundation suggests, to create a basic Wikipedia in a new language.

Wikipedia itself is only eight years old (it was launched in 2001), and many new Wikipedias have been created in that time. At the moment, Wikipedias with more than ten thousand entries exist in nearly ninety languages.

One way of creating new materials in many local languages is to use Google Translate. Since Wikipedia allows free reuse of its content (including commercial reuse, provided the source is credited and the result is shared under the same license), it is perfectly acceptable to take an article from the English edition and translate it (automatically) into Chinese, Polish, Ukrainian or Norwegian.

The result will not be perfect – and should definitely not be published as such. But after editing for content, approach and language, you can often get a perfectly acceptable result – without writing a completely new article from scratch.

At the moment, Google Translate covers 42 languages. More are added every year. Non-Western languages are still poorly covered – but they are coming.

Already, the combination of a multilingual Wikipedia and Google Translate will provide an impressive range of resources in all major and many smaller languages. Five years from now, the resource base will be much larger. And what will happen – if we make an effort – ten, twenty, thirty years from now?

Imagine …


Participant languages

  • LATINA Spring 2009: Chinese, Hungarian, Norwegian, Spanish
  • LATINA Summer 2008: Arabic, Chinese, Croatian, Norwegian, Polish, Russian


Google Translate

Non-Western languages available for translation

  1. Arabic
    • More than 50 million Internet users
      • Saudi Arabia 18 million
      • Egypt 11 million
      • Morocco 7 million
      • Sudan 4 million
      • Algeria 4 million
      • Tunisia 3 million
      • Syria 3 million
      • Lebanon 2 million
      • Jordan 1 million
      • Kuwait 1 million
      • Countries with less than 0.5 million not counted
  2. Chinese
    • 300 million
  3. Filipino
    • 21 million
  4. Hindi
    • 81 million in India (Hindi is spoken by about 40% of the population)
  5. Indonesian
    • 25 million
  6. Japanese
    • 94 million
  7. Korean
    • 37 million (South Korea)
  8. Persian
    • 23 million (Farsi and related languages are spoken by about 70% of the population)
  9. Thai
    • 13 million
  10. Turkish
    • 27 million
  11. Vietnamese
    • 21 million

Western languages

  1. Albanian
  2. Bulgarian
  3. Catalan
  4. Croatian
  5. Czech
  6. Danish
  7. Dutch
  8. English
  9. Estonian
  10. Finnish
  11. French
  12. Galician
  13. German
  14. Greek
  15. Hebrew
  16. Hungarian
  17. Italian
  18. Latvian
  19. Lithuanian
  20. Maltese
  21. Norwegian
  22. Polish
  23. Portuguese
  24. Romanian
  25. Russian
  26. Serbian
  27. Slovak
  28. Slovenian
  29. Spanish
  30. Swedish
  31. Ukrainian


Classification is based on geography or culture rather than on language family. Hindi and Persian are Indo-European languages, like most European languages. Estonian, Finnish and Hungarian, however, belong to the Uralic family. I take Israel as culturally Western, though Hebrew and Maltese are Semitic languages, like Arabic.

