One million Wikipedia articles by translation

I am happy to share some news from my work at the Wikimedia Foundation. The Wikipedia article translation system, known as Content Translation, has reached the milestone of one million articles created. This has been my major project at WMF since 2015, and I am the lead engineer for it. The Content Translation system helps Wikipedia editors quickly translate and publish articles from one language wiki to another. This way, the knowledge gap between different languages is reduced.

Machine translation from different providers is used to assist the editors, but raw machine translation output cannot be published. Editors are encouraged to review and correct the machine translation before publishing to a real Wikipedia. Different MT engines, such as Google, Yandex, and Apertium, are used depending on the language pair.

The following is a video we created when we celebrated the 100,000-article milestone in 2016.

The English Wikipedia, with about 6 million articles, is the largest edition. Wikipedia exists in more than 300 languages, and the editions vary greatly in size. For example, the Malayalam Wikipedia has 75,000 articles, the Hindi Wikipedia has 150,000 articles, and the Spanish Wikipedia has 1.7 million articles. So a system that has helped editors create one million articles across various language editions is a big achievement, and I am proud to be part of the team that developed it.

The ease of creating articles has attracted a lot of new editors too. From our casual observations, we have seen very active users who created thousands of articles. Some of them developed a habit of translating articles every day. Others focus passionately on certain categories of articles and make sure all of them are available in their home Wikipedia. We are very thankful to each of these users.
