Cross-Language Approximate Search on Indic Languages: A Demo

A demo of cross-language approximate search in Indic text: the Malayalam word സാമ്പാര്‍ is searched for in a paragraph from http://ml.wikipedia.org/wiki/Sambar. In the bottom half, the words highlighted in yellow are the search results. You can see that the Kannada word ಸಾಂಬಾರ್‍ is matched for the Malayalam query, which is why the search is called cross-language. Inflected forms of സാമ്പാര്‍, such as സാമ്പാറും and സാമ്പാറു, are also found as results. [Read More]
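To make the idea concrete, here is a minimal sketch of one way such a search could work. It is not necessarily how the demo itself is implemented. It assumes that the major Indic Unicode blocks (Devanagari through Malayalam, U+0900 to U+0D7F) share a common layout, so a character's offset within its 128-code-point block can serve as a script-neutral key, and it uses plain Levenshtein edit distance for the approximate part. The function names `script_neutral`, `similarity`, and `search`, and the 0.6 threshold, are illustrative choices only.

```python
def script_neutral(word):
    """Map each character to a key that ignores which Indic script it is in."""
    keys = []
    for ch in word:
        cp = ord(ch)
        if ch in ("\u200c", "\u200d"):
            continue  # drop zero-width (non-)joiners, they carry no sound
        if 0x0900 <= cp <= 0x0D7F:
            # Assumption: blocks are 128 code points wide and laid out alike,
            # so the low 7 bits identify the "same" letter across scripts.
            keys.append(("indic", cp & 0x7F))
        else:
            keys.append(("other", cp))
    return keys


def edit_distance(a, b):
    """Classic Levenshtein distance between two sequences."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(word1, word2):
    """Return a score in [0, 1]; 1.0 means identical after normalisation."""
    a, b = script_neutral(word1), script_neutral(word2)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def search(query, text, threshold=0.6):
    """Return the whitespace-separated words of text that resemble query."""
    return [w for w in text.split() if similarity(query, w) >= threshold]
```

Calling `search(query, paragraph)` returns the words that would be highlighted. Because the comparison works on block offsets rather than raw code points, a Kannada spelling can score close to a Malayalam query, and the edit-distance tolerance lets inflected forms through as well. A real implementation would likely need phonetic rules on top of this, since the block layouts are only approximately parallel.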

Conferences: FOSS.IN and NCIDEEE

FOSS.IN 2009 starts on 1st December. I wanted to attend all 5 days, but I have another conference in Chennai from 1st to 3rd December: the National Conference on ICTs for the Differently-Abled/Under-Privileged Communities in Education, Employment and Entrepreneurship 2009 (NCIDEEE 2009) at Loyola College, Chennai. So I will miss the first 3 days of FOSS.IN. We have a workout on Project Silpa during FOSS.IN. I am also planning to have a workout with Debayan and Jinesh to get his tesseract-indic OCR working with Malayalam. [Read More]

Announcing Project Silpa

Many of my friends already know about a project I am working on; this is its public announcement. The project is named Silpa, which may be read as an acronym of Swathanthra (Mukth, Free as in Freedom) Indian Language Processing Applications. It is a web framework and a set of applications for processing Indian languages in many ways. In other words, it is a platform for porting existing and upcoming language processing applications to the web. [Read More]