Malayalam Spellchecker version 1.1.1 released

A new version of the Malayalam spellchecker based on mlmorph is available as a Python library.

Install the library

$ pip install mlmorph_spellchecker

Sample usage

>>> from mlmorph_spellchecker import SpellChecker
>>> spellchecker = SpellChecker()
>>> word = "ഉച്ഛാരണം"
>>> spellchecker.spellcheck(word)
False
>>> spellchecker.candidates(word)
['ഉച്ചാരണം']
>>> spellchecker.spellcheck("ചിത്രകാരൻ")
True
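To check more than one word at a time, the two calls shown above can be wrapped in a small helper. This is only a sketch: `find_misspellings` is a hypothetical name, not part of the library, and it assumes nothing about the checker beyond the `spellcheck` and `candidates` methods used in the session above. It splits on whitespace only, so punctuation attached to a word is not stripped.

```python
def find_misspellings(checker, text):
    """Map each misspelled whitespace-separated word in text to its suggestions.

    `checker` is any object exposing spellcheck(word) -> bool and
    candidates(word) -> list, as in the sample session above.
    This helper is hypothetical and not part of mlmorph_spellchecker.
    """
    results = {}
    for word in text.split():
        if word in results:
            continue  # already reported; skip repeated words
        if not checker.spellcheck(word):
            results[word] = checker.candidates(word)
    return results
```

Passing a `SpellChecker()` instance as `checker` should return a dictionary whose keys are the words that failed the check.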

The new version adds a database of commonly mistaken Malayalam words for quick checks and corrections. If the given word is present in that list, the spellcheck result and the correction suggestions are taken from the database. The database is compiled from Malayalam Wikipedia’s list of commonly mistaken words and the Kerala government’s glossary of such words. The source code is on GitLab.
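The lookup order described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's actual code: `COMMON_MISTAKES`, `spellcheck`, `candidates`, and the fallback callables are hypothetical names, and the single table entry is the word pair from the sample session.

```python
# Hypothetical stand-in for the common-mistakes database; the real data
# shipped with mlmorph_spellchecker may be stored and loaded differently.
COMMON_MISTAKES = {
    "ഉച്ഛാരണം": ["ഉച്ചാരണം"],
}


def spellcheck(word, fallback_check):
    """Return False at once for a known mistake, else defer to the fallback.

    `fallback_check` stands in for the slower morphology-based check.
    """
    if word in COMMON_MISTAKES:
        return False
    return fallback_check(word)


def candidates(word, fallback_candidates):
    """Return stored corrections for a known mistake, else defer to the fallback."""
    if word in COMMON_MISTAKES:
        return list(COMMON_MISTAKES[word])  # copy so callers cannot mutate the table
    return fallback_candidates(word)
```

Consulting the table first means a known mistake is answered by a single dictionary lookup, and only words missing from the table reach the more expensive fallback.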

A web version of the same library is available for online use at https://morph.smc.org.in/spellcheck. I wrote about this in a previous blog post.

I wrote an article about the technology behind it two years ago. There is also an incomplete extension for LibreOffice.

The effectiveness of the spellchecker depends on how much of the Malayalam vocabulary the morphology analyser covers. Because Malayalam is a morphologically rich language, its vocabulary is practically unbounded, and the morphology analyser that addresses this is an ongoing project.
