I am happy to announce the first version of the Malayalam morphology analyser.
After two years of development, I tagged version 1.0.0. In this release, mlmorph can analyse and generate Malayalam words using the morpho-phonotactical rules defined in it, based on a lexicon. We have a test corpus of fifty thousand words, and 82% of the words in it are recognized by the analyser.
A Python interface is also released to make the library easy for developers to use.
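As a rough sketch of how a coverage figure like the 82% above can be computed: count the words for which the analyser returns at least one analysis and divide by the corpus size. This is not the project's actual test harness. The names `coverage`, `toy_analyse`, and `TOY_LEXICON` are made up for illustration, and the toy analyser only stands in for a real one.

```python
from typing import Callable, Iterable, List


def coverage(words: Iterable[str], analyse: Callable[[str], List]) -> float:
    """Return the percentage of words for which analyse() gives at least one result."""
    total = 0
    recognised = 0
    for word in words:
        total += 1
        if analyse(word):  # an empty result means the word was not recognised
            recognised += 1
    return 100.0 * recognised / total if total else 0.0


# Hypothetical stand-in for a real analyser: it knows only a tiny lexicon.
TOY_LEXICON = {"ചിരി", "മലയാളം"}


def toy_analyse(word: str) -> List[str]:
    """Return a one-item list with a made-up tag for known words, else an empty list."""
    return [word + "<n>"] if word in TOY_LEXICON else []


if __name__ == "__main__":
    sample = ["ചിരി", "മലയാളം", "xyz", "ചിരി"]
    print(f"{coverage(sample, toy_analyse):.1f}% recognised")
```

Any callable that takes a word and returns a possibly empty list of analyses can be passed in place of `toy_analyse`.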
[Read More]
Malayalam Script LGR rules for public review
The Malayalam and Tamil Root Zone Label Generation Rules for Internationalized Domain Names have been released for public comments. See the announcement from ICANN. They were drafted by the Neo-Brahmi Script Generation Panel (NBGP), of which I am also a member.
Your comments on the proposal for the Malayalam Script Label Generation Rules for the Root Zone (LGR [XML, 18 KB] and supporting documentation [PDF, 998 KB]) can be submitted through the feedback form until Nov 7, 2018.
[Read More]
Malayalam spellchecker – a morphology analyser based approach
My first attempt to develop a spellchecker for Malayalam was in 2007. I was using hunspell and a word-list-based approach. It was not successful because of the rich morphology of Malayalam. Even though I prepared a manually curated list of 150K words, it was nowhere near covering the practically infinite set of Malayalam words. For languages with productive morphological processes in compounding and derivation that are capable of generating dictionaries of infinite length, a morphology analysis and generation system is required.
[Read More]
Malayalam morphology analyser – status update
For the last several months, I have been actively working on the Malayalam morphology analyser project. In case you are not familiar with the project, my introductory blog post is a good start. I was always skeptical about the approach, and the whole project looked very ambitious. But now I am almost confident that the approach is viable. I am making good progress on the project, so here are some updates.
[Read More]
How to customize Malayalam fonts in Linux
Nowadays, GNU/Linux distributions like Ubuntu, Debian, and Fedora come with preconfigured fonts for Malayalam. For the sans-serif family it is Meera, and for serif it is Rachana. If you would like to change these fonts, there is no easy way to do it with the configuration tools in GNOME or KDE. They provide a general font selector for the whole desktop, but not for a given language.
The advantage of setting these preferences at the system level is that you don't need to choose the fonts again at the application level.
[Read More]
Young people's pride in work, and labour societies
This is a note about a crisis faced by the young people of our land, and about an idea that might help resolve it.
In our land, there are many young people engaged in various kinds of wage labour that require special skills, such as driving, farm work, painting, building construction, and mechanic work. Most of them are in the unorganised sector as well. The majority are young men who have not obtained a government or private job, or who lack the education needed to obtain one. Young women, on the other hand, continue their education at most until marriage and then settle into family life. Aged between twenty and thirty-five, they face a new challenge. Samakalika Malayalam weekly recently published a detailed study report on it ("നിത്യഹരിത വരൻമാർ" [Evergreen Grooms], Rekha Chandra, Samakalika Malayalam, July 16).
[Read More]
The many forms of ചിരി ☺️
This is an attempt to list all forms of the Malayalam word ചിരി (meaning: ☺️, smile, laugh). For those who are unfamiliar with Malayalam, it is a highly inflectional Dravidian language. I am actively working on a morphology analyser (mlmorph) for the language, as outlined in one of my previous blog posts.
I prepared this list as a test case for the mlmorph project to evaluate grammar rule coverage, so I thought of listing it here as well, with brief comments.
[Read More]
How to type Malayalam using Keyman 10 and Mozhi
This is a quick tutorial on installing the Mozhi input method on Windows 10.
Mozhi is a transliteration-based keyboard for Malayalam. For example, you can type malayaalam to get മലയാളം. We will use Keyman as the input tool. Keyman is an open source input mechanism now developed by SIL. It supports a lot of languages, and Mozhi Malayalam is one of them.
Step 1: Download Keyman Desktop with the Mozhi Malayalam keyboard. Go to https://keyman.
[Read More]
Kindle supports custom fonts
I am pleasantly surprised to see that Amazon Kindle now supports installing custom fonts. This is a big step towards supporting non-Latin content on their devices. I can now read Malayalam ebooks on my Kindle with my favorite fonts.
Content rendered in Manjari font. Note that I installed the Bold, Regular, and Thin variants so that Kindle can pick up the right one.
This feature was introduced in Kindle version 5.9.6.1, released in June 2018.
[Read More]
Talk on ‘Malayalam orthographic reforms’ at Grafematik 2018
Santhosh and I presented a paper on ‘Malayalam orthographic reforms: impact on language and popular culture’ at the Grafematik conference held at IMT Atlantique, Brest, France. Our session was chaired by Dr. Christa Dürscheid.
The paper we presented is available here. The video of our presentation is available on YouTube.
Grafematik is a conference, the first of its kind, bringing together disciplines concerned with writing systems and their representation in written communication. There were a lot of interesting talks on various scripts around the world, their digital representation, the role of Unicode, typeface design, and so on.
[Read More]