Manjari version 1.800 released

A new version of the Manjari Malayalam typeface is now available. Version 1.800 adds tabular numbers and slashed zero OpenType features, along with bug fixes. The new version of the font is available at: https://smc.org.in/fonts/#manjari Tabular numbers: The Kerala health department publishes daily COVID-19 reports, and they use Manjari (example). Numbers set in Manjari in a table can be slightly difficult to read when you want to compare them in columns across rows. [Read More]
fonts 

Chilanka font version 1.500 released

A new version of the Chilanka Malayalam typeface is now available. Version 1.500 adds the glyphs required to support the International Alphabet of Sanskrit Transliteration. The new version of the font is available at: https://smc.org.in/fonts/#chilanka Chilanka was the first typeface I designed for Malayalam, in 2014. The use case I had in mind while designing it was comic or non-serious contexts. But once it was released, contrary to my expectation, people used it for serious Malayalam writing. [Read More]
fonts 

Procrustes Analysis Based Handwriting Recognition

Many months back, I started an experiment to see whether Malayalam handwriting recognition can be done without machine learning. This blog post explains the approach, the work done so far, and the results. Handwriting recognition can be done while the user is writing (online handwriting recognition) or on a sample somebody wrote in the past (offline recognition). Online and offline recognition are different problems because, in online recognition, it is possible to capture additional details such as pen up, pen down, pen movement directions, and rotations. [Read More]

Professional student summit 2020

On February 15, I attended the Professional Student Summit organized by the Higher Education Department of Kerala. The summit aims to provide a venue for students to interact with professionals. Around 2000 students from the 2nd to 4th semesters of different colleges of law, engineering, medicine, agriculture, fisheries, veterinary science, and management in Kerala attended the summit. The Chief Minister and the Minister for Higher Education also attended. This is the second edition of the event. [Read More]
Talks 

POS Tagging: A review of BIS POS tagset and ILCI-II Malayalam Text Corpus

The Bureau of Indian Standards (BIS) has published a Part of Speech (POS) tagset for Indian languages. POS tagging is the process of assigning a part-of-speech marker to each word in a given text. In this article, I review the tagset defined by BIS. While developing the mlmorph project, I explored a candidate POS tagging schema for Malayalam. I did not choose the BIS tagset, for reasons I explain in this article. [Read More]

Presidential award for contributions to Malayalam

Happy to share the news that I have received an award from the President of India for contributions to the Malayalam language. The Maharshi Badrayan Vyas Samman, conferred by the Hon. President of India, recognizes my contributions in the field of Malayalam language. The award, instituted in 2016, is given for substantial contributions to languages such as Sanskrit, Persian, Arabic, Pali, Classical Oriya, Classical Kannada, Classical Telugu, and Classical Malayalam. It is given to young scholars in the age group of 30 to 45 years. [Read More]
award 

Root Zone Label generation rules for Malayalam released

On July 10, 2019, ICANN released Label Generation Rules for eight scripts: Devanagari, Gurmukhi, Gujarati, Kannada, Malayalam, Oriya, Tamil, and Telugu. These rules are the criteria for determining valid domain names for the Root Zone of the Domain Name System (DNS). The Internet Corporation for Assigned Names and Numbers (ICANN) is a non-profit organization that takes care of the whole Internet domain name system and registration process. Internationalized Top Level Domain Names are domain names not limited to English. [Read More]
icann  idn 

Markov chain for Malayalam

I have been trying to generate a Markov chain for Malayalam content. A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event (Wikipedia). For natural language, it represents a probabilistic model of words: the probability that one word follows another. This model can be prepared by feeding a large amount of text to a system that learns these probabilities for each word. [Read More]
markov
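
The word-level Markov chain described in the excerpt above can be illustrated with a minimal sketch. This is not the code from the full post: the function names (`build_chain`, `transition_probabilities`, `generate`), the whitespace tokenization, and the toy English corpus are all assumptions made for illustration. It counts which word follows which, turns the counts into probabilities, and samples a new sequence from them.

```python
import random
from collections import Counter, defaultdict


def build_chain(text):
    """Count, for every word, how often each other word follows it.

    Assumption: words are separated by whitespace.
    """
    words = text.split()
    chain = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        chain[current][following] += 1
    return chain


def transition_probabilities(chain, word):
    """Convert the follower counts of one word into probabilities."""
    followers = chain.get(word, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {nxt: count / total for nxt, count in followers.items()}


def generate(chain, start, length, rng=None):
    """Sample up to `length` words, each chosen only from the
    followers of the previous word (the Markov property)."""
    rng = rng or random.Random()
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # the word was never followed by anything in the corpus
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        output.append(word)
    return " ".join(output)
```

For the toy corpus `"the cat sat on the mat the cat ran"`, the word "the" is followed by "cat" twice and "mat" once, so `transition_probabilities` gives "cat" a probability of 2/3 and "mat" 1/3. The same functions should work on whitespace-separated Malayalam text, although a real model would need a much larger corpus and probably better tokenization.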