In my last blog post, I explained hyphenation of Indian language text in OpenOffice. In this post, I will explain how hyphenation can be done in web pages.
As I explained, the importance of hyphenation comes into the picture when we justify text. The length of the lines is controlled by the parent tags…. Unicode defines a special character for hyphenation called the soft hyphen (U+00AD). A browser breaks a line at a soft hyphen only when it needs to, and shows a hyphen at the break; otherwise the character stays invisible. In HTML, the plain hyphen is represented by the “-” character (&#45; or &#x2D;). The soft hyphen is represented by the character entity reference &shy; (&#173; or &#xAD;).
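To make the soft hyphen concrete, here is a minimal Python sketch that inserts U+00AD at given break positions and then spells it as &shy; for HTML output. The function names and the break positions are illustrative assumptions of mine, not part of any library mentioned in this post.

```python
import html

SOFT_HYPHEN = "\u00ad"  # Unicode U+00AD, written &shy; in HTML


def insert_soft_hyphens(word, break_positions):
    """Return `word` with a soft hyphen inserted before each index in break_positions."""
    pieces = []
    last = 0
    for pos in sorted(break_positions):
        pieces.append(word[last:pos])
        last = pos
    pieces.append(word[last:])
    return SOFT_HYPHEN.join(pieces)


def to_html_entities(text):
    """Escape text for HTML and spell each soft hyphen as the &shy; entity."""
    return html.escape(text).replace(SOFT_HYPHEN, "&shy;")


# Hypothetical break positions, chosen by hand for illustration.
hyphenated = insert_soft_hyphens("hyphenation", [2, 6])
print(to_html_entities(hyphenated))
```

The soft hyphens are invisible in the plain string, so the entity form is handy when you want to see where the break opportunities ended up.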
Hyphenator is a project that does exactly this. “Hyphenator.js brings client-side hyphenation of HTML-Documents on to every browser by inserting soft hyphens using hyphenation patterns and Frank M. Liangs hyphenation algorithm commonly known from LaTeX and Openoffice.”
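For readers who have not met pattern-based hyphenation before, the following is a rough Python sketch of the idea behind Liang's algorithm. It is not Hyphenator.js code. The pattern list is a small English set in the style of the TeX pattern files, used only for illustration; it is not the Indian language patterns discussed in this post, and the function names and the left_min/right_min defaults are my own assumptions.

```python
import re

SOFT_HYPHEN = "\u00ad"

# Toy English patterns for illustration only. A digit scores the gap at
# that position: odd values allow a break there, even values forbid it.
TOY_PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]


def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into ('hyph', [0, 0, 3, 0, 0])."""
    letters = re.sub(r"\d", "", pattern)
    scores = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            scores[pos] = int(ch)
        else:
            pos += 1
    return letters, scores


def hyphenate(word, patterns, left_min=2, right_min=2):
    """Insert soft hyphens into `word` wherever the combined pattern score is odd."""
    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + word.lower() + "."  # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)   # scores[i] is the gap before padded[i]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            pattern_scores = table.get(padded[start:end])
            if pattern_scores is not None:
                # Where patterns overlap, the highest score wins.
                for offset, value in enumerate(pattern_scores):
                    scores[start + offset] = max(scores[start + offset], value)
    pieces = []
    last = 0
    # The gap before word[k] is scores[k + 1] because of the leading dot.
    for k in range(left_min, len(word) - right_min + 1):
        if scores[k + 1] % 2 == 1:
            pieces.append(word[last:k])
            last = k
    pieces.append(word[last:])
    return SOFT_HYPHEN.join(pieces)


# Show the break points with a visible hyphen instead of the invisible U+00AD.
print(hyphenate("hyphenation", TOY_PATTERNS).replace(SOFT_HYPHEN, "-"))  # hy-phen-ation
```

This sketch looks up every substring in a dictionary, which is fine for a demonstration; real implementations typically store the patterns in a trie so that lookups stay fast with thousands of patterns.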
Hyphenator had not been tested with any non-Latin languages so far. I tried adding support for Indian languages, and the result was satisfactory. I used the same rules I defined for OpenOffice. Unlike Latin-script languages, Indian languages need very few hyphenation patterns, and performance is good as a result.
Update (18-Dec-2008): Thanks to Mathias Nater, the author of Hyphenator, the patterns have been added upstream.