YouTube to MPEG or Ogg video conversion

Here is a two-line method to convert a YouTube video to an Ogg (Theora/Vorbis) video. Locate clive and ffmpeg2theora in your package manager and install them. Then download the video:
$ clive http://in.youtube.com/watch?v=6JeZ5oeAEyU
(replace this with the YouTube address you want). It will create an FLV file. To convert it to an MPEG video file:
$ ffmpeg -i AmericaAmerica.flv AmericaAmerica.mpg
To convert it to an Ogg video file:
$ ffmpeg2theora AmericaAmerica.mpg
(replace the file name with the name of the file the previous command created). Done. You can see the ... [Read More]
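The two conversion steps can also be scripted. The following is a minimal sketch, assuming ffmpeg and ffmpeg2theora are installed and on the PATH; the function names `conversion_commands` and `convert` are made up for this illustration, and only the command lines shown in the post are used.

```python
import shutil
import subprocess
from pathlib import Path


def conversion_commands(flv_path):
    """Build the two command lines from the post for one FLV file."""
    flv = Path(flv_path)
    mpg = flv.with_suffix(".mpg")
    return [
        ["ffmpeg", "-i", str(flv), str(mpg)],  # FLV -> MPEG
        ["ffmpeg2theora", str(mpg)],           # MPEG -> Ogg
    ]


def convert(flv_path):
    """Run both steps in order, stopping at the first failure."""
    for cmd in conversion_commands(flv_path):
        if shutil.which(cmd[0]) is None:
            raise RuntimeError(f"{cmd[0]} is not installed")
        subprocess.run(cmd, check=True)
```

Calling `convert("AmericaAmerica.flv")` would run the same commands as typing them by hand; `conversion_commands` is kept separate so the command lines can be inspected without running anything.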

Dhvani 0.94 Released

A new version of Dhvani, the Indian Language Text to Speech System, is now available. The new version comes with the following improvements and features: support for 11 languages (Hindi, Punjabi, Gujarati, Marathi, Bengali, Oriya, Telugu, Kannada, Tamil, Malayalam and Pashto (Afghanistan)); pitch and tempo modification for speech; direct Ogg Vorbis speech output and optional WAV output format; C/C++ APIs for applications to use Dhvani as a shared library; a generic driver for Speech Dispatcher and integration with Orca through Speech Dispatcher; a Python binding through Speech Dispatcher; and an improved language detection algorithm. Dhvani documentation is available here. [Read More]
dhvani 

Language Detection and Spellcheckers

A few weeks back there was a discussion on the #indlinux IRC channel about automatic language detection. The idea is that spellcheckers, or any language tools, should not ask users to select a language. Instead, they should detect the language automatically. The idea is not new. There is a KDE bug here, and Ubuntu has this as a brainstorm idea. It seems MS Word already has this. A sample use case can be this: “While preparing a document in OpenOffice, I want to write in English as well as in Hindi. [Read More]
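For the English-plus-Hindi use case, one simple starting point is to look at which Unicode block each word's characters fall in. The sketch below is only an illustration under that assumption, not the method discussed on the channel: the names `SCRIPT_RANGES`, `detect_script` and `tag_words` are hypothetical, and a script is not the same thing as a language (Devanagari, for instance, is shared by Hindi and Marathi), so a real tool would need more than this.

```python
# Unicode block ranges for two Indic scripts (from the Unicode charts).
SCRIPT_RANGES = {
    "Devanagari": (0x0900, 0x097F),
    "Malayalam": (0x0D00, 0x0D7F),
}


def detect_script(word):
    """Return the script most characters of the word belong to, or None."""
    counts = {}
    for ch in word:
        if ch.isascii() and ch.isalpha():
            name = "Latin"
        else:
            cp = ord(ch)
            name = next(
                (n for n, (lo, hi) in SCRIPT_RANGES.items() if lo <= cp <= hi),
                None,
            )
        if name is not None:
            counts[name] = counts.get(name, 0) + 1
    return max(counts, key=counts.get) if counts else None


def tag_words(text):
    """Pair every whitespace-separated word with its detected script."""
    return [(word, detect_script(word)) for word in text.split()]
```

A spellchecker could then pick a dictionary per word from the detected script instead of asking the user to choose one for the whole document.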

Gedit plugin for showing unicode codepoints

While working with Unicode text, it is often necessary to get the Unicode code points of the text for debugging. Using Python, it is very easy to get the code points of a piece of text. The following examples illustrate it. `"സന്തോഷ്".decode("utf-8")` gives `u'\u0d38\u0d28\u0d4d\u0d24\u0d4b\u0d37\u0d4d'`, or `text = u"സന്തോഷ്"; print repr(text)` gives `u'\u0d38\u0d28\u0d4d\u0d24\u0d4b\u0d37\u0d4d'`. But then we need to open a Python console and type or paste the text. How can we make it easier? What if pressing the F12 key after selecting some text gave the code points? [Read More]
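The snippets above use Python 2 syntax. As a rough equivalent for Python 3, where every `str` is already Unicode, the code points can be listed with `ord`. This is a small sketch; the helper names `codepoints` and `describe` are made up for this example.

```python
import unicodedata


def codepoints(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def describe(text):
    """Return (code point, Unicode character name) pairs for text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# The same word as in the post, written with escapes to avoid paste errors.
word = "\u0d38\u0d28\u0d4d\u0d24\u0d4b\u0d37\u0d4d"
for cp, name in describe(word):
    print(cp, name)
```

`describe` adds the character names from the standard `unicodedata` module, which is often what is needed when debugging conjuncts and vowel signs.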
gedit  hack  plugin 

Screensavers in your language

I had written a blog post about hacking the glmatrix screensaver with the glyphs of our languages. Now I have those screensavers in the following languages: Hindi (Deb package, RPM), Gujarati (Deb package, RPM), Bengali (Deb package, RPM), Oriya (Deb package, RPM), Tamil (Deb package, RPM) and Malayalam (Deb package, RPM). Try them and enjoy! PS: I used the default fonts of Fedora 9 for these. [Read More]

Swanalekha M17N based Input Method for 11 Languages

Swanalekha is an input method originally designed for Malayalam. It works with SCIM as well as m17n. The input scheme is transliteration based, and it has a unique feature, a candidate list menu (which I will explain shortly). Now I have extended it to 10 other Indian languages. Before explaining how Swanalekha differs from other phonetic or transliteration-based input methods, let me explain some of the characteristics of transliteration. Transliteration-based input methods used to follow a strict one-to-one mapping from English letters to the letters of an Indian language. [Read More]

Software Freedom Day Celebration 2008: Language Computing Seminar and Install Fest

Software Freedom Day Celebration 2008, Language Computing Seminar and Install Fest. Malabar Christian College, Kozhikode. September 20, from 10 am to 5 pm. Organised by: Swathanthra Malayalam Computing; Malabar Christian College, Kozhikode; FOSS Cell, National Institute of Technology, Kozhikode. Free software is the humane and democratic face of information technology and a symbol of intellect. Free software exists to make the abilities we have acquired over generations available to everyone in the digital age, through the free exchange of knowledge and without chains or walls, and to put them to use for the progress of the world. The freedom that free software offers to understand, copy, improve and share is the foundation of a culture of free information development. [Read More]

Geo-visualisation, the FOSS way

My friend Jaisen Nedumpala has been developing a geo-visualisation system for Cheruvannoor Grama Panchayath (page in ml_IN) of Kerala. The system, developed using FOSS tools, is available here. “Development of effective geo-visualisation based decision support system (DSS) involved primarily data compilation from collateral sources, setting up appropriate hardware configuration, design of database and design of a spatial DSS.” Jaisen used software such as GRASS, UMN MapServer and ka-Map. He has written detailed documentation (in English) on how he developed it and which tools he used. [Read More]
foss 

UTF8Decoder

zabeehkhan was trying to code a Pashto (ps_AF) module for Dhvani, and he told me that “it is not saying anything” :). So I took the code and found the problem. Dhvani has a UTF-8 decoder and a UTF-16 converter. It was written by Dr. Ramesh Hariharan and had been tested only with the Unicode ranges of the languages of India. It was buggy for most other languages, and as a result the language detection logic and the text parsing logic were failing. [Read More]
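To show why a decoder tested on only one range can fail elsewhere: in UTF-8, characters of the Indic blocks take three bytes each, while Pashto letters from the Arabic block (U+0600 to U+06FF) take two. The following is a generic sketch of a decoder that handles all four sequence lengths; it is not Dhvani's code, the function name `decode_utf8` is made up for this example, and it does not reject overlong encodings or surrogates.

```python
def decode_utf8(data):
    """Decode a bytes object into a list of Unicode code points."""
    codepoints = []
    i = 0
    while i < len(data):
        first = data[i]
        # The lead byte tells how many bytes the sequence has.
        if first < 0x80:
            length, cp = 1, first            # 0xxxxxxx
        elif first >> 5 == 0b110:
            length, cp = 2, first & 0x1F     # 110xxxxx
        elif first >> 4 == 0b1110:
            length, cp = 3, first & 0x0F     # 1110xxxx
        elif first >> 3 == 0b11110:
            length, cp = 4, first & 0x07     # 11110xxx
        else:
            raise ValueError(f"invalid lead byte at offset {i}")
        if i + length > len(data):
            raise ValueError(f"truncated sequence at offset {i}")
        # Each continuation byte (10xxxxxx) contributes six more bits.
        for byte in data[i + 1:i + length]:
            if byte >> 6 != 0b10:
                raise ValueError(f"invalid continuation byte near offset {i}")
            cp = (cp << 6) | (byte & 0x3F)
        codepoints.append(cp)
        i += length
    return codepoints
```

Checking the result against Python's built-in decoder for text from several scripts, as in `decode_utf8(s.encode("utf-8")) == [ord(c) for c in s]`, is a quick way to test beyond a single Unicode range.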