Conferences: FOSS.IN and NCIDEEE

FOSS.IN 2009 starts on 1st December. I wanted to attend all 5 days, but I have another conference in Chennai from Dec 1st to 3rd. I am attending the National Conference on ICTs for the Differently-abled/Underprivileged Communities in Education, Employment and Entrepreneurship 2009 (NCIDEEE 2009) at Loyola College, Chennai. So I will miss the first 3 days of foss.in.
We have a workout on Project Silpa during foss.in. I am also planning a workout with Debayan and Jinesh to get the tesseract-indic OCR working with Malayalam.

See you at foss.in!

Dhvani 0.94 Released

A new version of Dhvani, the Indian Language Text to Speech System, is available now. The new version comes with the following improvements/features

Dhvani documentation is available here.

Binary packages and source code are available here

Thanks

  • Rahul Bhalerao for the Marathi module and patches
  • Zabeehkhan for the Pashto module
  • Nirupama, CDAC Chennai and CDAC Noida people for testing and reporting bugs
  • NRCFOSS Chennai, Krishnakanth Mane and many others for feedback
  • Amida Simputer team for patches to the Telugu module, especially the Telugu number reading logic
  • Debayan and Roshan for testing and reporting problems

There was a good amount of code change in this version. Still, there are many improvements to make in the language modules and the synthesizer. Some of the language modules require developers who speak that language. The synthesizer got some improvements but still requires some research to make the speech more natural. So your feedback, suggestions, bug reports and patches are valuable.

PS: A note for quick usage after installing from the binary packages: after installing the deb or rpm, open gedit, go to Edit->Preferences->Plugins and enable External Tools. Dhvani will be available as a plugin there. Select some text in any of the supported languages and click the Dhvani menu.

UTF8Decoder

zabeehkhan was trying to code a Pashto (ps_AF) module for dhvani, and he told me that “it is not saying anything” :). So I took the code and found the problem. Dhvani has a UTF-8 decoder and UTF-16 converter. It was written by Dr. Ramesh Hariharan and was tested only with the Unicode ranges of the languages of India. It was buggy for most other languages, and as a result the language detection logic and the text parsing logic were failing. So I did some googling, went through the code tables of gucharmap and got some helpful information from here and here

So here is my new UTF8Decoder and converter

/*
UTF8Decoder.c
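This program converts a UTF-8 encoded string to a sequence of UTF-16 code units,
printed in hexadecimal.

UTF-8 is a variable-width encoding of Unicode.
UTF-16 uses one 16-bit unit for each character in the Basic Multilingual Plane
(U+0000 to U+FFFF) and a pair of units (a surrogate pair) for characters beyond it.
This decoder handles only the single-unit case.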

A UTF-8 decoder must not accept UTF-8 sequences that are longer than necessary to
encode a character. For example, the character U+000A (line feed) must be accepted from
a UTF-8 stream only in the form 0x0A, but not in any of the following five possible overlong forms:

  0xC0 0x8A
  0xE0 0x80 0x8A
  0xF0 0x80 0x80 0x8A
  0xF8 0x80 0x80 0x80 0x8A
  0xFC 0x80 0x80 0x80 0x80 0x8A

Ref: UTF-8 and Unicode FAQ for Unix/Linux http://www.cl.cam.ac.uk/~mgk25/unicode.html

Author: Santhosh Thottingal <santhosh.thottingal at gmail.com>
License: This program is licensed under GPLv3 or later version(at your choice)
*/
#include<stdlib.h>
#include<stdio.h>
#include<string.h>
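unsigned short
utf8_to_utf16 (const unsigned char *text, int *ptr)
{
  unsigned int c;		/* code point being assembled */
  unsigned int min;		/* smallest code point legal for this length */
  int trailing;			/* continuation bytes still expected */
  unsigned char lead = text[(*ptr)++];	/* always consume the lead byte */

  if (lead < 0x80)		/* ASCII: a single byte, 0x00..0x7F */
    return lead;

  if (lead >= 0xC2 && lead < 0xE0)	/* 110xxxxx: 2-byte sequence */
    {
      c = lead & 0x1F;
      trailing = 1;
      min = 0x80;
    }
  else if (lead >= 0xE0 && lead < 0xF0)	/* 1110xxxx: 3-byte sequence */
    {
      c = lead & 0x0F;
      trailing = 2;
      min = 0x800;
    }
  else if (lead >= 0xF0 && lead < 0xF8)	/* 11110xxx: 4-byte sequence */
    {
      c = lead & 0x07;
      trailing = 3;
      min = 0x10000;
    }
  else				/* stray continuation byte or invalid lead */
    return 0xFFFD;

  for (; trailing; trailing--)
    {
      if ((text[*ptr] & 0xC0) != 0x80)
        return 0xFFFD;		/* truncated: leave the next byte unread */
      c = (c << 6) | (text[(*ptr)++] & 0x3F);
    }

  /* Reject overlong forms and anything that does not fit in one
     16-bit unit (characters above U+FFFF need a surrogate pair). */
  if (c < min || c > 0xFFFF)
    return 0xFFFD;
  return (unsigned short) c;
}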


/* for testing */
int
main ()
{
  char *instr = "സന്തോഷ് തോട്ടിങ്ങല്‍";	/* my name :) */
  int length = strlen (instr);
  int i = 0;

  for (; i < length;)
    {
      printf ("0x%.4x ", utf8_to_utf16 ((unsigned char *) instr, &i));
    }
  printf ("\n");
/* output is:
0x0d38 0x0d28 0x0d4d 0x0d24 0x0d4b 0x0d37 0x0d4d 0x0020 0x0d24 0x0d4b 0x0d1f 0x0d4d 0x0d1f 0x0d3f 0x0d19 0x0d4d 0x0d19 0x0d32 0x0d4d 0x200d 
*/

  return 0;
}
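
The overlong-form rule quoted in the header comment is easy to try out in a few lines of Python. This is only a sketch for experimenting, not the code dhvani uses: the function name decode_utf8_strict and the table layout are made up for illustration, and it does not check for surrogates or for code points above U+10FFFF.

```python
def decode_utf8_strict(data):
    """Decode UTF-8 bytes into a list of code points, rejecting overlong forms."""
    # Each row: (lead mask, lead bits, payload mask,
    #            continuation bytes, smallest legal code point)
    patterns = [
        (0x80, 0x00, 0x7F, 0, 0x0000),    # 0xxxxxxx
        (0xE0, 0xC0, 0x1F, 1, 0x0080),    # 110xxxxx
        (0xF0, 0xE0, 0x0F, 2, 0x0800),    # 1110xxxx
        (0xF8, 0xF0, 0x07, 3, 0x10000),   # 11110xxx
    ]
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        for mask, bits, payload, trailing, minimum in patterns:
            if (lead & mask) == bits:
                break
        else:
            # Stray continuation byte, or a 0xF8/0xFC style lead byte
            raise ValueError("invalid lead byte 0x%02X at offset %d" % (lead, i))
        tail = data[i + 1:i + 1 + trailing]
        if len(tail) != trailing or any((b & 0xC0) != 0x80 for b in tail):
            raise ValueError("truncated sequence at offset %d" % i)
        cp = lead & payload
        for b in tail:
            cp = (cp << 6) | (b & 0x3F)
        if cp < minimum:
            # The value would have fitted in a shorter sequence
            raise ValueError("overlong encoding of U+%04X at offset %d" % (cp, i))
        code_points.append(cp)
        i += 1 + trailing
    return code_points
```

Calling decode_utf8_strict(b"\x0a") returns the line feed, while the overlong forms listed in the comment, such as b"\xc0\x8a" or b"\xe0\x80\x8a", raise ValueError.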

There may already be existing libraries for this, but writing a simple one ourselves is fun and a good learning experience.

For example, in Python (2.x syntax shown), to get the UTF-16 code sequence for a unicode string, we can use this:

str=u"സന്തോഷ്‌"
print repr(str)

This gives the following output

u'\u0d38\u0d28\u0d4d\u0d24\u0d4b\u0d37\u0d4d'
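
On Python 3 the print statement and the u prefix in the output are gone, and repr() no longer escapes non-ASCII characters. A rough equivalent, as a sketch: the helper name utf16_code_units is made up here, while ascii(), str.encode() and struct.unpack() are standard library calls.

```python
import struct


def utf16_code_units(text):
    """Return the UTF-16 code units of a str as a list of integers."""
    data = text.encode("utf-16-be")   # big-endian, no byte order mark
    return list(struct.unpack(">%dH" % (len(data) // 2), data))


name = "സന്തോഷ്"
# ascii() gives \uXXXX escapes, much like repr() did on Python 2
print(ascii(name))
# The same code units in the 0x.... style used by the C program
print(" ".join("0x%04x" % unit for unit in utf16_code_units(name)))
```

Unlike the C function, this also handles characters outside the Basic Multilingual Plane: they come back as two code units, a surrogate pair.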

say_namaskaar.c

/* say_namaskaar.c
 *  This is a sample C code using dhvani text to speech API which I am 
 *  developing now and planning to release soon. New version of dhvani 
 *  will provide a shared library libdhvani and it allows other C or C++
 *  applications to use dhvani synthesizer. Tamil and Marathi modules, pitch, tempo 
 *  control etc are the features for the coming release.
 *  I need to prepare documentation, fix many bugs, test, commit the files in cvs ...
 *  Looking for some free time for all these...
 *  Visit http://dhvani.sourceforge.net
 */

/* compile with gcc -o namaskaar say_namaskaar.c -ldhvani */
#include <dhvani/dhvani_lib.h>
int main(int argc, char *argv[]) {
    dhvani_options options;
    /* Set the pitch and tempo of the speech */
    options.tempo = -10.0; /* reduce the speed by 10%  */
    options.pitch = 2.0;    /* increase the pitch by 2 semitones */
    options.rate = 16000;  /* 16KHz Sampling rate */
    /* Initialize dhvani */
    dhvani_init(&options);
    /* Say Namaskar */
    dhvani_say("नमस्कार",  &options);
    /* close the synthesizer */
    dhvani_close();
    return 0;
}
 
/*  We can write a blog post in C too :P . Syntax highlighted by Code2HTML */

Dhvani Now Speaks Marathi

Thanks to Rahul Bhalerao, who wrote the Marathi module, Dhvani (the Indian Language Text to Speech System) can now speak 10 Indian languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Panjabi, Tamil, and Telugu.

Rahul also gave some patches for the Hindi module and for some other bugs. The code is available in CVS.

The automatic language detection algorithm will not work for Marathi, since Marathi uses the Devanagari script and I have assigned the Devanagari Unicode range to Hindi for language detection. So it requires a language switch like “dhvani -l mr inputfile”
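
To see why a script-based detector cannot tell Marathi from Hindi, here is a small Python sketch of range-based detection. It is not dhvani's actual code: the table SCRIPT_RANGES and the function detect_language are hypothetical names, and only the Unicode block boundaries are taken from the standard. The override argument plays the role of the -l switch.

```python
# (first code point, last code point, language code) per Unicode block.
# Devanagari is shared by Hindi and Marathi, but a range table can map
# it to only one language; here it goes to Hindi.
SCRIPT_RANGES = [
    (0x0900, 0x097F, "hi"),  # Devanagari
    (0x0980, 0x09FF, "bn"),  # Bengali
    (0x0A00, 0x0A7F, "pa"),  # Gurmukhi (Panjabi)
    (0x0A80, 0x0AFF, "gu"),  # Gujarati
    (0x0B00, 0x0B7F, "or"),  # Oriya
    (0x0B80, 0x0BFF, "ta"),  # Tamil
    (0x0C00, 0x0C7F, "te"),  # Telugu
    (0x0C80, 0x0CFF, "kn"),  # Kannada
    (0x0D00, 0x0D7F, "ml"),  # Malayalam
]


def detect_language(text, override=None):
    """Guess the language from the first character in a known script range.

    An explicit override (like a command-line language switch) always wins.
    Returns None when no character falls in any known range.
    """
    if override:
        return override
    for ch in text:
        cp = ord(ch)
        for first, last, lang in SCRIPT_RANGES:
            if first <= cp <= last:
                return lang
    return None
```

With this table, any Devanagari text is reported as "hi", so Marathi input only gets the right module when the caller passes override="mr".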

Many new features for dhvani are in development, including pitch and tempo control of the generated speech. I am trying to improve the code quality too.

I demonstrated the Tamil module at NRCFOSS, AU-KBC Centre, Chennai a few days back, and Amachu offered to help improve the Tamil pronunciation rules.

For those who are interested in the Marathi module, I have some sample speech files generated by dhvani in ogg format. The text is taken from an article about the Marathi language in the Marathi Wikipedia. Here is the article and here is the exact text used for the speech
1. With default pitch and tempo: male voice
2. Female voice by positive pitch shift: a feature in development

Dhvani has an IRC channel now: #dhvani at freenode

Can’t Speak? Dhvani will speak for you!

Dhvani can help not only blind users but also users who cannot speak. I will explain how dhvani can act as your mouth using KMouth.
KMouth is a KDE accessibility application that acts as a text-to-speech front end. It enables people who cannot speak to let their computers speak for them. It keeps a history of spoken sentences from which the user can select sentences to be spoken again. It learns the words the user types and offers autocompletion. It also includes a phrasebook in which you can store commonly used phrases for quick access.
Here is how dhvani can be used with KMouth.
Open KMouth: KMenu->Utilities->Accessibility->KMouth. Install it if it is not already installed.
You will get a configuration window. Give the “Command to speak text” as dhvani %f

Done. Now you can type some text in KMouth and ask it to speak.

To avoid typing words that are used often, create a phrasebook. Refer to the KMouth Handbook for that. You can also add a word list so that you get autocompletion while typing. The Handbook covers that too. It is easy and just a matter of giving KMouth some text file to learn from.
I hope it will be helpful for users who cannot speak, even though there are some practical problems like keeping the computer with them…

For more information about dhvani, how to install it, etc., see the documentation

Dhvani – KDE Integration.

It is possible to integrate the Dhvani Indian Language TTS into the KDE desktop through its TTS system, KTTS. With this, dhvani can read the text in kate, kedit, kwrite and Konqueror. You can even listen to the text of web pages in Konqueror.
Dhvani can be integrated with KTTS using its Command plugin feature. To do this, go to Control Center–>Regional and Accessibility–>Text-to-Speech–>Talker tab. Add a new synthesizer.


Select the synthesizer type as Command and the language as Other. You can select any language, since Dhvani does not need a language parameter and detects the language automatically.
Give the synthesizer command as dhvani %f

Move this synthesizer to the top of the list of synthesizers and click Apply. Done.
Now open a UTF-8 text in any of the editors mentioned above, or open a web page in any of the supported languages. From the Tools menu choose Speak Text and listen!
For more information about dhvani, how to install it, etc., see the documentation

Creating audio books using Dhvani

Dhvani can be used for creating audiobooks in any of the supported languages (Hindi, Malayalam, Telugu, Kannada, Oriya, Bengali, Gujarati, Panjabi).
First of all, get the latest dhvani source code from CVS on SourceForge. Compile it and install it.
To create an audiobook, follow these steps.
You need the text in UTF-8 format. There is no need to specify the language; Dhvani will detect it automatically.

dhvani -o audiobook.wav textfile
oggenc -B 16 -C 1 -R 16000 audiobook.wav

Now you have a file called audiobook.ogg. If you prefer ogg, then your audiobook is ready. If you want the file in mp3 format:

oggdec audiobook.ogg

(This will create a file named audiobook.ogg.wav )
lame --preset 192 -ms -h audiobook.ogg.wav

(install lame if it is not present using your package manager)

Now your mp3 file is ready. Transfer it to your music player and enjoy!
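
If you make audiobooks often, the steps above can be wrapped in a small Python script. This is a sketch only: the function names audiobook_commands and make_audiobook are made up, the command lines are copied as they appear in this post, and it assumes dhvani, oggenc, oggdec and lame are installed and on the PATH.

```python
import shlex
import subprocess


def audiobook_commands(textfile, basename="audiobook", mp3=False):
    """Build the command lines from this post as argument lists."""
    wav = basename + ".wav"
    ogg = basename + ".ogg"
    commands = [
        ["dhvani", "-o", wav, textfile],
        ["oggenc", "-B", "16", "-C", "1", "-R", "16000", wav],
    ]
    if mp3:
        # oggdec writes <name>.ogg.wav, which lame then encodes
        commands.append(["oggdec", ogg])
        commands.append(["lame", "--preset", "192", "-ms", "-h", ogg + ".wav"])
    return commands


def make_audiobook(textfile, basename="audiobook", mp3=False):
    """Run each step in order, stopping at the first one that fails."""
    for argv in audiobook_commands(textfile, basename, mp3):
        print("running:", " ".join(shlex.quote(arg) for arg in argv))
        subprocess.run(argv, check=True)
```

For example, make_audiobook("story.txt", mp3=True) would run all four steps for a UTF-8 file named story.txt.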

I have a sample Malayalam Audio book here

Note: The speech produced for languages other than Hindi and Malayalam may not follow their pronunciation rules. There are two solutions for this:
a) Teach me that language 😉 or
b) Submit a patch to fix that language module

You can find the Dhvani documentation here

Dhvani rewrite

I started a rewrite of the Dhvani architecture, keeping the algorithm the same. It is based on the KISS principle. Instead of a client-server architecture, my plan is a single executable. This will help me integrate it with KTTS, kate etc. Now I know how to integrate Dhvani with KTTS.
Studied autoconf and automake for this…
More and more items in my TODO list…