This code
#!/usr/bin/env python
# -*- coding: utf-8 -*-
ml_string = u"സന്തോഷ് हिन्दी"
for ch in ml_string:
    if ch.isalpha():
        print ch
gives this output
സ
ന
ത
ഷ
ह
न
द
So isalpha() fails for all the matra (vowel) signs of Indian languages. This is a known bug in glibc.
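To see exactly which characters are being dropped, it may help to print the Unicode general category of each one next to the isalpha() result. The following is only a rough sketch: it uses the standard unicodedata module, and classify is a helper name I made up for illustration. I am assuming Python 3 here, where isalpha() on a string is true only for characters in the letter categories (those starting with "L"), so the vowel signs and the virama, which are marks (categories starting with "M"), should show up as False.

```python
# -*- coding: utf-8 -*-
import unicodedata

ml_string = u"സന്തോഷ് हिन्दी"


def classify(text):
    """Hypothetical helper: return (char, general category, isalpha)
    for every non-space character in text."""
    return [(ch, unicodedata.category(ch), ch.isalpha())
            for ch in text if not ch.isspace()]


# Print the code point, its category and the isalpha() result, one per line.
for ch, cat, alpha in classify(ml_string):
    print("U+%04X %s %s" % (ord(ch), cat, alpha))
```

If the lines marked False all carry a category beginning with "M", then the result comes from how the characters are classified in Unicode rather than from a locale lookup.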
Does anybody know whether Python internally uses glibc functions for these basic string operations, or whether it uses a separate character database like Qt does?