This code
#!/usr/bin/env python
# -*- coding: utf-8 -*-
ml_string = u"സന്തോഷ് हिन्दी"
for ch in ml_string:
    # print only the characters that isalpha() accepts
    if ch.isalpha():
        print ch
gives this output
സ ന ത ഷ ह न द
It fails for all matra (vowel) signs of Indian languages. This is a known bug in glibc.
Does anybody know whether Python internally uses glibc functions for these basic string operations, or uses a separate character database like Qt does?
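One way to see which characters are being skipped is to print the Unicode general category of each one with the standard unicodedata module. This is only a diagnostic sketch: the describe helper is a made-up name for illustration, and the snippet is written to run on both Python 2 and Python 3. If the skipped signs come out as marks (Mn or Mc) rather than letters (Lo), that would suggest isalpha() on a unicode object is simply following the Unicode categories.

```python
# -*- coding: utf-8 -*-
import unicodedata

ml_string = u"സന്തോഷ് हिन्दी"


def describe(text):
    """Return (code point, general category, isalpha) for each non-space character.

    Hypothetical helper, for illustration only.
    """
    return [
        (u"U+%04X" % ord(ch), unicodedata.category(ch), ch.isalpha())
        for ch in text
        if not ch.isspace()
    ]


rows = describe(ml_string)
for code_point, category, is_alpha in rows:
    # one line per character: code point, category, result of isalpha()
    print("%s %s %s" % (code_point, category, is_alpha))
```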
The Python source code seems to suggest so
From stringobject.c,
.......
    /* Shortcut for single character strings */
    if (PyString_GET_SIZE(self) == 1 &&
        isalpha(*p))
        return PyBool_FromLong(1);
.......
Re: The Python source code seems to suggest so
Thanks Sayamindu,
So once the glibc patches get into the distros, let us hope that these problems will disappear.
But the Qt problem remains.
looks like a feature
That seems to be a feature. It's perfectly removing the matras.
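If the goal is to keep the matras rather than drop them, one possible workaround is to test the Unicode category yourself and attach each combining mark to the letter before it. The sketch below is an assumption about what is wanted, not something from Python or glibc: letters_with_marks is a made-up helper, and it does not implement full grapheme cluster rules (a conjunct such as न्द is split into न् and द).

```python
# -*- coding: utf-8 -*-
import unicodedata


def letters_with_marks(text):
    """Group each letter with the combining marks that directly follow it.

    Hypothetical helper: a simplification, not full Unicode segmentation.
    """
    clusters = []
    open_cluster = False  # True while marks may still attach to the last letter
    for ch in text:
        major = unicodedata.category(ch)[0]
        if major == "L":
            clusters.append(ch)      # a letter starts a new cluster
            open_cluster = True
        elif major == "M" and open_cluster:
            clusters[-1] += ch       # a mark attaches to the previous letter
        else:
            open_cluster = False     # spaces, punctuation, stray marks are dropped
    return clusters


clusters = letters_with_marks(u"സന്തോഷ് हिन्दी")
for cluster in clusters:
    print(cluster)
```

With this approach the loop still prints seven items for the sample string, but each item carries its vowel sign or virama instead of losing it.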