Python isalpha is buggy

This code

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Print each character of a mixed Malayalam/Hindi string that isalpha() accepts.
ml_string = u"സന്തോഷ്  हिन्दी"
for ch in ml_string:
    if ch.isalpha():
        print(ch)

gives this output

സ
ന
ത
ഷ
ह
न
द

It fails for all the matra (vowel) signs of Indian languages. This is a known bug in glibc.
Does anybody know whether Python internally uses glibc functions for these basic string operations, or a separate character database like Qt does?
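
To see exactly which characters are being rejected, here is a small sketch that prints the Unicode general category of each character next to the isalpha() result. It uses the standard unicodedata module; the helper name describe is made up for illustration. The Python 3 documentation defines str.isalpha() in terms of the letter categories (Lm, Lt, Lu, Ll, Lo); whether the Python 2 build used above behaves the same way is an assumption.

```python
# -*- coding: utf-8 -*-
import unicodedata


def describe(text):
    """Return (char, general category, isalpha result) for each non-space char.

    'describe' is a hypothetical helper name, not part of any library.
    """
    return [(ch, unicodedata.category(ch), ch.isalpha())
            for ch in text if not ch.isspace()]


ml_string = u"സന്തോഷ്  हिन्दी"
for ch, category, alpha in describe(ml_string):
    # Code point, general category (Lo = letter, Mc/Mn = combining mark), isalpha()
    print(u"U+%04X %s %s" % (ord(ch), category, alpha))
```

If the signs show up with a mark category (Mc or Mn) rather than a letter category, a check such as `ch.isalpha() or unicodedata.category(ch).startswith("M")` might be a workable stopgap for keeping them with their words.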

3 thoughts on “Python isalpha is buggy”

  1. The Python source code seems to suggest so

    From stringobject.c,


    .......
    /* Shortcut for single character strings */
    if (PyString_GET_SIZE(self) == 1 &&
        isalpha(*p))
        return PyBool_FromLong(1);
    .......

    1. Re: The Python source code seems to suggest so

      Thanks Sayamindu,
      So once the glibc patches get into the distros, let us hope that these problems will disappear.
      But the Qt problem remains.
