  • Find non-ascii characters from a UTF-8 string

    - by user10607
    I need to find the non-ASCII characters in a UTF-8 string.

    My understanding: UTF-8 is a superset of ASCII in which the values 0-127 are the ASCII characters. So if a character's value in a UTF-8 string is not between 0 and 127, it is not an ASCII character, right? Please correct me if I'm wrong.

    Based on that understanding, I have written the following code in C.

    Note: I'm using gcc on Ubuntu to compile and run the C code.

    The UTF-8 string is: xvab c

        long i;
        char arr[] = "xvab c";
        printf("length : %lu \n", sizeof(arr));
        for(i=0; i<sizeof(arr); i++){
            char ch = arr[i];
            if (isascii(ch))
                printf("Ascii character %c\n", ch);
            else
                printf("Not ascii character %c\n", ch);
        }

    Which prints the following output:

        length : 9
        Ascii character x
        Not ascii character
        Not ascii character ?
        Not ascii character ?
        Ascii character a
        Ascii character b
        Ascii character
        Ascii character c
        Ascii character

    To the naked eye the length of xvab c seems to be 6, but in the code it comes out as 9. The correct answer for xvab c is 1, i.e. it has only one non-ASCII character, but the output above reports "Not ascii character" three times.

    How can I find the non-ASCII characters in a UTF-8 string correctly? Please advise.
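The numbers in the question are consistent with how UTF-8 works: a non-ASCII character is stored as one lead byte followed by one to three continuation bytes of the form 10xxxxxx, and every one of those bytes is above 127. A loop that tests each byte therefore reports one three-byte character three times, and sizeof also counts the terminating NUL. Below is a minimal sketch of counting characters rather than bytes. It assumes the input is valid, NUL-terminated UTF-8, and the function name count_non_ascii is made up for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper: count the code points outside ASCII in a
 * NUL-terminated string. Assumes the input is valid UTF-8; malformed
 * sequences are not detected. */
static size_t count_non_ascii(const char *s)
{
    size_t count = 0;

    /* Read the bytes as unsigned so that values above 127 do not
     * become negative on platforms where char is signed. */
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; p++) {
        /* 0x00-0x7F: ASCII byte, skip.
         * 10xxxxxx : continuation byte, already counted with its lead byte.
         * Any other byte >= 0x80 starts a new multi-byte character. */
        if (*p >= 0x80 && (*p & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

As an illustration, if the character lost from the sample string is a three-byte one such as U+221A (an assumption, chosen because it matches the reported length of 9), then count_non_ascii("x\xE2\x88\x9A" "ab c") returns 1 while strlen reports 8 bytes. The literal is split in two so that the hex escape \x9A does not swallow the following 'a'.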
