printf field width: bytes or chars?

Posted by leonbloy on Stack Overflow
Published on 2010-05-08T01:21:39Z


The printf/fprintf/sprintf family supports a width field in its format specifiers. I have a question about the case of (non-wide) char array arguments:

Is the width field supposed to mean bytes or characters?

What is the correct (or de facto) behaviour if the char array holds, say, a raw UTF-8 string? (I know that I should normally use a wide character type; that's not the point.)

For example, in

char s[] = "ni\xc3\xb1o";  // utf8 encoded "niño"
fprintf(f,"%5s",s);

Is that function supposed to treat the width as just 5 bytes (plain C chars), leaving you responsible for misalignment or other problems when two bytes make up a single textual character?

Or is it supposed to compute the length of the array in "textual characters" (decoding it, presumably according to the current locale)? In the example, that would mean finding that the string has 4 Unicode characters, so one space of padding would be added.

© Stack Overflow or respective owner
