This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: "C" character set (again)


2010/1/8 Thomas Wolff:
>> No, this is about the C locale only. Lots of people and programs make
>> assumptions about the C locale which may not be valid according to
>> POSIX, but which nevertheless hold true for Linux and most (if not
>> all) other Unices, including Cygwin 1.5. The most important assumption
>> is that the C locale is 8-bit clean.
>>
> And byte-transparent, right?
> Which gets me back to this printf issue; actually your point here seems to
> approve my arguments there, if only I had explicitly restricted them to the
> C locale.
> Could you agree that functions like sprintf should handle their char *
> arguments byte-transparently if acting in the C locale?

According to POSIX: no. The format string is a character string, and
all that POSIX guarantees for the C locale is the portable character
set.

But if Linux compatibility is required: yes.

I've had a quick look at the newlib vfprintf source, and what
currently happens is that an invalid byte in the format string is
treated exactly the same as a null terminator: processing ends, no
error is returned. Perfectly valid according to POSIX, since behaviour
at that point is undefined.

But I think it would actually be quite easy to wave invalid bytes
through anyway: print the byte, reset the multibyte conversion state,
and continue processing the string. Still valid according to POSIX,
but also Linux-compatible. I'll propose a patch.
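
The pass-through idea can be sketched roughly as follows. This is only an
illustration, not the newlib vfprintf code and not the proposed patch:
copy_passthrough is a hypothetical helper that decodes a string with the
standard mbrtowc() and, whenever a byte does not form a valid character
in the current locale, copies that single byte unchanged, resets the
conversion state, and carries on with the rest of the string.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (assumption, not part of newlib): copy src into dst,
   which holds dstsize bytes, decoding with mbrtowc() in the current locale.
   Bytes that do not decode are passed through unchanged instead of ending
   processing. Returns the number of bytes written, excluding the final
   null terminator. */
size_t copy_passthrough(char *dst, size_t dstsize, const char *src)
{
    mbstate_t state;
    size_t remaining = strlen(src);
    size_t out = 0;

    if (dstsize == 0)
        return 0;
    memset(&state, 0, sizeof state);

    while (remaining > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, src, remaining, &state);

        if (n == (size_t)-1 || n == (size_t)-2) {
            /* Invalid or incomplete sequence: wave one byte through
               and reset the multibyte conversion state. */
            n = 1;
            memset(&state, 0, sizeof state);
        } else if (n == 0) {
            /* Decoded a null character: end of string. */
            break;
        }

        if (n > dstsize - 1 - out)
            break;              /* no room left for this sequence */

        memcpy(dst + out, src, n);
        out += n;
        src += n;
        remaining -= n;
    }

    dst[out] = '\0';
    return out;
}
```

Called on a string such as "ab\xff\xfe" "cd", the helper returns 6 and
leaves all six bytes in the output, whether or not the locale in effect
accepts 0xff and 0xfe as characters. A real vfprintf would additionally
have to look for '%' among the decoded characters; that part is omitted
here.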

Andy

