This is the mail archive of the cygwin-developers mailing list for the Cygwin project.
Re: "C" character set (again)
On Jan 15 07:32, Andy Koppe wrote:
> 2010/1/10 Corinna Vinschen:
> > Andy Koppe wrote:
> >> So how about leaving the initial __mbtowc and __wctomb pointers as
> >> they are?
> >
> > It feels so unclean...
>
> Does that matter, as long as everything's cleaned up by the time the
> actual program starts? Speaking of which, what locale context are C++
> global constructors executed in? Is the filesystem/console charset
> already set according to the environment by that point?
Yes.
>
> Here's another concern regarding C changing to ASCII: what would a
> user who sets LANG=C (or LANG=C.ASCII, for that matter) expect to
> happen to filenames? Currently, anything non-ASCII would turn into
> ^X-escaped UTF-8. However, since ASCII doesn't have anything beyond
> 0x7F (btw, thanks for patching newlib accordingly), the ^X isn't
> actually necessary and filenames in C(.ASCII) could just use straight
> UTF-8 anyway.
>
> Therefore, would something like the patch below make sense?
I've been pondering this for at least two weeks now. I'm still not sure what
new problems we add by reverting C to ASCII. As long as the underlying
charset is UTF-8, I don't see any problems, but that could simply be the
result of me being too unimaginative.
Anyway, I have something like your patch already in my locale code. I'm
not setting the cygheap->locale.charset to UTF-8, though. This should
avoid unnecessary calls to internal_setlocale in child processes. I'll
apply that, together with setting C to ASCII by default.
And a matching change to the docs.
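
For illustration only, here is a minimal sketch of the idea discussed above. It is not the actual patch and not Cygwin's real code: filename_charset and utf8_encode are hypothetical helpers, and the charset names are assumed to be plain strings. It shows why the ^X escape is unnecessary in an ASCII locale: every byte of a UTF-8 multibyte sequence is 0x80 or above, so it can never be mistaken for an ASCII character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Cygwin's real API: pick the charset used to
   convert filenames, given the charset of the current locale. */
static const char *
filename_charset (const char *locale_charset)
{
  assert (locale_charset != NULL);
  /* ASCII defines nothing above 0x7F, so UTF-8 multibyte sequences
     (whose bytes are all >= 0x80) cannot collide with it. */
  if (strcmp (locale_charset, "ASCII") == 0)
    return "UTF-8";
  return locale_charset;
}

/* Encode one code point from the Basic Multilingual Plane as UTF-8.
   Surrogates are not checked.  Returns the number of bytes written;
   OUT must have room for at least 3 bytes. */
static size_t
utf8_encode (unsigned int cp, unsigned char *out)
{
  assert (cp < 0x10000);
  if (cp < 0x80)
    {
      /* ASCII range: the UTF-8 byte is identical to the ASCII byte. */
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  out[0] = (unsigned char) (0xE0 | (cp >> 12));
  out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
  out[2] = (unsigned char) (0x80 | (cp & 0x3F));
  return 3;
}
```

With these helpers, a filename character such as U+00E4 would be stored as the two bytes 0xC3 0xA4 under LANG=C, with no ^X prefix, while plain ASCII characters keep their single-byte form.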
Corinna
--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat