This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: default encoding (was: Re: GNU screen hangs)


On 2009-08-30, Andy Koppe <andy.koppe@gmail.com> wrote:
> If a locale is specified without an encoding, Cygwin 1.7 uses the
> Windows system's default "ANSI" codepage, i.e. CP1252 or such like.
>
> Presumably X implements the encodings itself rather than use
> setlocale(LC_CTYPE, "") and rely on the standard conversion functions?
> Hence, for proper interoperability, it would need to duplicate the
> fallback to the Windows ANSI codepage as well.
>
> Unfortunately there doesn't seem to be a standard interface for
> finding out what charset is being used with a locale setting that
> doesn't explicitly specify one.

I have LC_CTYPE=en_US.UTF-8, of course. And still Xlib fails.

>> Another problem is that after an upgrade a couple of
>> months ago, various Python software (duplicity and eyeD3 at
>> least) stopped working with UTF-8 file names (and probably
>> other input too). This is fixed by adding the call
>>
>>   locale.setlocale(locale.LC_CTYPE, "")
>>
>> in the programs. Not sure where the fault is, or if it
>> has been fixed by now.
>
> Strictly speaking, the default "C" locale is ASCII only, so programs
> shouldn't rely on anything that happens to be working on a particular
> system. Having said that, handling of non-ASCII characters in Cygwin's
> C locale has indeed changed. Not sure how and why though. See my "The
> C Locale" post.

I'm not sure how this is relevant. The problem seems to be
that since that one update (which might have been a minor
version change in Python), Python programs are no longer in
multibyte/locale-aware mode by default. The call above
enables that mode, my setting being LC_CTYPE=en_US.UTF-8.
Now, the question is:

  1. Have Cygwin packagers somehow disabled the Python
     interpreter from calling setlocale?

  2. Or has it been disabled in Python entirely? There
     was no problem previously.

I think the Python interpreter should call setlocale
itself, instead of having Python programs do it,
because it is half an OS and does lots of character
set mangling that Python software shouldn't have to
be aware of.
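
To see which case applies on a given installation, something like
the following could be run. It is only a sketch: the function name
ctype_report is made up for illustration, and what the first value
turns out to be depends on the Python version and build, so I am
not claiming any particular result.

```python
import locale


def ctype_report():
    """Return (before, after, codeset) for the LC_CTYPE category.

    ctype_report is a hypothetical helper name, not part of Python.
    """
    # Calling setlocale() with no second argument only queries the
    # current setting. "C" here means the interpreter has not adopted
    # the environment's locale on its own.
    before = locale.setlocale(locale.LC_CTYPE)

    # This is the call from the workaround: adopt whatever LC_ALL,
    # LC_CTYPE or LANG specify in the environment.
    try:
        after = locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # The environment names a locale the C library does not know.
        after = before

    # Character set now in effect. nl_langinfo() is missing on some
    # platforms, hence the fallback.
    if hasattr(locale, "nl_langinfo"):
        codeset = locale.nl_langinfo(locale.CODESET)
    else:
        codeset = locale.getpreferredencoding(False)
    return before, after, codeset


before, after, codeset = ctype_report()
print("LC_CTYPE at startup:      ", before)
print("LC_CTYPE after setlocale: ", after)
print("codeset in effect:        ", codeset)
```

If the first line shows "C" while LC_CTYPE=en_US.UTF-8 is set in the
environment, the interpreter did not call setlocale by itself.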

Anyway, I think this problem may have been fixed already
-- not 100% certain -- since eyeD3 no longer dies on
some tested file names that do not fit into the ASCII
range. I never hacked eyeD3 itself to include the
setlocale call, only some custom id3 tag backup scripts
that use its library.
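
For illustration, this is the kind of failure I mean, as a minimal
sketch assuming Python 3. It is not eyeD3's actual code; the file
name and the helper decode_name are made up. The bytes are the
UTF-8 encoding of a name containing one non-ASCII letter.

```python
# Hypothetical file name: the UTF-8 bytes for "na\u00efve.mp3".
raw_name = b"na\xc3\xafve.mp3"


def decode_name(raw, encoding):
    """Decode a byte-string file name, reporting failure as text."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return "failed: %s" % exc.reason


# What an ASCII-only ("C" locale) view of the name amounts to.
ascii_result = decode_name(raw_name, "ascii")
# What a UTF-8 locale such as en_US.UTF-8 amounts to.
utf8_result = decode_name(raw_name, "utf-8")

# ascii() keeps the output printable whatever the terminal encoding.
print(ascii(ascii_result))
print(ascii(utf8_result))
```

The ASCII attempt fails on the first byte above 127, while the UTF-8
attempt yields the intended name.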

-- 
Stop Gnomes and other pests! Purchase Windows today!
  http://iki.fi/tuomov/b/archives/2009/07/21/T17_26_09/



