This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



"C" character set (again)


Following the "printf treats differently a string constant and a
character array" issue at
http://cygwin.com/ml/cygwin/2009-12/msg01009.html, I'm wondering again
whether the "C" locale shouldn't go back to using ASCII rather than
UTF-8, to avoid surprises like that and also to fit with many people's
expectation that "C" means ASCII. I think that would save us a bunch
of trouble and pointless legal/religious discussions about the C
locale.

This does not affect the default locale returned by setlocale when
nothing is specified in the environment, which should remain
"C.UTF-8". In particular, filename conversion and console would not
(and should not) be affected by changing the "C" locale's character
set, as their character sets are determined by a call to
setlocale(LC_CTYPE, "") at process startup. In other
words, we don't actually need "C" to mean UTF-8.

What would be required to do this, beyond removing the following
special casing in newlib?

Andy


--- newlib/libc/locale/locale.c 9 Oct 2009 08:25:28 -0000       1.29
+++ newlib/libc/locale/locale.c 29 Dec 2009 06:47:57 -0000
@@ -242,13 +242,8 @@ static const char *__get_locale_env(stru

 #endif

-#ifdef __CYGWIN__
-static char lc_ctype_charset[ENCODING_LEN + 1] = "UTF-8";
-static char lc_message_charset[ENCODING_LEN + 1] = "UTF-8";
-#else
 static char lc_ctype_charset[ENCODING_LEN + 1] = "ASCII";
 static char lc_message_charset[ENCODING_LEN + 1] = "ASCII";
-#endif
 static int lc_ctype_cjk_lang = 0;

 char *
@@ -450,11 +445,7 @@ loadlocale(struct _reent *p, int categor
   if (!strcmp (locale, "POSIX"))
     strcpy (locale, "C");
   if (!strcmp (locale, "C"))                           /* Default "C" locale */
-#ifdef __CYGWIN__
-    strcpy (charset, "UTF-8");
-#else
     strcpy (charset, "ASCII");
-#endif
   else if (locale[0] == 'C'
           && (locale[1] == '-'         /* Old newlib style */
               || locale[1] == '.'))    /* Extension for the C locale to allow

