Default locale for Russian/Russia should be ru_RU.CP1251


I'm running Cygwin 2.2.0 on an English Windows 8.1 box:

> CYGWIN_NT-6.3 UNIT-725 2.2.0(0.289/5/3) 2015-08-03 12:51 x86_64 Cygwin

Windows regional settings are set to Russian/Russia.

In the absence of any settings in bashrc/bash_profile, the `locale`
command outputs the following:

> LANG=ru_RU
> LC_CTYPE="ru_RU"
> LC_TIME="ru_RU"

This is perfectly fine, except that "no charset" in the locale output
means "ISO charset", which is ISO-8859-5 for Russian/Russia and has
never been used (historically, DOS used CP866, Windows used the CP1251
ANSI codepage, and various Unices stuck to KOI8-R before the rise of
Unicode).

The above is consistent with the `locale charmap` output, which is again
ISO-8859-5.

A short C example also confirms that ISO-8859-5 is used:

> #include <stdio.h>
> #include <locale.h>
> #include <langinfo.h>
> int main() {
>     const char *locale = setlocale(LC_ALL, "");
>     const char *codeset = nl_langinfo(CODESET);
>     printf("locale: %s\n", locale);
>     printf("codeset: %s\n", codeset);
>     return 0;
> }


> locale: ru_RU/ru_RU/ru_RU/ru_RU/ru_RU/C
> codeset: ISO-8859-5

Cygwin docs state that

> Starting with Cygwin 1.7.2, the default character set is determined by the default Windows ANSI codepage for this language and territory.

which is not true in my case (the Windows ANSI codepage for Cyrillic is
CP1251, not ISO-8859-5!). Surprisingly, for the Belarusian (a.k.a.
Belorussian, an East Slavic language very close to Russian) "be_BY"
locale, the default charset is indeed CP1251, which is in accordance
with both the documentation and common sense.

Additionally, in the `strace locale -u` output, I see multiple lines like

> __get_lcid_from_locale: LCID=0x0419

"0x0419" corresponds to Russian/Russia (see

Despite that, `locale -u` returns "en_GB", even though all regional
settings are set to Russian/Russia. I believe this is not correct
either and needs to be fixed.


