This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: Need help with multibyte UTF-8 characters
Greetings, Thomas Taylor!
> I believe that Cygwin displays certain UTF-8 characters incorrectly. To
> see the problem, first save the attached "utf-8_test.sed" text file to
> your desktop.
First, your "NBSP" is actually U+23B5 (BOTTOM SQUARE BRACKET): http://www.fileformat.info/info/unicode/char/23b5/index.htm
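To check which code point a character really is, something like the following Python sketch can help. It is only an illustration and not part of the original test file; the helper name `describe` is made up here.

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, UTF-8 bytes as hex) for one character.

    `describe` is a hypothetical helper, not something shipped with Cygwin.
    """
    code_point = "U+%04X" % ord(ch)
    name = unicodedata.name(ch, "<unnamed>")
    utf8_hex = ch.encode("utf-8").hex()
    return code_point, name, utf8_hex

# Compare a real no-break space with the character from the link above.
for ch in ("\u00a0", "\u23b5"):
    print(describe(ch))
```

Pasting the suspect character from the file in place of the escapes shows at a glance whether it is the two-byte NBSP or something else.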
> Then run "mintty," and set its options by right clicking
> in its title bar, selecting "Options" and then "Text."
I just leave those settings empty.
> On the Text page
> set "Locale" to "en_US" and "Character set" to "UTF-8," and then
> "Save." Now exit and restart mintty. Change directory to your desktop
> and run the editor "vim" on the utf-8_test.sed file. Once inside vim do
> a ":set fileencoding=utf-8". You should now see that vim displays
> correctly a sample of one-, two-, and three-byte UTF-8 character
> encodings in the test file. Vim fails, however, on the three-byte
> encodings for the "en" dash, the "em" dash, and the ellipsis, each of
> which displays incorrectly as a filled-in rectangle. Now exit vim and
> do a "less" or "cat" on the utf-8_test.sed file. You should see most of
> the sample UTF-8 encoded characters displayed correctly, except once
> again for the en dash, em dash, and ellipsis.
Everything displays correctly here, using Lucida Console 11pt.
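To rule out the file itself, one could confirm that the three characters in question really are stored as three-byte UTF-8 sequences. A rough Python sketch, assuming the characters meant are U+2013, U+2014 and U+2026; the names `SAMPLES` and `utf8_bytes` are mine:

```python
# Assumed code points for the characters named in the report.
SAMPLES = {
    "en dash": "\u2013",
    "em dash": "\u2014",
    "ellipsis": "\u2026",
}

def utf8_bytes(ch):
    """Encode a single character as UTF-8 and return the raw bytes."""
    return ch.encode("utf-8")

# Print each label with its encoded length and the bytes in hex.
for label, ch in SAMPLES.items():
    encoded = utf8_bytes(ch)
    print(label, len(encoded), encoded.hex())
```

If the bytes in the file match these and other three-byte characters display fine, that would suggest looking at the font rather than at the encoding.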
> So it looks like a problem in the underlying Cygwin run-time libraries
> rather than in vim, less, or cat. I haven't tested this on four-byte UTF-8
> character encodings, but assume Cygwin will have similar problems.
I don't have a good console font for four-byte characters, but I presume they
will be displayed just fine.
--
With best regards,
Andrey Repin
Thursday, December 14, 2017 21:59:07
Sorry for my terrible English...
--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple