This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: [Fwd: [1.7] wcwidth failing configure tests]
On May 12 17:56, Andy Koppe wrote:
> > And here's another question. The utf8*.h files claim they have been
> > generated from the unicode.txt file of the Unicode 3.2 standard. Do we
> > have the script which generated the utf8*.h files? Can we regenerate
> > the files to match the current Unicode 5.1 standard?
>
> There's Markus Kuhn's wcwidth implementation, which says it's based on
> Unicode 5.0:
>
> http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
This looks nice.
> Trouble is, there's the thorny issue of the "CJK Ambiguous Width"
> category of characters, which consists of things like Greek and
> Cyrillic letters as well as line drawing symbols. Those have a width
> of 1 in Western use, yet with CJK fonts they have a width of 2. That's
> why Markus Kuhn's code includes the mk_wcswidth_cjk() variant.
We should use the standard variant alone, imho.
And we need some workaround for UTF-16 systems like Cygwin.
Unfortunately, surrogate pairs only work as part of a string, not
as standalone chars. So wcwidth would return -1 for each surrogate half
passed on its own, but wcswidth could be tweaked to handle pairs gracefully.
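
To illustrate the surrogate idea, here is a minimal sketch, not the actual
Cygwin or newlib code. It uses uint16_t code units so it behaves the same
on any platform. All function names are hypothetical, and codepoint_width()
is only a stand-in for a real width table such as one generated from the
Unicode data files. The single-unit lookup rejects a lone surrogate, while
the string lookup joins a high/low pair into one code point first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a real per-code-point width table.
   It covers only a few ranges, enough to demonstrate the idea. */
int codepoint_width(uint32_t cp)
{
    if (cp == 0)
        return 0;
    if (cp < 0x20 || (cp >= 0x7f && cp < 0xa0))
        return -1;                            /* control characters */
    if ((cp >= 0x1100 && cp <= 0x115f)        /* Hangul Jamo initial consonants */
        || (cp >= 0x4e00 && cp <= 0x9fff)     /* CJK Unified Ideographs */
        || (cp >= 0x20000 && cp <= 0x2fffd))  /* supplementary ideographic plane */
        return 2;
    return 1;
}

int is_high_surrogate(uint16_t u) { return u >= 0xd800 && u <= 0xdbff; }
int is_low_surrogate(uint16_t u)  { return u >= 0xdc00 && u <= 0xdfff; }

/* wcwidth-like lookup for one UTF-16 code unit.
   A surrogate half on its own is not a character, so it yields -1. */
int utf16_unit_width(uint16_t u)
{
    if (is_high_surrogate(u) || is_low_surrogate(u))
        return -1;
    return codepoint_width(u);
}

/* wcswidth-like lookup for up to n UTF-16 code units.
   Surrogate pairs are combined before the table is consulted;
   an unpaired surrogate makes the whole string unprintable (-1). */
int utf16_string_width(const uint16_t *s, size_t n)
{
    int total = 0;

    for (size_t i = 0; i < n && s[i] != 0; i++) {
        uint32_t cp = s[i];

        if (is_high_surrogate(s[i]) && i + 1 < n
            && is_low_surrogate(s[i + 1])) {
            cp = 0x10000 + (((uint32_t)s[i] - 0xd800) << 10)
                 + ((uint32_t)s[i + 1] - 0xdc00);
            i++;                              /* consume the low half too */
        } else if (is_high_surrogate(s[i]) || is_low_surrogate(s[i])) {
            return -1;                        /* unpaired surrogate */
        }

        int w = codepoint_width(cp);
        if (w < 0)
            return -1;
        total += w;
    }
    return total;
}
```

With this sketch, the units 'a', 0xd840, 0xdc00 (the letter a followed by
U+20000) measure 3 columns as a string, even though 0xd840 alone gives -1
from the single-unit lookup.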
Corinna
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat