This is the mail archive of the cygwin mailing list for the Cygwin project.
> Hi all,
>
> I was testing a program that uses non-canonical mode input via tcsetattr().
> ... Specifically, I entered the Chinese character "例" (which means "rule" or "example"). It occupies 3 bytes in UTF-8 representation: E4, BE, 8B.
>
> On standard console, the read() call returned THREE bytes (n == 3), and (not surprisingly) E4, BE and 8B were returned to buf[].
>
> On mintty console, the read() call returned ONE byte (n == 1), and only E4 was returned to buf[]. I could grab the other two bytes if I did additional calls to read().

This is absolutely in line with the specified interface of read(), whether or not you apply some tcsetattr settings, and whether or not there is a difference between the Cygwin console and mintty. read() is a traditional byte-oriented function and has no knowledge or handling of character encoding, so there is no guarantee that a multi-byte character comes in one piece. (Even if mintty were changed to try to feed them in one piece, there would still be no guarantee that you receive them in one piece.)
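
Since read() may hand back a multi-byte character in pieces, a program that wants whole characters has to reassemble them itself. The following is only a minimal sketch of one way to do that, not code from the program under discussion: the helper names utf8_seq_len() and read_utf8_char() are made up for illustration, the input is assumed to be UTF-8, and the descriptor is assumed to block until at least one byte is available (for a terminal in non-canonical mode, that means VMIN >= 1 and VTIME == 0). Continuation bytes are not validated.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: length of a UTF-8 sequence judged by its lead
 * byte, or 0 if the byte cannot start a sequence. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;                   /* ASCII */
    if (lead >= 0xC2 && lead <= 0xDF) return 2;
    if (lead >= 0xE0 && lead <= 0xEF) return 3;  /* e.g. E4 BE 8B */
    if (lead >= 0xF0 && lead <= 0xF4) return 4;
    return 0;                      /* continuation or invalid byte */
}

/* Hypothetical helper: read one UTF-8 sequence from fd into buf, which
 * must hold at least 4 bytes. Returns the number of bytes stored,
 * 0 on end of input, or -1 on error (errno is EILSEQ for a bad lead
 * byte). A short count means input ended in the middle of a character. */
static ssize_t read_utf8_char(int fd, unsigned char *buf)
{
    ssize_t n;
    size_t have, want;

    /* Fetch the lead byte, retrying if a signal interrupts the call. */
    do {
        n = read(fd, buf, 1);
    } while (n < 0 && errno == EINTR);
    if (n <= 0)
        return n;

    want = utf8_seq_len(buf[0]);
    if (want == 0) {
        errno = EILSEQ;
        return -1;
    }

    /* Keep calling read() until the rest of the sequence has arrived;
     * each call may deliver fewer bytes than requested. */
    have = 1;
    while (have < want) {
        n = read(fd, buf + have, want - have);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                 /* input ended mid-character */
        have += (size_t)n;
    }
    return (ssize_t)have;
}
```

Called as read_utf8_char(STDIN_FILENO, buf), this should yield the same three bytes E4, BE, 8B on either terminal, regardless of how many read() calls it takes to collect them.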
-- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/ Documentation: http://cygwin.com/docs.html Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple