This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: readdir truncates file names whose UTF-8 representation is longer than 255 bytes


On Mar  2 06:56, Uri Simchoni wrote:
> Hi,
> I'm using Cygwin 1.7.7 in UTF-8 mode. I have a file whose name is composed of Hebrew characters, so its UTF-8 representation is longer than 255 bytes.
> Running "ls -l" fails to list the file's attributes.
> A short C program that loops through the directory (readdir()/stat()) shows that readdir() truncates the file name.
> Is there any way around it? (An environment variable, fstab, or a system call other than readdir. I want to keep UTF-8.)

I don't think there's a way around this, at least not an easy one for
Cygwin.  The problem is that the dirent structure has no room for a
multibyte string of more than 255 bytes, while the underlying OS
provides filenames with up to 255 UTF-16 chars.

To support that, we would have to raise the size of a single dirent so
that it allows names of at least 512 bytes, but even that might be too
short; 1024 would be required.  That's not exactly an easy change, so we
won't do that any time soon, I think.
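
To illustrate the sizes involved: a UTF-16 code unit in the Basic
Multilingual Plane needs up to 3 bytes in UTF-8, and a surrogate pair
(two code units) needs 4, so a name of 255 UTF-16 units can take up to
765 bytes.  Hebrew letters need 2 bytes each, so any Hebrew name longer
than 127 characters exceeds 255 bytes.  Below is a minimal sketch in C
of that calculation.  The function name utf8_len_from_utf16 is made up
for this example and is not part of Cygwin.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a Cygwin API): count the bytes needed to
 * encode n UTF-16 code units as UTF-8, excluding the terminating NUL.
 * An unpaired surrogate is counted as 3 bytes, which is an assumption
 * made here to keep the sketch simple.
 */
size_t utf8_len_from_utf16(const uint16_t *s, size_t n)
{
    size_t bytes = 0;

    for (size_t i = 0; i < n; i++) {
        uint16_t u = s[i];

        if (u < 0x80) {
            bytes += 1;                 /* ASCII */
        } else if (u < 0x800) {
            bytes += 2;                 /* e.g. Hebrew, U+0590..U+05FF */
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < n &&
                   s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            bytes += 4;                 /* surrogate pair: one code point */
            i++;                        /* skip the low surrogate */
        } else {
            bytes += 3;                 /* rest of the BMP */
        }
    }
    return bytes;
}
```

Called on 255 units of U+05D0 (Hebrew alef) this yields 510, and on
255 units of a three-byte character such as U+4E2D it yields 765, which
is why 512 bytes would not be enough in the worst case.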

The only solution for now is to switch to another charset or to
shorten the filenames.
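
To find the names that would need shortening, a readdir()/stat() loop
like the one described in the quoted message can be turned into a rough
check.  The sketch below is only a heuristic under stated assumptions:
it treats an entry as suspect if its name fills the whole 255-byte
limit, or if lstat() reports ENOENT for a name that readdir() has just
returned (which is what a truncated name would produce).  The function
name count_suspect_entries and the 4096-byte path buffer are choices
made for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

#define SKETCH_NAME_LIMIT 255   /* byte limit discussed in this thread */

/*
 * Hypothetical helper: scan dirpath and count entries whose names look
 * truncated.  Returns the number of suspect entries, or -1 if the
 * directory cannot be opened.
 */
int count_suspect_entries(const char *dirpath)
{
    DIR *dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    int suspects = 0;
    struct dirent *ent;
    char path[4096];

    while ((ent = readdir(dir)) != NULL) {
        size_t len = strlen(ent->d_name);
        int n = snprintf(path, sizeof path, "%s/%s", dirpath, ent->d_name);
        int fits = (n >= 0 && (size_t)n < sizeof path);
        struct stat st;

        /* A name readdir() just returned should exist; ENOENT here
           suggests the name does not match the real file name. */
        int missing = fits && lstat(path, &st) == -1 && errno == ENOENT;

        if (len >= SKETCH_NAME_LIMIT || missing) {
            fprintf(stderr, "suspect entry (%zu bytes): %s\n",
                    len, ent->d_name);
            suspects++;
        }
    }
    closedir(dir);
    return suspects;
}
```

A name of exactly 255 bytes can be legitimate, so anything this reports
still needs checking by hand, for example from a native Windows tool.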


Sorry for not having better news,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

