This is the mail archive of the cygwin-apps@sourceware.cygnus.com mailing list for the Cygwin project.



Re: Pending change to cygwin DLL and binmode/textmode musings


On Mon, Jun 26, 2000 at 10:41:16AM +1000, Robert Collins wrote:
>> On Sun, Jun 25, 2000 at 03:54:01PM +1000, Robert Collins wrote:
>> >I like the idea of a database of files... that would mean less porting
>> >issues, particularly with programs that act on files in common with other
>> >tools.
>>
>> Personally, I don't like the thought of maintaining a list.  I think we'll
>> constantly be saying "Update your /etc/filemodes to the newest version".
>
>...instead of please update <tool with text mode bug of the month>? Which is
>easier/more/less error prone?

Assuming that it is relatively easy to update a program, which it sort
of is, then it seems obvious to me that saying "grab the latest awk"
will be better than (or at least equivalent to) saying "grab the latest
/etc/filemodes".

If /etc/filemodes is accidentally deleted or the user resets their /etc
mount or if they download a "new" /etc/filemodes then they will cause
themselves problems.

>> And, as the file grows larger it will take time to parse, slowing down
>> every cygwin application.  I guess we could get around that by setting
>> some kind of flag in the executable but if we are going to do that then
>> why not just record the filenames in the executable itself.
>
>a) Well I think what's really under discussion is some sort of new file
>attribute - and with a hash table on the name of the file the lookup would
>stay very quick for reasonable numbers of files. (check list file date, if
>modified rebuild hash table, otherwise just lookup). that wouldn't work if
>you were planning wildcards like
>  set_default_open ("/*/*.c", O_BINARY);
>but I'm sure a reasonably fast implementation could be designed.

I'm not sure why you are using set_default_open as an example since this
has nothing to do with the alternate file-based method of doing things.
Maybe you're advocating that programs that need it could open a file and
load the appropriate tables.

I didn't plan on implementing any kind of pattern matching since that
would slow a program down.  It would only slow down a program that used
it, though.

The fact remains that no matter how fast you make the opening and
reading of a file it would slow down every single cygwin operation to
some degree.  If you try to avoid opening and parsing the file based on
file dates that means that you have to store yet another thing in
cygwin's shared memory.  That's not intrinsically bad but the shared
memory region is not currently designed to grow without bounds.

If you don't use cygwin's shared memory then any program that needs the
information has to open the external file.  I don't see any way around
this.
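
For reference, the "check list file date, rebuild the hash table if it
changed, otherwise just look up" scheme could be sketched roughly as
below.  This is only an illustration, not anything in cygwin: the
"text <path>" / "binary <path>" line format, the names lookup_mode,
table_load and so on are all made up here, and the table lives in
per-process memory, so every process still pays for one parse of the
list and for a stat() on every lookup.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

#define NBUCKETS 256

/* Hypothetical result type; not a cygwin type. */
enum filemode { MODE_UNKNOWN, MODE_TEXT, MODE_BINARY };

struct entry {
    char *path;
    enum filemode mode;
    struct entry *next;
};

static struct entry *buckets[NBUCKETS];
static time_t loaded_mtime;   /* mtime of the list when last parsed */
static int loaded;            /* nonzero once the list has been parsed */

static unsigned hash_path(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

static void table_clear(void)
{
    for (int i = 0; i < NBUCKETS; i++) {
        struct entry *e = buckets[i];
        while (e) {
            struct entry *next = e->next;
            free(e->path);
            free(e);
            e = next;
        }
        buckets[i] = NULL;
    }
}

static void table_add(const char *path, enum filemode mode)
{
    struct entry *e = malloc(sizeof *e);
    if (!e)
        return;
    e->path = malloc(strlen(path) + 1);
    if (!e->path) {
        free(e);
        return;
    }
    strcpy(e->path, path);
    e->mode = mode;
    unsigned h = hash_path(path);
    e->next = buckets[h];
    buckets[h] = e;
}

/* Assumed format: one "text <path>" or "binary <path>" per line,
   exact paths only, no wildcards, no spaces in paths. */
static int table_load(const char *listfile)
{
    char kind[16], path[1024];
    FILE *fp = fopen(listfile, "r");
    if (!fp)
        return -1;
    table_clear();
    while (fscanf(fp, "%15s %1023s", kind, path) == 2)
        table_add(path, strcmp(kind, "binary") == 0 ? MODE_BINARY : MODE_TEXT);
    fclose(fp);
    return 0;
}

/* stat() the list on every call; reparse only when its mtime changed. */
enum filemode lookup_mode(const char *listfile, const char *path)
{
    struct stat st;
    if (stat(listfile, &st) != 0)
        return MODE_UNKNOWN;
    if (!loaded || st.st_mtime != loaded_mtime) {
        if (table_load(listfile) != 0)
            return MODE_UNKNOWN;
        loaded_mtime = st.st_mtime;
        loaded = 1;
    }
    for (struct entry *e = buckets[hash_path(path)]; e; e = e->next)
        if (strcmp(e->path, path) == 0)
            return e->mode;
    return MODE_UNKNOWN;
}
```

A caller would do something like lookup_mode("/etc/filemodes",
"/etc/passwd") before deciding how to open the file.  Sharing the
parsed table between processes instead is exactly the case that needs
space in shared memory.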

I'm not sure what you mean about a file attribute, though.  It is already
possible to mount a file as "text" if you want and cygwin should do the
right thing.

If we're only talking about a minimal number of system files, then we
could just provide a default set of mounts:

mount -t c:\cygwin\etc\passwd /etc/passwd
mount -t c:\cygwin\etc\group /etc/group
mount -t c:\cygwin\etc\termcap /etc/termcap

and let people manage this kind of thing through the mount table.

>b) As we are only talking about setting defaults for the common files on the
>system, the number of files in list w/wildcards should not get beyond a
>couple of hundred anyway, and a non-wildcarded database could use hashes.
>
>> I'd rather have the programs operate correctly without external
>> dependencies.
>
>I agree with this, but no file stands on its own - look at the amount of
>(excellent) work done to date on cygwin... that is an external dependency. I
>believe minimising the changes needed to move source between platforms is a
>Good Thing.

Sure, the OS is an external dependency.  Cygwin1.dll is an external dependency,
Kernel32.dll is an external dependency.  Luckily, they are all pretty much
transparent.  I don't think that an external text file would be transparent.

>One benefit of an external database that I missed before is that when folk
>download un-cygwinised source that compiles and runs with only text-mode
>issues, the problem will be caught _by the platform_.. no questions needed
>on the mailing lists. Here is where a wildcard or perhaps a regex style
>definition in an external database could be very useful...ie
>====file starts======
>[O_TEXT]
>/.*\.c
>/.*\.h
>/.*\.cpp
>/.*[Mm]akefile
>[O_BINARY]
>/.*\.o
>/.*\.a
>======file ends=====

I still think that 1) opening and reading a file and 2) doing multiple
regex pattern matching on every opened filename is not a performance
hit that we want to consider.
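
To make the cost concrete, here is a rough sketch of what matching every
opened filename against a pattern list like the one quoted above would
involve, using the POSIX regcomp/regexec calls.  The rule table, the
names rules_compile and match_mode, and the added ^...$ anchors are
assumptions for illustration only; nothing like this exists in cygwin.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical result type; not a cygwin type. */
enum rule_mode { RULE_NONE, RULE_TEXT, RULE_BINARY };

struct rule {
    const char *pattern;   /* POSIX basic regular expression */
    enum rule_mode mode;
    regex_t re;
    int compiled;
};

/* The patterns from the quoted example, anchored so that they must
   match the whole path rather than any substring of it. */
static struct rule rules[] = {
    { "^/.*\\.c$",         RULE_TEXT },
    { "^/.*\\.h$",         RULE_TEXT },
    { "^/.*\\.cpp$",       RULE_TEXT },
    { "^/.*[Mm]akefile$",  RULE_TEXT },
    { "^/.*\\.o$",         RULE_BINARY },
    { "^/.*\\.a$",         RULE_BINARY },
};

#define NRULES (sizeof rules / sizeof rules[0])

/* Compile every pattern once; returns 0 on success, -1 on a bad pattern. */
int rules_compile(void)
{
    for (size_t i = 0; i < NRULES; i++) {
        if (rules[i].compiled)
            continue;
        if (regcomp(&rules[i].re, rules[i].pattern, REG_NOSUB) != 0)
            return -1;
        rules[i].compiled = 1;
    }
    return 0;
}

/* First matching rule wins.  A path that matches nothing runs every
   regex in the table before returning RULE_NONE. */
enum rule_mode match_mode(const char *path)
{
    for (size_t i = 0; i < NRULES; i++)
        if (rules[i].compiled
            && regexec(&rules[i].re, path, 0, NULL, 0) == 0)
            return rules[i].mode;
    return RULE_NONE;
}
```

Even with the patterns compiled once per process, a path such as
/etc/passwd that matches nothing is run through all six expressions,
and that would happen on every open.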

IMO, all of your examples above are currently non-issues because gcc
and make should now be correctly interpreting files with \r\n line
endings.  Now that we can easily update individual patches and get
them to the cygwin community quickly, I think we'll see a drastic
drop in complaints about textmode issues as we clean up utilities like
make.

What I was trying to do was provide a method to easily and minimally
modify a package to ease the effort in poring over files, looking for
fopens to change.  The external file method is attractive because it
allows us to "fix" a utility quickly but I am not comfortable with the
other tradeoffs.
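
As a rough illustration of the in-program alternative, a package could
register its defaults once at startup instead of having every fopen
call edited by hand.  Everything below is hypothetical: the sketch_
names are invented for this example, only exact paths are handled (no
pattern matching), and it is not the interface under discussion for
the cygwin DLL.

```c
#include <stdio.h>
#include <string.h>

#define MAX_DEFAULTS 32

struct default_mode {
    const char *path;   /* exact path; caller keeps the string alive */
    int binary;         /* nonzero: open in binary mode by default */
};

static struct default_mode defaults[MAX_DEFAULTS];
static int ndefaults;

/* Hypothetical registration call, made once near the top of main(). */
int sketch_set_default_open(const char *path, int binary)
{
    if (ndefaults == MAX_DEFAULTS)
        return -1;
    defaults[ndefaults].path = path;
    defaults[ndefaults].binary = binary;
    ndefaults++;
    return 0;
}

/* Return the mode string to use: an explicit 'b' or 't' in the
   caller's mode always wins ('t' is an extension, not ISO C);
   otherwise append 'b' for paths registered as binary. */
const char *sketch_effective_mode(const char *path, const char *mode,
                                  char *buf, size_t buflen)
{
    if (strchr(mode, 'b') || strchr(mode, 't'))
        return mode;
    for (int i = 0; i < ndefaults; i++)
        if (defaults[i].binary
            && strcmp(defaults[i].path, path) == 0
            && strlen(mode) + 2 <= buflen) {
            strcpy(buf, mode);
            strcat(buf, "b");
            return buf;
        }
    return mode;
}

/* Wrapper a package could call in place of fopen. */
FILE *sketch_fopen(const char *path, const char *mode)
{
    char buf[8];
    return fopen(path, sketch_effective_mode(path, mode, buf, sizeof buf));
}
```

The table is compiled into the program, so nothing is read from disk
and there is no external file to lose or get out of date.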

cgf
