This is the mail archive of the cygwin mailing list for the Cygwin project.



RE: Bogus assumption prevents d2u/u2d/conv/etal working on mixed files.


> -----Original Message-----
> From: cygwin-owner On Behalf Of Ken Thompson
> Sent: 05 April 2004 12:40

> I don't think the behavior should be changed. d2u stands for dos to unix,
> which means \r\n to \n.  Why would one expect a dos to unix utility to
> convert mixed line terminator files?

  Well, when you get right down to it, there's no such thing as an innate
mode to a file: the EOL type is as mutable as any of the other contents.
It's each individual line end that has a type, not the file as a whole, and
if d2u means \r\n to \n then that's what I expect it to do, on every line
that ends in \r\n, wherever in the file it happens to be.
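
  To make the per-line view concrete, here is a minimal sketch in Python.
It is not how d2u is implemented; the function name crlf_to_lf is made up
for illustration, and it assumes the whole file fits in memory as bytes.
It simply converts every \r\n it finds and leaves everything else alone,
so a mixed file comes out as pure \n:

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CR LF pair with LF.

    Lines already ending in a bare LF are untouched, and a lone CR
    that is not followed by LF is left in place.
    """
    return data.replace(b"\r\n", b"\n")


# A "mixed" file: two DOS line ends and one Unix line end.
mixed = b"dos line\r\nunix line\nanother dos line\r\n"
converted = crlf_to_lf(mixed)
print(converted)
```

  Note there is no "what mode is this file in?" question anywhere in that:
each line end is handled on its own merits.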

> If you need such a utility, then add one but don't take a utility that
> does dos to unix and try to turn it into "anything" to unix or if you do
> then change the name. Just my 2 cents worth

  I can see some sense in both the suggestion that trying not to mess with
the timestamp if you don't actually change the file is a good idea, and in
the notion that there's a sanity check going on here, either in terms of
detecting binary files or in terms of saying "if the file is already the
mode you've asked to convert it to, maybe YDRMTST".

  OTOH there's a reason why I use a command line interface, and it's because
what I give my computer are commands, not suggestions, and I expect them to
be obeyed.  Anything less is mutiny!  (I'll overlook the occasional "Are you
sure (y/n)?" as minor insubordination).  I don't like my computer to
second-guess me.  I particularly don't like it to second-guess me with an
inaccurate heuristic.  And I particularly particularly don't like it to
second-guess me, get it wrong, and then silently bail without any kind of
error message, warning or explanation.

  Now, as far as attempting to protect binary files goes, I think if that's
a goal it should be done with a real heuristic that looks for binary files
rather than one based on EOL character types, and detecting zeros seems like
a good one to me.  If it's a goal, it should be explicitly designed into the
code; if it's not a goal, it shouldn't be used as a heuristic; but I really
can't agree with the "It's a beneficial intermittent side-effect" style of
software engineering.  Hey, making it randomly exit early 50% of the time
would also have the side effect of 'sometimes protecting binary files from
accidentally being converted'.
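
  For what it's worth, the zero-detecting check I have in mind is about as
simple as heuristics get.  A rough sketch in Python, under my own
assumptions: the name looks_binary and the 8192-byte sample size are
invented for illustration and aren't taken from any existing tool.

```python
def looks_binary(data: bytes, sample_size: int = 8192) -> bool:
    """Guess whether data is binary by looking for a NUL byte.

    Only the first sample_size bytes are inspected.  This is a
    heuristic: it can be fooled (UTF-16 text contains NUL bytes,
    for example), but it does not depend on EOL style at all.
    """
    return b"\x00" in data[:sample_size]


print(looks_binary(b"plain text\r\nwith mixed\nline ends\r\n"))
print(looks_binary(b"\x7fELF\x02\x01\x01\x00"))
```

  The point is that a mixed-EOL text file sails through this check, while
something with embedded zeros gets flagged, and the tool can then say so
out loud instead of silently bailing.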

  As far as the 'it should demand a pure file type' argument goes, I think
that's a bit needlessly puritanical.  Since we're particularly talking about
cygwin here, which means a mixed windoze and posix environment, I think it's
fair to assume that files ending up with a few lines of one type in amongst
mostly lines of another type is going to be a very common occurrence, not a
special case.

  OTOH the argument for preserving existing behaviour is a strong one, and
it doesn't really matter to me whether I need to use a --force flag with d2u
or whether I need to invoke it under a different name such as 'any2u' for
example.

  And on the other hand again <reaches into pocket and removes small
octopus> when what I want is to really and truly delete all \r from a file I
can always pipe it through tr -d '\r'.
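
  The same blunt instrument, sketched in Python for comparison (the
function name strip_all_cr is hypothetical).  Unlike a \r\n to \n
conversion it removes every carriage return, including lone ones that
aren't part of a line end:

```python
def strip_all_cr(data: bytes) -> bytes:
    """Delete every CR byte, whether or not an LF follows it."""
    return data.replace(b"\r", b"")


sample = b"dos\r\nunix\nstray\rcr\r\n"
print(strip_all_cr(sample))
```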

  Oh, and on the final hand:  protecting the user from their own mistakes is
a gross violation of the BWAM license!

    cheers, 
      DaveK
-- 
Can't think of a witty .sigline today....



