This is the mail archive of the
cygwin@cygwin.com
mailing list for the Cygwin project.
1.3.18: BUG: Piping DOS files to grep (v2.5) doesn't work properly
- From: Stacey Sheldon <ssheldon at catena dot com>
- To: "'cygwin at cygwin dot com'" <cygwin at cygwin dot com>
- Date: Wed, 15 Jan 2003 19:39:38 -0500
- Subject: 1.3.18: BUG: Piping DOS files to grep (v2.5) doesn't work properly
Mailing list search didn't find this, nor does it appear
in the FAQ... hopefully this isn't old news to all of you.
Files read from a pipe are treated differently by grep
than files read directly. This results in some unexpected
(by me) behaviour when using grep on files which use
a DOS line-end (cr/nl). This looks like a bug to me.
I'd expect the following commands to have equivalent
results:
grep myregex blah
grep myregex < blah
cat blah | grep myregex
They are equivalent when the regular file blah uses
Unix line ends, but they differ for a file blahdos which
uses DOS line ends. It appears to me as though grep
is treating its input as binary when reading from a pipe,
but correctly using "undossify_input()" in other cases.
Here is an example. I've created two files, blah (nl line-end)
and blahdos (cr/nl line-end).
$ cat blah
foobarTest
$ od -Ax -a blah
000000 f o o b a r T e s t nl
00000b
$ od -Ax -a blahdos
000000 f o o b a r T e s t cr nl
00000c
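For anyone wanting to reproduce this, here is a minimal sketch of one way to create the two files. The use of printf is an assumption; the report does not say how the files were originally made.

```shell
# Sketch: create the two test files in the current directory.
# Assumption: printf is used here; the original report does not say
# how blah and blahdos were created.
printf 'foobarTest\n'   > blah      # Unix line end (nl)
printf 'foobarTest\r\n' > blahdos   # DOS line end (cr/nl)

# The byte counts should differ by one (the extra cr in blahdos).
wc -c < blah
wc -c < blahdos
```

The od commands above can then be used to confirm the line ends byte by byte.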
These files should match the regex 'Test$' in all cases,
but grep on blahdos fails for this case:
$ cat blahdos | grep 'Test$'
$
And here's why (note the -v to invert the match so we have
something to look at):
$ cat blahdos | grep -v 'Test$' | od -Ax -a
000000 f o o b a r T e s t cr nl
00000c
There's still a cr before the nl on the output, which wouldn't be
there if grep had interpreted its input as having DOS line ends. Here's
what a successful grep of the UNIX line end file looks like:
$ cat blah | grep 'Test$' | od -Ax -a
000000 f o o b a r T e s t nl
00000b
In fact, if I read the blahdos file in any other way except through
a pipe, it successfully matches (note the stripped out cr on the output):
$ grep 'Test$' blahdos | od -Ax -a
000000 f o o b a r T e s t nl
00000b
$ grep 'Test$' < blahdos | od -Ax -a
000000 f o o b a r T e s t nl
00000b
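The three ways of feeding blahdos to grep can also be compared side by side. This is only a sketch: it recreates blahdos with printf (an assumption, as the report does not say how the file was made) and uses grep -c to count matching lines. On the setup described in this report the pipe case would be expected to differ from the other two; other grep builds may treat all three the same.

```shell
# Sketch: compare match counts for the three invocation forms.
# Assumption: blahdos is recreated here with printf.
printf 'foobarTest\r\n' > blahdos

# grep -c prints the number of matching lines; it exits non-zero when
# there are none, so "|| true" keeps the script going in that case.
by_name=$(grep -c 'Test$' blahdos || true)
by_redirect=$(grep -c 'Test$' < blahdos || true)
by_pipe=$(cat blahdos | grep -c 'Test$' || true)

echo "file argument: $by_name"
echo "redirect:      $by_redirect"
echo "pipe:          $by_pipe"
```

If the three counts are not all equal, grep is handling the line ends differently depending on where its input comes from.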
Just in case you might think that this has something to do with cat
(I did), here's the output of cat for each file:
$ cat blah | od -Ax -a
000000 f o o b a r T e s t nl
00000b
$ cat blahdos | od -Ax -a
000000 f o o b a r T e s t cr nl
00000c
Using head instead of cat gives the same results as well, just to
completely remove cat from the picture.
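Not part of the original report, but a possible workaround can be sketched: strip the cr explicitly before grep sees the data, so the result no longer depends on how grep treats piped input. The use of tr here is an assumption, not something the report tried.

```shell
# Sketch of a workaround: delete every cr before the data reaches grep.
# Caveat: tr -d '\r' removes all cr bytes, not only those directly
# before an nl.
count=$(printf 'foobarTest\r\n' | tr -d '\r' | grep -c 'Test$')
echo "matches after stripping cr: $count"
```

With the cr removed, 'Test$' matches the line and the count is 1.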
I'm currently running these versions of tools on win2k:
cygwin 1.3.18-1
textutils 2.0.21 (cat, od, head)
grep 2.5
bash 2.05b.0(8)-release
I also tried this out with cygwin 1.3.17-1 with identical results.
If you need any further information, please cc me directly since I
don't read the mailing lists very often.
Stacey.