This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: grep treating my text files as binary!
- From: Bengt Larsson <lists dot cygwin4 at bengtl dot net>
- To: cygwin at cygwin dot com
- Date: Sat, 27 Dec 2014 11:07:27 +0100
- Subject: Re: grep treating my text files as binary!
- Authentication-results: sourceware.org; auth=none
- References: <XnsA40D81CA1FAA8davidrayninfocouk at 80 dot 91 dot 229 dot 13> <549B4258 dot 5050509 at redhat dot com> <XnsA40DECB2AE256davidrayninfocouk at 80 dot 91 dot 229 dot 13> <549C5A6B dot 2000509 at towo dot net> <27CE6A0A-9845-4A1C-A0F8-C0236B95A1E3 at etr-usa dot com>
- Reply-to: cygwin at cygwin dot com
Warren Young wrote:
>On Dec 25, 2014, at 11:41 AM, Thomas Wolff <towo@towo.net> wrote:
>
>> In any case the argument is quite artificial since the new behaviour
>> hits many files that are in fact text files.
>
>Please define the term "text file" in a way that allows a C programmer
>to write a program that automatically does the correct thing for all
>members of the class "text file" without involving locales, or an
>equivalent mechanism.
...
>If grep runs into a byte sequence that makes it think it is not legal
>for your current locale, it must treat the file as raw bytes, unless you
>give it -a.
>
>If you don't like this behavior, say "alias grep=grep -a" in your
>~/.bashrc, and forget the change ever happened. It'll be on you when
>some non-text file gets treated as text and grep spams your terminal
>with binary garbage, though.
It's better to use the "alias grep='LC_ALL=C grep'" method. It keeps the
old way of detecting binaries (for example it detects an .EXE as binary)
while allowing you to match mostly-ASCII files with some
mismatched-locale characters. The definition you ask for is already in
the code. For us non-English speakers, detecting what is "mostly ASCII"
is mostly right, at least interactively.
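
A minimal sketch of the LC_ALL=C approach, for illustration only. The function name grep_c, the sample file and its contents are hypothetical; the sketch assumes a reasonably recent GNU grep, which treats every byte as a valid character in the C locale.

```shell
# In ~/.bashrc the alias from this message would be:
#   alias grep='LC_ALL=C grep'
# Bash does not expand aliases in non-interactive scripts by default, so
# this sketch uses an equivalent function; the name grep_c is hypothetical.
grep_c() {
    LC_ALL=C grep "$@"
}

# Hypothetical sample: one mostly-ASCII line containing byte 0xE9
# (octal 351, "e acute" in CP1252), which is not valid UTF-8.
sample=$(mktemp)
printf 'caf\351 menu\n' > "$sample"

# Search in the C locale and keep the matching line for inspection.
result=$(grep_c menu "$sample")
rm -f "$sample"
printf '%s\n' "$result"
```

Under those assumptions the matching line should be printed as-is rather than replaced by a "Binary file ... matches" notice, while a file containing NUL bytes (such as an .EXE) would still be reported as binary.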
I ran into this, actually. I keep a list of my directories and it is in
CP1252 for reasons of interfacing with CMD.EXE. Suddenly grep couldn't
match it. But I figured something was up and set my locale to CP1252 and
then it worked.
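
The locale change can be sketched the same way. The locale name en_US.CP1252, the function name grep_cp1252 and the sample directory list are assumptions for illustration; Cygwin accepts a charset suffix of this kind, but other systems may not provide such a locale, in which case the C library typically falls back to the C locale.

```shell
# Assumed locale name with a CP1252 charset suffix (Cygwin-style).
CP1252_LOCALE=en_US.CP1252

# Hypothetical helper: run grep with the CP1252 locale for this call only.
grep_cp1252() {
    LC_ALL=$CP1252_LOCALE grep "$@"
}

# Hypothetical directory list in CP1252; octal 351 is byte 0xE9.
dirlist=$(mktemp)
printf 'C:/Users/Ren\351/Documents\n' > "$dirlist"

# With a matching locale the high byte is an ordinary character,
# so the line is searched as text.
found=$(grep_cp1252 Documents "$dirlist")
rm -f "$dirlist"
printf '%s\n' "$found"
```

Setting the variable per command keeps the rest of the session in its usual locale; exporting LC_ALL or LANG in the shell would apply the change to everything started from it.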