This is the mail archive of the cygwin-apps mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch/rebase] Add a rebase database to keep track of DLL addresses


On 7/24/2011 7:44 AM, Corinna Vinschen wrote:
> On Jul 23 13:01, Charles Wilson wrote:
>> Well, I guess -- but there aren't (or shouldn't be) any LFs *or* CRs in
>> the filenames of DLLs.
> 
> The file header is binary data.  It can also contain 0x10 bytes which
> are converted to 0x13 0x10 at write time.  Maybe that somehow threw it
> off.

Good point.

>>    read(fd, array[i].name, array[i].name_size)
>>
>> returned success (or, at least, a non-negative value), but nothing was
>> stored in array[i].name.  That's one reason I switched to calloc for the
> 
> How do you know there was nothing stored?  Maybe it was just not what
> you expected.  Say, if read returned 1 and wrote a 0x0 byte, it can
> easily be missed.

Sure, but the actual contents of *(array[i].name) were not changed,
according to my tests.  I think the issue is that all of the read()
checks are of the form:

	if (read (...) < 0)
            report error

but read is returning 0, which is perhaps unexpected, but according to
posix is /not/ (always) an error (*). So the test above is more-or-less
correct (**)...but rebase.c isn't set up to handle 'short reads'.  By
opening the file in O_BINARY, for whatever reason/speculation, we no
longer experience the '(really) short reads' on mingw.

(*) although I would have thought that read() would block until it DID
get the number of requested bytes -- at least on mingw, where it can't
be interrupted by 'signals' -- since we do not open() with O_NONBLOCK. ??

(**) I suppose we *could* explicitly check that the return value == the
requested number of bytes.  This would cause failures to be reported,
even on cygwin, if the read is interrupted by a signal.  However...if
the read is interrupted by a signal on cygwin /as is/, we'll have an
incomplete string in array[i].name (with no trailing null!), and all
array[i+N].name will be wrong.  So maybe, if we're not going to support
retrying each read if the return value < requested, then we SHOULD check
for that and return failure.

> But actually that doesn't matter.  Using O_BINARY is the right thing to
> do anyway, so I'm glad you stumbled over this problem before we made a
> release :}

Ack.

>> Ack -- but that's for another patch. the Q is, add quick-n-dirty now,
>> then consolidate and reorganize later -- or reorganize first, and then
>> (or simultaneous with) adding the new developer's utility.  Either way,
>> doesn't matter to me.
> 
> I'd prefer to put just the type information into a header right from the
> start.  The rest can be changed later.

OK. I'll incorporate that into this patch, but omitting the new program.

>> That's precisely why. msys compiler is 3.4.x which supports only
>> -static-libgcc.
> 
> Oh boy.

Yep. I'm hungering for the day when msys-2.0 is created, as a much
simpler fork from cygwin-$modern.  If we're lucky it could even be a
simple patch rather than a full-on fork. (With cygwin's new /etc/fstab
support, support for multiple simultaneous cygwin installs, AND dropping
support in MinGW/MSYS for Win9x/WinNT-, the delta between cygwin and
$stuff-msys-needs is much smaller. Basically: changing the name of
global objects, modifying uname(), turning off a lot of perms/security,
and the auto-path-translation logic in spawn.cc)

> The entire expression is just too complicated and error prone, IMHO.
> AFAICS, all of what's needed can be done in a single invocation of
> GA-"swiss army knife"-WK, and it's probably not even complicated...

Likely.

> [...time passes...]
> 
> What about this:
> 
>   /bin/ps -s | /bin/gawk '\
>     # Count number of running ash or dash. \
>     /\/bin\/(d)?ash(\.exe)?$/{ ash_cnt++; } \
>     # Count number of ps and gawk. \
>     /\/bin\/ps(\.exe)?$/{ cnt++; } \
>     /\/bin\/gawk(\.exe)?$/{ cnt++; } \
>     END{ \
>       # Uncomment for testing: \
>       # printf "TOTAL: %d CNT: %d ASH_CNT: %d\n", NR, cnt, ash_cnt; \
>       # Only one of ps and gawk each may run. \
>       if (cnt > 2) \
> 	exit 0; \
>       # The total number of allowed processes is one less than the number \
>       # of input lines.  The extra line is the ps header output. \
>       if (NR - cnt - ash_cnt > 1) \
> 	exit 0; \
>       # All is well. \
>       exit 1; \
>     }'
>   ProcessResult=$?
> 
> It's more comment than code and it does what you need.

Yes, that's workable. It doesn't report to the user the names of the
offending processes, but that's (a) not really necessary, and (b) easy
enough to add if desired (just call ps again as part of the error report).

>> The strto[u]ll stuff was exported from cygwin two weeks later:
>> Mon Oct 1 14:25:00 2001  Charles Wilson  <...>
>> having been added to newlib the same day:
>> 2001-10-01  Charles Wilson  <...>
> 
> Oh boy.

Yep, again.

>> No, that's not what happened on mingw.
>>
>> I was getting a "successful" return (e.g. non-negative return value)
>> from read(), but *nothing* was written into the .name buffer. So, later
> 
> Are you sure this wasn't when it used textmode by accident?  This sounds
> *SO* unlikely to happen in O_BINARY mode, even for the stone-age old
> code msys is based upon.

Sorry for the confusion.  You're right: the behavior I reported was due
to the erroneous textmode usage.  Using calloc() allowed me to detect
the problem without coredumping; switching to binmode appears to fix the
problem.  Now, with O_BINMODE, calloc() isn't strictly
necessary...especially if we change the 'if (read(...) < 0)' tests to
validate the # of bytes read.

If we *don't* make that change (to the read tests), then I'd still feel
better about using calloc -- or at least setting name[0] to '\0'
explicitly (but short reads would then overwrite the initial '\0' with
no guarantee that any trailing '\0' would be present).

>> limits.h:
>> #define PATH_MAX        259
>>
>> stdlib.h:
>> #define       MAX_PATH        (260)
> 
> So long path names wouldn't work anyway.  What about just allocating
> a buffer of size 32K to be on the safe side? 

OK.

> Or just ignore long
> paths since they are probably not used anyway.

...and nobody will ever need more than 640K of RAM. :-)

>> OK.  All that stuff with NT native paths is so much black magic to me,
> 
> It's everything but.  \?? is basically just a folder(*) in the native NT
> namespace which contains symlinks to devices.  The names of these
> symlinks constitute the Win32 device names.  For instance, Win32 drive
> C:.  Prepend \??\ and you get the NT name of the device: \??\C:.
> However, \??\C: is actually just a symlink to the NT device name
> \Device\HarddiskVolume1.  Fetch the tool "winobj" from sysinternals.
> It's pretty instructive.
> 
> As for Win32 long pathnames.  Those start with \\?, for instance \\?\C:.
> The path conversion routine in ntdll.dll just exchanges the second
> backslash with a question mark and, voila, it has the NT path.
>
> Same for network paths.  The short name is \\server\share, the long
> pathname is \\?\UNC\server\share, the NT name is \??\UNC\server\share.
> And UNC is just a symlink to the network redirector device \Device\Mup.
> 
> (*) Actually \?? is a virtual folder which merges a system-wide folder
>     (\GLOBAL??) and a per-session folder (\Sessions\0\DosDevices\$session-id). 

Oh, sure...simple as ABC. :-P

Thanks for the explanation, tho -- that's going in my permanent
cygwin-data folder...

>>> Same as in peflagsall.in, I don't see why this should be necessary.
>>
>> As explained above, it /is/ necessary -- we want to bail if we detect
>> ANY active cygwin/msys processes, but avoid false positives due to the
>> commands used to DETECT those processes.
> 
> Just use a simple ps | gawk as outlined above.

Ack.

I'll post an updated patch later today.

--
Chuck


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]