This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: ASLR sometimes stops working on Vista with 1.7? [was: Re: Cygwin 1.7 release (was ...)]


Corinna Vinschen wrote:
> On Jun  5 10:56, Charles Wilson wrote:
>> You did reboot, right? IIRC Windows only calculates the new base
> 
> No.  Because *none* of my DLLs is marked to be ASLR compatible.  I'm
> testing what happens OOTB.  

OK, that makes sense.

> The entire problem starts already in the
> parent when the DLL base address is 0x6ee00000.  The parent inevitably
> rebases the DLL to a very low address like 0xa00000 or even 0x900000 at
> load time.  The child then fails to map the DLL to the same address.
> 
> However, if I rebase the DLL to some other spot, like 0x65000000, then
> the DLL is loaded at that address exactly, and everything works fine.  I
> still don't think this has anything to do with ASLR.  ASLR only
> complicates the picture.  AFAICS, there's no guarantee that the address
> computed by ASLR will help forever.  It only eases the underlying
> problem by chance if the addresses happen to have a low chance for
> collision.

???

One of the side effects of ASLR is to, in effect, perform a custom
rebase for every dynbase-enabled DLL, that is (a) unique to the usage
pattern of your machine workflow since you (booted/logged on), and (b)
includes not just cygwin DLLs but also all normal system DLLs so marked,
and (c) persists for the entirety of the current boot/logon session.
This means very little chance of collision at all, AFAICT, at least not
any that arise from (new)ImageBase+codesize.

For ASLR, all dynbase DLLs (including cygwin ones) get mapped to the
range 0x50000000 to 0x78000000. If something ends up at 0x900000, either
with +dynbase/ASLR or without, it's not directly related to ASLR...
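
A quick way to sanity-check the addresses quoted in this thread against
that range. This is only a sketch: the bounds are the ones stated above,
and treating the upper bound as exclusive is my assumption.

```shell
# Bounds as quoted above; half-open interval [LOW, HIGH) is an assumption.
ASLR_LOW=0x50000000
ASLR_HIGH=0x78000000

# in_aslr_range ADDR: succeeds if ADDR lies inside the range.
in_aslr_range() {
  [ $(( $1 )) -ge $(( $ASLR_LOW )) ] && [ $(( $1 )) -lt $(( $ASLR_HIGH )) ]
}

# Addresses mentioned earlier in this thread.
for addr in 0x6ee00000 0x65000000 0xa00000 0x900000; do
  if in_aslr_range "$addr"; then
    echo "$addr: inside"
  else
    echo "$addr: outside"
  fi
done
```

The two low addresses (0xa00000, 0x900000) fall outside, which is the
point: wherever they come from, it isn't the ASLR image area.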

> The problem is not the fact that the DLL is rebased at all in the
> parent.  Even though in my case the address range 0x6ee00000-0x6ee08000
> isn't taken by another DLL, it could be taken by memory dynamically
> allocated by one of the formerly loaded DLLs.

Really? This would have to be by virtue of some call OTHER than to
cygwin's malloc, right?  Because I thought cygwin maintained a single
process heap way down low in memory...it's hardly likely for that heap
to clash with 0x50000000...0x78000000 (or, for the original rebase,
non-ASLR scenario, 0x68000000...downwards), for small allocations like
this (e.g. where your solution of adding a single page to the
DefaultOffset size as a buffer helps).

If it's a cygwin-linked DLL doing this (e.g. Dumper.dll) maybe it's a
bug in that DLL, using the wrong mechanism to allocate memory (e.g. a
direct call to VirtualAlloc or to (deprecated) GlobalAlloc or LocalAlloc
functions?) ...Hmm, perl itself (incl. Dumper.dll) doesn't seem to.

> The real shit starts with the fact that W7 (and Vista, too, apparently)
> rebases the DLL to an address which is so very low in the address space
> of the application.

But why did it do this at all? In the normal "rebase" scenario (and
assuming no dynamically allocated memory sucking up 0x68000000-level
space), sure -- I could see some other system DLL interfering with where
rebase wanted to put, e.g. Cwd.dll.  But in the ASLR scenario, almost
all system DLLs are marked +dynbase, so ASLR is supposed to
"auto-rebase" the entire working set of +dynbase DLLs (system AND
win32app AND cygwin), which avoids that issue.

I think the answer to the question is what you have discovered about
dynamic (i.e. non-dll-image-related) allocations up in the ImageBase
memory area(s).

> This is uncomfortably near to where the process
> heap is expected to be.  When Vista was new, we had the problem already
> in a somewhat different way.  Note this comment in Cygwin's heap.cc:
[snip]
> This problem with dynamically linked DLLs looks quite similar.  Some
> space after the heap is reserved in the child which wasn't reserved
> in the parent.

Hmmm...

> If Vista/W7 would refrain from using the lowest available address in
> the parent already, the entire problem might go away (aka "occurs
> very, very seldom")
> 
>>> I think I'm going to ask MSFT if there's any workaround for
>>> this problem.
>> If my understanding of ASLR is correct, then ASLR *ought* to have solved
>> this problem
[snip]
> As I mentioned above, I don't think that ASLR can solve this problem
> once and for all.  Whereever any DLL is rebased to by the ASLR
> mechanism, there's a chance that the address is already taken in the
> child when LoadLibrary is called for the dynamically loaded DLL.

Ack -- with the caveat that IF that is the case, it most likely isn't
related to the actual DLL images loaded up in that area of the child's
memory.  I still think ASLR would/should/does solve THAT issue. It's
just that there may be some non-Image-related memory that is getting
reserved -- perhaps by the DLLs themselves during a custom DllMain
DLL_PROCESS_ATTACH?  No...the only DllMain is in win32/perllib.c, which
AFAICT is not used on cygwin builds.

...digging in to perl source code...

Hmm...depending on configure options and platform defaults, perl may
implement its own memory manager (which can be/is used by extensions,
like Dumper.dll).  I can't tell if cygwin's perl is built that way, but
(a) cygperl5_10.dll does import sbrk(), and (b) there are only two
non-extension users of sbrk() in perl: its malloc implementation and
perl.c (Perl_my_unexec).

But sbrk is implemented by cygwin, and that's the user heap which ought
to be way down low in memory, not up high in the rebase/ASLR image base
area.  I'm stumped.
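
For anyone who wants to recheck point (a), here is a rough sketch. It
assumes binutils' objdump is installed and that the DLL lives in
/usr/bin (the path is my guess); has_import is just a name I made up.
It only greps the "objdump -p" dump for the symbol as a whole word, so
it can't tell an import from an export.

```shell
# has_import SYMBOL: reads `objdump -p` output on stdin and succeeds if
# SYMBOL appears as a whole word (so "_sbrk_r" does not count as "sbrk").
has_import() {
  grep -qw -- "$1"
}

# Usage (path is an assumption, adjust to your install):
#   objdump -p /usr/bin/cygperl5_10.dll | has_import sbrk && echo "imports sbrk"
```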

> Here's another thought:
> 
> I examined the address layout of the perl process again, and it struck
> me as weird that the base addresses of all the DLLs which get dynamically
> loaded by perl are so near together.

Well, it's not really weird.  In the rebase/non-ASLR case, the file list
is computed by rebaseall by inspecting all of the *.lst.gz files in
/etc/setup.  Since all the perl extension DLLs are in the same .lst.gz
file, they end up next to each other in the list-to-rebase.  Also,
they are all very small, so their "footprint" is just whatever minimum
DefaultOffset rebaseall uses, 0x10000==64k.

We could play with this by randomizing the list-to-rebase (e.g. sort
--random-sort)...but I hate random. WAY too hard to debug. Maybe
sort-by-dll-basename:
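# "unsorted-list", "temp" and "real-list-to-rebase" are placeholder names
while IFS= read -r F; do    # F = full path of one DLL
  B1=${F##*/}               # strip the directory
  B2=${B1#cyg}              # strip a leading "cyg", if present
  B=${B2%.*}                # strip the extension
  printf '%s|%s\n' "$B" "$F"
done < unsorted-list > temp
sort -t'|' -k1,1 temp | awk -F'|' '{print $2}' >\
   real-list-to-rebase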

This way, /usr/bin/cygpcre-0.dll
              --> pcre-0|/usr/bin/cygpcre-0.dll
but       /usr/long/path/Posix.dll
              --> Posix|/usr/long/path/Posix.dll
and Posix.dll ends up rebased a long way away from the other perl DLLs
like Dumper.dll or IO.dll.


In the ASLR case, their fake ImageBases are computed on first load, and
they are just "in order" down from the current (random)
top-of-all-ASLR-image-bases. But, say you run some perl script...it will
have a predictable sequence of "use" statements that trigger dlopening
certain DLLs.  So, 1-2-3 you have a fast sequence of very small perl
dlls loaded, with "fake" ASLRed ImageBases separated by the quantization
size of the ASLR _MiImageBitMap -- which again, is 64k.  There doesn't
seem to be a good way to influence this, unlike the rebase case.
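
To make the packing concrete, here is a toy model of what I just
described. It is NOT the real Windows algorithm, just "round each image
up to 64k and hand out bases downward from some top". The top address
and the three image sizes are made-up values for illustration.

```shell
GRANULE=$((0x10000))   # 64k quantization

# round_up_64k SIZE: SIZE rounded up to the next multiple of 64k (decimal).
round_up_64k() {
  echo $(( ($1 + GRANULE - 1) / GRANULE * GRANULE ))
}

# assign_bases TOP SIZE...: print one base per image, packing downward
# from TOP in load order.
assign_bases() {
  top=$(( $1 )); shift
  for size in "$@"; do
    top=$(( top - $(round_up_64k "$size") ))
    printf '0x%08x\n' "$top"
  done
}

# Three small DLLs loaded back to back (hypothetical sizes and top).
assign_bases 0x6ee00000 0x8000 0x6000 0x9000
# prints:
#   0x6edf0000
#   0x6ede0000
#   0x6edd0000
```

Every small DLL ends up exactly 64k from its neighbour, with nothing in
between.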

In that sense, ASLR would be a disimprovement over what we can do with
rebase, because we can apply a larger quantization (e.g.
DefaultOffset=0x20000). ASLR doesn't know anything more than
roundUpToNearest64k(ImageSize).  If you're right about the cause of
these clashes (and it seems likely), then the only way to "fix" the ASLR
case is to find out WHY certain perl DLLs are allocating "high" memory
instead of using cygwin's (low) user_heap, and fix /that/ -- so that the
ImageSize information used by ASLR accurately reflects the full memory
footprint needed by the DLL in the "high" area.
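
Here is a sketch of why a larger quantization could leave room for a
stray allocation. Windows hands out virtual memory on 64k boundaries, so
what matters is how many whole 64k granules stay free between the end of
an image and the next slot. Treating DefaultOffset as a simple
"round up to a multiple of SLOT" is my simplification, and the 0x8000
image size is just the one from the address range quoted above.

```shell
GRANULE=$((0x10000))   # Windows allocation granularity, 64k

# free_granules IMAGE_SIZE SLOT: whole 64k granules left unused between
# the end of the image (rounded up to 64k) and the end of its slot(s).
free_granules() {
  image=$(( ($1 + GRANULE - 1) / GRANULE * GRANULE ))
  slot=$(( ($1 + $2 - 1) / $2 * $2 ))
  echo $(( (slot - image) / GRANULE ))
}

free_granules 0x8000 0x10000   # 64k slots:  prints 0
free_granules 0x8000 0x20000   # 128k slots: prints 1
```

With 64k slots there is no free granule next to a small DLL; with 128k
slots there is exactly one, which only helps as long as nobody
allocates more than that.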

>  It looks like the problem is
> actually tightened by the order in which the DLLs are rebased by rebaseall,

Yep, for the non-ASLR case.

> and the order in which the DLLs are loaded into the running process.

Hmm...definitely for the ASLR case. Possibly, for some-naughty-dll
allocating memory in a "bad" place for the non-ASLR case.

> Some perl DLL (Dumper.dll?) allocates additional memory and that's right
> after it's own image.  That's where Cwd.dll is based to.  Cwd.dll gets
> rebased and ... poof.

Right then. So...why? Seems odd that the dynamic allocation is occurring
"up high" and not down in user_heap.

> What I did then was to change the offset to rebaseall:
> 
> ash$ rebaseall -o 0x20000   (default is 0x10000)
> 
> Then I reinstalled /bin/cyggmp-3.dll and reran cygport.  This time
> it ran fine.  This is still w/o ASLR flags.

Ack. I already have so many cygwin DLLs on my system that by the time
they are all rebased, cygz.dll ends up very low (0x5cf90000). I'd hate
to double the size of ALL the "small" DLLs. But if that's what it takes
to get rid of the *** failure to remap, then that's what it takes.
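
Back-of-the-envelope numbers for the cost, assuming every small DLL
takes exactly one slot. The count of 400 is a made-up figure, not a
measurement from my system.

```shell
# span_bytes COUNT OFFSET: address space used by COUNT small DLLs that
# each occupy one OFFSET-sized slot.
span_bytes() {
  echo $(( $1 * $2 ))
}

count=400   # hypothetical number of small DLLs
for offset in 0x10000 0x20000; do
  bytes=$(span_bytes "$count" "$offset")
  printf '%s: 0x%x bytes (%d MiB)\n' "$offset" "$bytes" "$(( bytes / 1048576 ))"
done
# prints:
#   0x10000: 0x1900000 bytes (25 MiB)
#   0x20000: 0x3200000 bytes (50 MiB)
```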

> In this configuration, I can reproduce running cygport successfully
> every time.

So, this is great if no "naughty" DLL is ever "really naughty" and
dynamically allocates more than a page "up high". I can't judge how safe
that is, because I still don't understand why ANY memory is being
dynamically allocated "up high".

>> Could it be possible that cygwin's dlopen (or fork) implementation is
> Not that I can see.  The memory for the data storing the loaded DLLs is
> loaded from the parent memory into a stack slot.  There's no other
> memory allocation going on.  Well, except when LoadLibrary already
> failed.

Ack. Good. Cygwin's fork/exec code is much much scarier to me than
memory allocation...

> What I see only affects one single perl parent and the forked child.
> There's not a single perl process involved which had the dynamically
> loaded DLLs loaded at the correct (aka "desired") spot in memory.

Ack.

--
Chuck



