This is the mail archive of the cygwin-developers mailing list for the Cygwin project.
Re: crash on latest cygwin snapshot
- From: Christopher Faylor <cgf-use-the-mailinglist-please at cygwin dot com>
- To: cygwin-developers at cygwin dot com
- Date: Mon, 2 Jul 2012 12:01:57 -0400
- Subject: Re: crash on latest cygwin snapshot
- References: <4FEA04B2.9010209@gmail.com> <20120626192357.GB25282@ednor.casa.cgf.cx> <20120626194755.GA25725@ednor.casa.cgf.cx> <20120626201447.GB25725@ednor.casa.cgf.cx> <4FEA1B54.2030905@gmail.com> <20120626203229.GC26174@ednor.casa.cgf.cx> <4FEA20AB.8060503@gmail.com> <20120626205331.GD26174@ednor.casa.cgf.cx> <20120627014651.GA10400@ednor.casa.cgf.cx> <4FF1B37A.4000902@gmail.com>
- Reply-to: cygwin-developers at cygwin dot com
[redirecting to cygwin-developers]
On Mon, Jul 02, 2012 at 04:43:06PM +0200, marco atzeri wrote:
>On 6/27/2012 3:46 AM, Christopher Faylor wrote:
>>
>> Sorry, Marco. Nevermind. I duplicated this. No need to upload anything.
>> I'm still working on it.
>
>it seems solved on 20120702 snapshots
Thanks for confirming. I suspect that this snapshot really only masks the
problem. The stack was getting corrupted by something but I was having a
really hard time figuring out what was doing it.
Compiling path.cc without optimization or passwd.cc without
-fomit-frame-pointer "fixed" the problem but clearly something is wrong.
I tried building Cygwin with stack probes and that made the DLL
unrunnable. I tried instrumenting it with -finstrument-functions and
that made the problem go away.
The problem is that something, somewhere along the line, replaces a
frame pointer in the stack with 0x10c (268) and the return address with
zero. So, eventually, when a function returns, %ebp is set to 0x10c and
the program jumps to address zero. That is one manifestation. In others
the 0x10c is still there but it is interpreted as a pointer to something
and dereferencing it causes a SEGV.
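
To make that failure mode concrete, here is a minimal sketch that models a 32-bit x86 function epilogue ("leave; ret") over a stack whose saved frame pointer and return address have been clobbered as described. None of this is Cygwin code: the Registers struct, simulate_epilogue(), corrupted_return(), and the addresses 0x1000 and 0x00401000 are all made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of the registers involved in a function return.
struct Registers {
    std::uint32_t ebp;  // frame pointer
    std::uint32_t esp;  // stack pointer
    std::uint32_t eip;  // instruction pointer
};

// Simulate "leave; ret" on a stack modeled as 32-bit words.
// The word stack[i] is assumed to live at byte address base + 4*i.
Registers simulate_epilogue(const std::vector<std::uint32_t>& stack,
                            std::uint32_t base, Registers regs) {
    assert((regs.ebp - base) % 4 == 0);  // frame pointer must be word-aligned

    // leave: mov %ebp, %esp ; pop %ebp
    regs.esp = regs.ebp;
    regs.ebp = stack.at((regs.esp - base) / 4);
    regs.esp += 4;

    // ret: pop the return address into %eip
    regs.eip = stack.at((regs.esp - base) / 4);
    regs.esp += 4;
    return regs;
}

// Build a frame whose two control words were overwritten, then "return".
Registers corrupted_return() {
    const std::uint32_t base = 0x1000;  // made-up address of the frame
    std::vector<std::uint32_t> stack;
    stack.push_back(0x0000010c);  // saved %ebp slot, overwritten with 0x10c
    stack.push_back(0x00000000);  // return address slot, overwritten with 0

    Registers regs;
    regs.ebp = base;        // current frame pointer points at the saved %ebp
    regs.esp = base - 16;   // some locals below the frame pointer
    regs.eip = 0x00401000;  // made-up address inside the returning function
    return simulate_epilogue(stack, base, regs);
}
```

After corrupted_return(), the modeled %ebp holds 0x10c and %eip holds zero, which matches the first manifestation: the jump to address zero happens at return time, possibly long after the write that did the damage.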
0x10c is 256 + 12 but I haven't been able to find that anywhere in the
source.
Your test triggered a problem which has existed in Cygwin since last November.
I noticed it in snapshots going back to 2011-11-14. But, when I tried to
build a version of Cygwin from before that time, it still manifested the
problem. I did change to gcc 4.5.3 around that time so I'm thinking that
either this version of gcc exposed a problem in Cygwin or Cygwin has exposed
a problem in this version of gcc. There was an odd problem in select() where
it seemed like constructors weren't being properly run on a local variable
when alloca was used in the same function.
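
For reference, the general shape of that pattern (a local object with a constructor plus an alloca call in one function) looks roughly like the sketch below. This is not the select() code. Tracker and constructed_value_with_alloca() are hypothetical names, and the sketch assumes a platform that provides <alloca.h>. With a correctly behaving compiler the constructor runs and the function returns 42.

```cpp
#include <alloca.h>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a local object with a non-trivial constructor.
struct Tracker {
    int initialized;
    Tracker() : initialized(42) {}
};

// A constructed local and an alloca() call sharing one stack frame.
int constructed_value_with_alloca(std::size_t n) {
    assert(n < 4096);  // keep this sketch's stack use small

    Tracker t;  // the constructor is expected to run here

    // alloca() moves the stack pointer within the same frame.
    char* scratch = static_cast<char*>(alloca(n + 1));
    std::memset(scratch, 0x7f, n + 1);  // touch the whole alloca'd region

    // If the constructor was skipped or the local was overwritten,
    // this would not be 42.
    return t.initialized + (scratch[n] == 0x7f ? 0 : 1);
}
```

Comparing the result at different optimization levels, or with and without -fomit-frame-pointer, is one way to check whether a given compiler build mishandles this combination.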
cgf