This is the mail archive of the
cygwin-patches@cygwin.com
mailing list for the Cygwin project.
Re: fix cond_race... was RE: src/winsup/cygwin ChangeLog thread.cc thread.h ...
- To: "Jason Tishler" <jason at tishler dot net>
- Subject: Re: fix cond_race... was RE: src/winsup/cygwin ChangeLog thread.cc thread.h ...
- From: "Robert Collins" <robert dot collins at itdomain dot com dot au>
- Date: Sun, 7 Oct 2001 22:24:30 +1000
- Cc: <cygwin-patches at cygwin dot com>
- References: <20011006203630.A1148@dothill.com>
----- Original Message -----
From: "Jason Tishler" <jason@tishler.net>
To: "Robert Collins" <robert.collins@itdomain.com.au>
Cc: <cygwin-patches@cygwin.com>
Sent: Sunday, October 07, 2001 10:36 AM
Subject: Re: fix cond_race... was RE: src/winsup/cygwin ChangeLog thread.cc thread.h ...
> Rob,
>
> On Sat, Sep 29, 2001 at 07:45:34PM +1000, Robert Collins wrote:
> > ----- Original Message -----
> > From: "Jason Tishler" <jason@tishler.net>
> > > On Fri, Sep 28, 2001 at 05:48:16PM +1000, Robert Collins wrote:
> > > > Well this patch should make everything good - fixing the critical
> > > > section induced race.
> > >
> > > At the risk of appearing dense... Should this patch fix the pthreads
> > > hang triggered by Python's test_threadedtempfile regression test?
> >
> > I've checked in my completed code. I -cannot- tickle this bug via my
> > test suite at all now. (I found that one of my test scripts was slightly
> > buggy in that it made an incorrect assumption - it was passing when this
> > bug was tickled - correcting that let me hit this bug nearly every time
> > :]).
> >
> > So please, give it a go and see how it fares.
>
> Unfortunately, Python's test_threadedtempfile regression test still
> hangs (IIRC) in the same place. See attached for details.
>
> BTW, the code (i.e., pthread_cond::TimedWait) still has a FIXME in it.
> Did your latest patches fix a different race condition? And, if so,
> is this yet another known race condition?
No, same condition, but I'm not 100% convinced it's fully fixed yet. I
only remove FIXMEs when I'm _really_ sure. There is a second race
condition, though: the one that Norman(??) was hitting.
As for you, you are stopping in the same place. Can you give me your
system config - processor count & speed? Are you able to try SP 2?
Norman, can you do the same, and add in OS/Service Pack?
> FWIW, I built from CVS on 10/5/2001 and I'm running under Win2K SP1.
That's certainly recent enough, and the SP _shouldn't_ :] matter.
I'm going to have to think about this one - unless your system is
massively overloaded during the test - such that the spinloop around
line 482 is able to get 10 timeslices without the waiting thread getting
1?!? - there should be no way to tickle this.
I'd like you to add a system_printf at line 483, something like
system_printf ("repulsing event at count 5\n"); (oh, and put it at the
PulseEvent, in {}). If that fires then we know that the detection code is
ok. If so, can you try bumping the spin count up, and make the PulseEvent
fire if spins mod 5 == 0?
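A minimal sketch of that instrumentation, for illustration only. It is not the code in thread.cc: spin_until_woken, waiter_woke, pulse and max_spins are all hypothetical names, the callbacks stand in for the real wake-up check and the PulseEvent call, and fprintf stands in for system_printf so the sketch builds outside Cygwin.

```cpp
#include <cstdio>
#include <functional>

// Hypothetical stand-in for the spin loop near line 482 of thread.cc.
// waiter_woke models "the waiting thread has picked up the signal";
// pulse models the PulseEvent call. Neither is Cygwin's actual code.
// Returns how many times the event was pulsed again.
int spin_until_woken (int max_spins,
                      const std::function<bool ()> &waiter_woke,
                      const std::function<void ()> &pulse)
{
  int repulses = 0;
  for (int spins = 1; spins <= max_spins; ++spins)
    {
      if (waiter_woke ())
        break;
      // Pulse again on every fifth spin, not only at count 5.
      if (spins % 5 == 0)
        {
          // The braces keep the diagnostic and the pulse under the same test.
          std::fprintf (stderr, "repulsing event at count %d\n", spins);
          pulse ();
          ++repulses;
        }
      // The real loop would give up its timeslice here before spinning again.
    }
  return repulses;
}
```

With max_spins at 20 and a waiter that never wakes, this pulses at counts 5, 10, 15 and 20; seeing the message at all would show that the detection code is being reached.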
Rob