This is the mail archive of the cygwin mailing list for the Cygwin project.



1.5.20-1 perl 2328 sig_send: wait for sig_complete event failed


We have a collection of Perl scripts that run every night.  We recently
updated Cygwin to 1.5.20-1 using the latest setup.exe from the site, and
updated all the other packages at the same time.  Unfortunately, I cannot
remember which version we upgraded from; the previous install was old, I'd
guess about 5 months.  Since the update, we always see the following
messages in one of the later scripts:

      2 [unknown (0xD8)] perl 2328 sig_send: wait for sig_complete event failed, signal -41, rc 258, Win32 error 0
  14776 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -41, rc 258, Win32 error 0
60040177 [unknown (0xD8)] perl 2328 sig_send: wait for sig_complete event failed, signal -34, rc 258, Win32 error 0
60040273 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -34, rc 258, Win32 error 0
120065422 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -34, rc 258, Win32 error 0
180075174 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -34, rc 258, Win32 error 0
240084816 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
305125262 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
370150140 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
435174910 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
500199735 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
565224563 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
630249383 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
695289861 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
760314652 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
825339473 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
890364299 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
955389118 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
1020429594 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
1085454400 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
1150479299 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0
1215504053 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -40, rc 258, Win32 error 0


-----------------------------------------------------------------
Observations
-----------------------------------------------------------------

This always happens in the same script, though I have been unable to produce
a simple testcase that exhibits this behavior.  Unfortunately, I cannot do a
simple attachment/paste of our scripts, since they contain a lot of private
information.  I am working on creating a public version that can be
reviewed, but it is going to take a while.  I will try to summarize as best
I can in the meantime.  Here are some important details about the
environment and the errors:

* The script uses Perl's ithreads (summary of what the script does is below)

* The signal numbers that you see in the output are always the same; always
2x -41, 4x -34, 16x -40.  After that last -40, I need to kill the sh.exe
process to get a prompt back; Ctrl+C does nothing.  Killing perl.exe does
not work.  After killing sh.exe, perl.exe is still running.  If nothing is
killed, the script will hang indefinitely (well, at least for 17 hours).

* The numbers at the very beginning of the lines are different every time

* The hex numbers next to the word 'unknown' change with each run of the
script, but they always follow the same pattern: HexA, HexB, HexA, 19x HexB

* The problematic script is invoked from a parent script via
system("perl problem_script.pl")

* The errors always seem to occur about 20 minutes into the problem script.
It isn't exact, but so far the times have been 21 minutes, 20, 22, 21, 23.
This is roughly halfway through the 28 expect scripts that each thread
executes (see below for details).

* The scripts did not change during or since the Cygwin upgrade to 1.5.20-1.
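
To double-check the counts described above on a fresh run, the messages can be tallied mechanically.  The following is only a rough sketch: the sub name tally_sig_send and the hash names are made up for illustration, the sample is the first three messages from the output above, and reading the hex value after 'unknown' as a thread ID is a guess.  For what it is worth, rc 258 is the numeric value of the Win32 WAIT_TIMEOUT constant, which would suggest the wait timed out.

```perl
use strict;
use warnings;

# Tally signal numbers and hex IDs from sig_send failure messages.
sub tally_sig_send {
    my ($text) = @_;

    # Rejoin any message that was wrapped after the word "event".
    $text =~ s/event\s*\n\s*failed/event failed/g;

    my (%by_signal, %by_hex_id);
    while ($text =~ /\[unknown \((0x[0-9A-Fa-f]+)\)\]\s+\S+\s+\d+\s+sig_send:.*?signal (-?\d+), rc \d+/g) {
        $by_hex_id{$1}++;
        $by_signal{$2}++;
    }
    return (\%by_signal, \%by_hex_id);
}

# Sample input: one wrapped message and two unwrapped ones.
my $sample = <<'END';
      2 [unknown (0xD8)] perl 2328 sig_send: wait for sig_complete event
failed, signal -41, rc 258, Win32 error 0
  14776 [unknown (0x44C)] perl 2328 sig_send: wait for sig_complete event failed, signal -41, rc 258, Win32 error 0
60040177 [unknown (0xD8)] perl 2328 sig_send: wait for sig_complete event failed, signal -34, rc 258, Win32 error 0
END

my ($signals, $hex_ids) = tally_sig_send($sample);
printf "signal %s: %d\n", $_, $signals->{$_} for sort { $a <=> $b } keys %$signals;
printf "hex id %s: %d\n", $_, $hex_ids->{$_} for sort keys %$hex_ids;
```

Feeding it the complete output of a run (for example, read from a saved log file) should show whether the 2 / 4 / 16 split and the HexA / HexB pattern really repeat.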


-----------------------------------------------------------------
How the script works
-----------------------------------------------------------------

1) We have a parent script that orchestrates all the child test scripts.  It
iterates through all possible tests, and if the test should run, it invokes
the appropriate perl script via a system("perl script.pl").  There are about
6 possible test scripts; only the problematic one uses threads.

2) Now in the problematic script.  We have two build machines; the current
one, which is running Windows XP Pro SP2 with all updates, and a Linux
machine running Mandrake 9.2 (if it ain't broke... :).  The threads are used
to have the machines work in parallel.  The Windows tasks and the Linux
tasks are each in their own subroutine.  The code used to start these tasks
looks like this:

push @threadList, threads->create("LinuxWrapperFunc", "path_to_a_logfile");
push @threadList, threads->create("WindowsWrapperFunc", "path_to_a_logfile");
$_->join() foreach (@threadList);

The wrapper functions open the passed logfile and call select on the handle
so that all output goes there.  Once everything is done, they call
select(STDOUT) to restore the default output handle.

3) The point of this test is to run about 28 expect scripts on each OS with
various binaries.  On both Windows and Linux, we create an appropriate shell
script for each expect test and invoke them in serial.  The main difference
between the Windows and Linux functions (aside from shell syntax) is simply
that we scp the shell script to the Linux machine, execute it via ssh, and
then scp the results back to the Windows box for parsing.
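
As a possible starting point for a reduced test case, the structure described above might be sketched roughly as follows.  This is not the real script: run_tests, the log file names, and the echo commands are placeholders, and the scp/ssh steps are left out entirely.  It assumes a perl built with ithreads support.

```perl
use strict;
use warnings;
use threads;

# Placeholder for one OS-specific wrapper; the real functions run about
# 28 expect tests each.
sub run_tests {
    my ($label, $logfile, @commands) = @_;

    open(my $log, '>', $logfile) or die "Cannot open $logfile: $!";
    my $previous = select($log);   # unqualified print now goes to the log
    $| = 1;                        # autoflush the currently selected handle

    for my $cmd (@commands) {
        # select() only redirects print calls made in this thread; the
        # output of a child process has to be captured explicitly.
        my $output = `$cmd 2>&1`;
        print "[$label] $cmd (exit ", $? >> 8, ")\n$output";
    }

    select($previous);             # restore the previous default handle
    close($log);
    return scalar @commands;
}

my @threadList;
push @threadList, threads->create(\&run_tests, 'linux',   'linux.log',   'echo one', 'echo two');
push @threadList, threads->create(\&run_tests, 'windows', 'windows.log', 'echo three');

# join() blocks until each thread finishes and returns its result.
my @counts = map { $_->join() } @threadList;
print "commands run per thread: @counts\n";
```

Replacing the echo commands with longer-running ones, and invoking the whole thing from a parent via system(), would bring it closer to the setup in which the errors appear.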


-----------------------------------------------------------------
Environment details
-----------------------------------------------------------------

I have attached the cygcheck output per the problem reporting guidelines.

I don't think the Linux machine details are relevant to this problem, but
nonetheless:

Linux machine uname -a : Linux sisyphus 2.4.22-10mdk #1 Thu Sep 18 12:30:58 CEST 2003 i686 unknown unknown GNU/Linux

Linux machine /etc/mandrake-release : Mandrake Linux release 9.2 (FiveStar) for i586


-----------------------------------------------------------------
Possible solution paths?
-----------------------------------------------------------------

* I read in the guidelines that it is generally not advisable to mention
that these scripts worked with the previous versions of Cygwin.  One idea I
had was to try some older versions of Perl from the Cygwin setup and see if
the problem still occurs.  Should I go through with that idea?

* Thank you for reading all of this!  I know this is long; this is the best
I could do to be concise yet descriptive.

William Sheehan
Builds Engineer / Network Administrator
Open Interface North America

Attachment: cygcheck.out
Description: Binary data
