This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.



Re: PLEASE TEST: New implementation of blocking socket I/O


On Apr  2 00:29, Pierre A. Humblet wrote:
> At 10:04 AM 4/1/2004 -0500, Pierre A. Humblet wrote:
> DWORD what, why;
> 
> wsock_event::wait (int sock, int &closed)
> {
>   int ret = SOCKET_ERROR;
>   int wsa_err = 0;
>   WSAEVENT ev[2] = { event, signal_arrived };
>   WSANETWORKEVENTS evts;
>   memset(&evts, 0, sizeof(evts));
>   what = WSAWaitForMultipleEvents (2, ev, FALSE, WSA_INFINITE, FALSE);
>   switch (what)
>     {
>       case WSA_WAIT_EVENT_0:
> 	why = WSAEnumNetworkEvents (sock, event, &evts);
> 
> (gdb) p what
> $1 = 0
> (gdb) p why
> $2 = 0
> (gdb) p evts
> $3 = {lNetworkEvents = 0, iErrorCode = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}}
> (gdb) p /x sock
> $4 = 0x102
> (gdb) p /x event
> $5 = 0x154
> 
> Weird. 

Yeah, that's weird.  So WSAWaitForMultipleEvents returned with
WSA_WAIT_EVENT_0 but no event has been recorded?!?

[again, a lot of time passes, reading...]

I've searched for WSAEventSelect, WSAWaitForMultipleEvents and
WSAEnumNetworkEvents in Microsoft's knowledge base but I haven't
found anything indicating that this could happen.

HOWEVER.

I searched Google Groups for that problem, and several threads dating
back to the mid-90s indicate that it is indeed possible for
WSAEnumNetworkEvents to return with no network event recorded.  The
reason given is that the event object hasn't been reset for some reason,
but nobody really seems to know how that's supposed to work correctly.

What is certain is that a successful call to WSAEnumNetworkEvents resets
the event object.  So one solution/workaround/bad hack is to recognize
a 0 in lNetworkEvents and go back to WSAWaitForMultipleEvents:

Index: net.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/net.cc,v
retrieving revision 1.167
diff -u -p -r1.167 net.cc
--- net.cc	1 Apr 2004 10:36:40 -0000	1.167
+++ net.cc	2 Apr 2004 10:25:33 -0000
@@ -71,12 +71,15 @@ wsock_event::wait (int sock, int &closed
   int ret = SOCKET_ERROR;
   int wsa_err = 0;
   WSAEVENT ev[2] = { event, signal_arrived };
+wait_again:
   switch (WSAWaitForMultipleEvents (2, ev, FALSE, WSA_INFINITE, FALSE))
     {
       case WSA_WAIT_EVENT_0:
         WSANETWORKEVENTS evts;
 	if (!WSAEnumNetworkEvents (sock, event, &evts))
 	  {
+	    if (!evts.lNetworkEvents)
+	      goto wait_again;
 	    if (evts.lNetworkEvents & FD_READ)
 	      {
 		if (evts.iErrorCode[FD_READ_BIT])
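
The retry idea in the patch above can also be written as a loop instead of a goto.  The following is only a minimal sketch, not the Cygwin code: NetworkEvents, wait_for_event and enum_events are hypothetical stand-ins for WSANETWORKEVENTS, WSAWaitForMultipleEvents and WSAEnumNetworkEvents, so that the control flow can be tried without Winsock.

```cpp
#include <functional>

// Hypothetical stand-in for WSANETWORKEVENTS; only the event mask matters here.
struct NetworkEvents
{
  long lNetworkEvents = 0;
};

// Wait until at least one network event has actually been recorded.
//   wait_for_event: stand-in for WSAWaitForMultipleEvents; returns true when
//                   the socket event was signaled, false when interrupted.
//   enum_events:    stand-in for WSAEnumNetworkEvents; returns 0 on success.
// Returns 0 with evts filled in, or -1 on interruption or enumeration failure.
// spurious counts the wake-ups that carried no recorded event.
int
wait_for_network_events (const std::function<bool ()> &wait_for_event,
                         const std::function<int (NetworkEvents &)> &enum_events,
                         NetworkEvents &evts, int &spurious)
{
  spurious = 0;
  for (;;)
    {
      if (!wait_for_event ())
        return -1;
      if (enum_events (evts) != 0)
        return -1;
      if (evts.lNetworkEvents != 0)
        return 0;
      ++spurious;   // signaled, but nothing recorded: wait again
    }
}
```

The spurious counter is not part of the patch; it is only there to make the empty wake-ups visible when experimenting.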

Other postings indicate that this won't happen if every
WSAEnumNetworkEvents call is always paired with its enabling function
(send or recv), like a Siamese twin.  But the only case I can think of
where that might be a problem is sendto/sendmsg after receiving an
FD_CLOSE.  In that case the send loop is left without calling WSASendTo
again, even if FD_WRITE has been detected.  So the other possible
solution is to distinguish between FD_CLOSE and FD_CLOSE | FD_WRITE, so
that the loop always calls WSASendTo after FD_WRITE:

Index: fhandler_socket.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/fhandler_socket.cc,v
retrieving revision 1.127
diff -u -p -r1.127 fhandler_socket.cc
--- fhandler_socket.cc	1 Apr 2004 17:00:21 -0000	1.127
+++ fhandler_socket.cc	2 Apr 2004 10:41:23 -0000
@@ -955,8 +955,8 @@ fhandler_socket::sendto (const void *ptr
 		      break;
 		    }
 		}
-	      while (!(res = wsock_evt.wait (get_socket (), has_been_closed))
-		     && !has_been_closed);
+	      while ((res = wsock_evt.wait (get_socket (),
+					    has_been_closed)) > 0);
 	      wsock_evt.release (get_socket ());
 	    }
 	}
@@ -1091,8 +1091,8 @@ fhandler_socket::sendmsg (const struct m
                       break;
                     }
                 }
-              while (!(res = wsock_evt.wait (get_socket (), has_been_closed))
-	             && !has_been_closed);
+              while ((res = wsock_evt.wait (get_socket (),
+					    has_been_closed)) > 0);
 	      wsock_evt.release (get_socket ());
 	    }
 	}
Index: net.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/net.cc,v
retrieving revision 1.168
diff -u -p -r1.168 net.cc
--- net.cc	2 Apr 2004 10:29:53 -0000	1.168
+++ net.cc	2 Apr 2004 10:41:23 -0000
@@ -89,7 +89,7 @@ wsock_event::wait (int sock, int &closed
 		if (evts.iErrorCode[FD_WRITE_BIT])
 		  wsa_err = evts.iErrorCode[FD_WRITE_BIT];
 		else
-		  ret = 0;
+		  ret = 1;
 	      }
 	    if (evts.lNetworkEvents & FD_CLOSE)
 	      {
@@ -98,7 +98,8 @@ wsock_event::wait (int sock, int &closed
 		  {
 		    if (evts.iErrorCode[FD_CLOSE_BIT])
 		      wsa_err = evts.iErrorCode[FD_CLOSE_BIT];
-		    else
+		    /* Don't override FD_WRITE notification. */
+		    else if (ret == SOCKET_ERROR)
 		      ret = 0;
 		  }
 	      }
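
The return-value convention that the second patch gives wsock_event::wait can be summarized in a small standalone function.  This is a hedged sketch rather than the actual net.cc code: classify_wait_result and the k-prefixed constants are hypothetical names (the values mirror Winsock's FD_WRITE, FD_CLOSE and SOCKET_ERROR), and the parts of the function that the diff context doesn't show are assumed.

```cpp
// Hypothetical stand-ins mirroring the Winsock values of FD_WRITE,
// FD_CLOSE and SOCKET_ERROR.
constexpr long kFdWrite = 0x02;
constexpr long kFdClose = 0x20;
constexpr int kSocketError = -1;

// Map a recorded event mask to the patched return convention:
//   1            FD_WRITE seen without error, so the caller should send again
//   0            only FD_CLOSE seen without error, so the caller should stop
//   kSocketError nothing usable, or an error code was recorded
// write_err and close_err play the role of iErrorCode[FD_WRITE_BIT] and
// iErrorCode[FD_CLOSE_BIT].
int
classify_wait_result (long events, int write_err, int close_err,
                      int &closed, int &wsa_err)
{
  int ret = kSocketError;
  if (events & kFdWrite)
    {
      if (write_err)
        wsa_err = write_err;
      else
        ret = 1;
    }
  if (events & kFdClose)
    {
      closed = 1;
      if (close_err)
        wsa_err = close_err;
      else if (ret == kSocketError)   // don't override FD_WRITE notification
        ret = 0;
    }
  return ret;
}
```

With this convention, a send loop of the form "while (wait (...) > 0)" keeps calling the send function after FD_CLOSE | FD_WRITE and leaves the loop after a bare FD_CLOSE or an error.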

So, one of the above patches (or both together?) should solve the problem.
Needless to say, both work fine on my machine, so I won't say that.

Would you mind testing both variations?

> Sysinternals doesn't show a handle 0x102, FWIW.

Sigh.  Btw., on what OS/Service Pack are you testing?


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Developer                                mailto:cygwin@cygwin.com
Red Hat, Inc.

