This is the mail archive of the cygwin-patches@cygwin.com mailing list for the Cygwin project.
This is a rather long email, so I've made out like I was back in ISO 9001 land and gone overboard with the "change request". Enjoy :-)

** Summary:

This is a patch for a race in cygwin's UNIX domain socket authentication protocol. It also fixes a couple of other minor problems with cygwin's UNIX domain socket implementation.

This patch has been tested on both win2k and win98SE (but not on winsock 1). It has been tested with blocking and non-blocking socket calls (in the latter case, both by polling for connections and with select(2) calls). It has also been tested with a threaded client.

** Other fixes:

This patch also fixes the following:

* getpeername(2) for UNIX domain sockets (currently this just returns the information for the underlying INET socket);
* with a non-blocking socket, a server could not poll a UNIX domain socket by calling accept(2) repeatedly;
* a socket could be created with one address family and then connected/bound to a different one;
* there was no checking of the UNIX domain socket file, so you could simply create a standard file with the relevant data in it and the code would not notice.

** Known issues in this patch:

One issue with the new protocol is that the client cannot close its connection until the server has performed its half of the protocol. This can only be an issue in the following situation:

* the client only writes to the connection (as a read would block anyhow); and
* the server is *really* slow at accepting connections (or is hung).

If close(2) is called before the client exits, it can be interrupted (by a signal). Otherwise, the solution is to kill the server (or to kill the client from the Windows Task Manager).

A second issue is that there is no code to handle multi-threading, which could be a problem if a socket is shared between threads. This was a problem in the previous version of the code as well; I'll submit another patch to fix this.
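As a concrete illustration of the getpeername(2) point above, here is what a correct implementation should report for a connected UNIX domain socket: the peer's filesystem path, not the address of any underlying INET socket. (This sketch uses plain AF_UNIX sockets on Linux for illustration; it is not the Cygwin code.)

```python
import os
import socket
import tempfile

# getpeername(2) on the client of a connected AF_UNIX socket should
# report the server's socket *path*, never an INET address.
path = os.path.join(tempfile.mkdtemp(), "demo.sock")

srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(path)
srv.listen(1)

cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
cli.connect(path)
conn, _ = srv.accept()

peer = cli.getpeername()
print(peer == path)   # True: the filesystem path, not an INET address

cli.close(); conn.close(); srv.close()
os.unlink(path)
```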
** Problem with previous protocol:

The original authentication protocol relies on both client and server creating a win32 event object with a "secret" name. The secret part of the name is a random key that is stored in the UNIX domain socket file and is thus only accessible to whoever can read that file. Any process (client or server) trying to use the underlying socket without knowledge of the secret key is prevented from receiving connections. The handshake is that, on accept or connect, a process sets its own secret event object and waits on the peer process's secret event object.

The race is then as follows. If two clients attempt to connect to the same server, their two requests will be placed on the server's connection queue and their connect requests will succeed. They will then both signal their own secret events and wait on the server's secret event. When the accept request succeeds in the server with one or other of these two requests, the server will signal its own secret event and wait on the relevant peer process's secret event. This wait in the server will succeed immediately (since the clients signal their own events as soon as the connect succeeds for them). The problem is that both clients are waiting on the one server secret event object, and it is possible for the "wrong" client to wake up. At this point, the client whose connection has been accepted is still blocked on the server's secret event, while the client whose connection is still pending carries on as if it had been authenticated. If the server now tries to read some data from that client, it will block, as the client itself is blocked.

In testing cygserver with UNIX domain sockets, I was getting such blocks frequently. At each block, there is a ten second delay (as the protocol times out) and that client request fails.
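The race can be reproduced outside of Cygwin. The sketch below (a Python simulation, not the Cygwin code) models the server's single named secret event as a semaphore signalled once while two "connected" clients wait on it: exactly one waiter wakes, and nothing ties that waiter to the connection the server actually accepted.

```python
import threading
import time

# Stand-in for the server's ONE named secret event, with TWO clients
# waiting on it after their connect(2) calls have already succeeded.
server_secret = threading.Semaphore(0)
woken = []

def client(name):
    # Each client has signalled its own secret event and now waits on
    # the server's event to complete "authentication".
    server_secret.acquire()
    woken.append(name)

clients = [threading.Thread(target=client, args=(n,), daemon=True)
           for n in ("client-A", "client-B")]
for t in clients:
    t.start()

# The server accepts ONE connection (say client-A's) and signals its
# secret event once...
server_secret.release()
time.sleep(0.2)

# ...but exactly one waiter wakes, and it may be the "wrong" client:
# the one whose connection is still pending on the queue.
print(len(woken))   # 1 -- the other client stays blocked
```

The accepted client can thus remain blocked while the pending one proceeds as if authenticated, which is exactly the hang described above.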
There is also a problem with the protocol for non-blocking connections: where the client calls connect(2) and then waits in select(2) until the connection is signalled, it never performs its half of the handshake and so could connect to unauthorized servers.

** New protocol:

The new protocol is very similar to the original one, using secret objects with the same names as before, except that the objects are now semaphores rather than event objects and the processes authenticate by checking for the existence of the peer's semaphore rather than by waiting on it.

In detail, both client and server create their semaphore before attempting to connect. In the client's case this implies that it needs to explicitly bind(2) to a system-provided INET address, as the port number is part of the secret object's name (this binding would otherwise be done implicitly by the call to connect(2), so this doesn't change the behaviour of the application). In the client, when the connect(2) call succeeds, it merely needs to check that the server's secret semaphore exists: if so, it succeeds; otherwise it resets the connection and fails.

The logic is fundamentally the same in the server (when the accept(2) call succeeds, just check for the existence of the client's secret semaphore), except for one annoying twist. It is possible for a client that doesn't read from the socket to connect to a server, write some data, and close the socket *before* the server's accept(2) call returns that connection (i.e. the whole thing happens with the client connection sitting on the server's pending connection queue). In this case, the server would reject the connection, since the client could have closed its secret semaphore too soon. Thus, for this one problem, the server does release the client's semaphore (as an "I've seen your secret object" signal) and the client waits on this signal in its close(2) code.
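A minimal sketch of the new existence-based check (again a Python simulation: the dict stands in for the Win32 named-object namespace, OpenSemaphore succeeding or failing is modelled as a key lookup, and the name format below is invented for illustration, not Cygwin's actual naming scheme):

```python
# Simulated Win32 named-object namespace.
named_objects = {}

def secret_name(port, key):
    # Hypothetical name format -- the real names are built from the
    # port number and the random key stored in the socket file.
    return f"af-local-secret-{port}-{key}"

def create_secret(port, key):
    # Models CreateSemaphore() with the secret name.
    named_objects[secret_name(port, key)] = "semaphore"

def peer_secret_exists(port, key):
    # Models OpenSemaphore() succeeding vs. failing.
    return secret_name(port, key) in named_objects

# A legitimate server reads the key from the socket file and creates
# its secret object BEFORE listening; the client does likewise before
# calling connect(2).
create_secret(5000, "deadbeef")

# After connect(2) succeeds, the client merely checks existence:
print(peer_secret_exists(5000, "deadbeef"))   # True: authenticated
print(peer_secret_exists(5000, "wrongkey"))   # False: reset and fail
```

Because the check is for existence rather than a wait on a shared object, two clients checking concurrently can no longer steal each other's wakeups.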
For simplicity in the code, when a secret semaphore is duplicated (by dup(2) or fork(2), for example), it is also released (in both the server and the client). The server also releases its secret semaphore as soon as it creates it. This means that, in the server, the release count on the semaphore equals the number of handles there are to that socket (and thus to that semaphore), while in the client the release count is one lower than the handle count, so the client blocks in its last close until the server has signalled it too.

** Grovelling conclusion:

I think this patch is really groovy and I've just spent the last week and a half sweating over it, so please apply it :-)

// Conrad
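The release-count bookkeeping above can be sketched as follows (a Python simulation of the accounting, not the Cygwin implementation): every duplication releases the semaphore once, every close acquires it once, and only the server releases on creation, so the client's last close blocks until the server's extra "I've seen your secret" release arrives.

```python
import threading
import time

class SecretSem:
    """Model of the secret-semaphore release-count bookkeeping."""
    def __init__(self, is_server):
        self.sem = threading.Semaphore(0)
        self.handles = 1
        if is_server:
            self.sem.release()   # server releases on creation: count == handles

    def dup(self):
        # dup(2)/fork(2): one more handle, one more release (both sides).
        self.handles += 1
        self.sem.release()

    def close(self):
        # Every close consumes one release; in the client, the last close
        # finds no release left and blocks until the server supplies one.
        self.handles -= 1
        self.sem.acquire()

client = SecretSem(is_server=False)
client.dup()      # handles=2, release count=1
client.close()    # handles=1, release count=0: does not block

finished = threading.Event()

def last_close():
    client.close()   # blocks: the client is one release short
    finished.set()

t = threading.Thread(target=last_close, daemon=True)
t.start()
time.sleep(0.2)
print(finished.is_set())   # False: still blocked in the last close

client.sem.release()       # the server's "I've seen your secret" signal
t.join(timeout=5)
print(finished.is_set())   # True: the close can now complete
```

This is also why, as noted under "Known issues", a hung server leaves the client stuck in its final close.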
Attachment:
ChangeLog.txt
Description: Text document
Attachment:
af_local.patch.bz2
Description: Binary data