This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: 1.5.20(0.156/4/2) pipe hangs, dos files


Lev Bishop wrote:
On 8/1/06, Darryl Miles wrote:
I am still interested in tackling the whole situation but I do need to
be furnished with a testcase to work with.  I believe the original
comeback by the group of users running "unison" should have insisted a
testcase was produced by them to demonstrate the new breakage.

As I recall, the "group of users running unison" was the exact same group as the group who developed the currently-commented-out code in select.cc, so there wasn't any particular need for them to provide themselves a test case....

I'm sure it's all explained in the mailing list archives. Basically,
the NtQueryInformationFile() gives back the amount of non-paged pool
used by the pipe, which is only the same thing as the amount of data
available to read in the case that there are no outstanding read()s on
the pipe. Otherwise, the commented-out code can cause a write()r to
deadlock any time the process at the other end of the pipe issues a
read() for more than a pipe buffer's worth of data. This is much worse
than the current situation, where a non-blocking write can
occasionally block, which in turn may cause (serious) performance
issues but rarely a total deadlock. (After all, cygwin is not an rtos
and is allowed to have arbitrary delays at any point in the code
without violating the posix semantics, so long as the write()
*eventually* returns.)
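
The failure mode described above can be sketched with a toy model. This is not the Windows kernel's accounting and it calls no real API: the struct, its fields and the toy_* functions are hypothetical, and the model simply assumes that a blocked read() is charged against the same quota figure as buffered data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not Windows structures. */
struct toy_pipe {
    size_t buffer_size;    /* capacity of the pipe buffer, in bytes */
    size_t data_buffered;  /* bytes written but not yet read */
    size_t pending_read;   /* size of a read() currently blocked on the pipe */
};

/* ASSUMPTION: the quota figure counts buffered data plus any pending
 * read request, capped at the buffer size. */
static size_t toy_quota_used(const struct toy_pipe *p)
{
    assert(p->data_buffered <= p->buffer_size);
    size_t used = p->data_buffered + p->pending_read;
    return used > p->buffer_size ? p->buffer_size : used;
}

/* What a select() driven by the quota figure would conclude. */
static bool toy_select_writable(const struct toy_pipe *p)
{
    return toy_quota_used(p) < p->buffer_size;
}

/* What is actually true: there is room unless buffered data fills the pipe. */
static bool toy_really_writable(const struct toy_pipe *p)
{
    return p->data_buffered < p->buffer_size;
}
```

In this model, an empty 4096-byte pipe with an 8192-byte read() pending is reported as not writable even though it has room. The writer then waits for writability while the reader waits for data, which is the deadlock described above.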

Okay, you seem to have some understanding of how and why it failed for the "unison" group of users. Do you think the commented-out code is fixable in any way, so that all cases work correctly?


The problem at the moment is that Corinna (and I, for that matter) would like someone to explain how the NtQueryInformationFile() approach is broken.

I find it difficult to understand how a query function can have the side effect of deadlocking other I/O. So, for the uninitiated, I'd like to hear a clear, simple description of the sequence of events from someone who understands it.

Maybe the deadlock you are referring to is a problem where NtQueryInformationFile() fails to see data which is actually in the pipe, so the deadlock comes from select() never returning the correct events when it should, i.e. the exact opposite of the current problem, where it always reports writability even when it shouldn't.



If you, Corinna and I can all get to that level of understanding, then maybe we can all take a look at my proposed approach to the problem: converting all writes (blocking and non-blocking alike) on pipes into overlapped I/O requests and double-buffering the written data. Any blocking semantics we need are created in Cygwin code by putting the thread to sleep. This also means we should be able to wake up correctly for signals too.

Kernel buffer resource limits are imposed by a simple outstanding-byte counter: we start returning EAGAIN once more than the order of 'ulimit -p' worth of writes is outstanding.

Checking the writability of a given FD is then simply a case of checking whether the outstanding-byte counter has dropped below the low-water buffering mark, and providing a wakeup to select() every time it does.
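
A minimal sketch of that accounting follows, with the overlapped I/O itself left out. The struct and the acct_* names are hypothetical, the limit field merely stands in for whatever value would be derived from 'ulimit -p', and none of this is existing Cygwin code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-pipe accounting; not taken from Cygwin sources. */
struct write_acct {
    size_t outstanding;  /* bytes in overlapped writes not yet completed */
    size_t limit;        /* cap; stands in for a 'ulimit -p' derived value */
    size_t lowater;      /* below this mark the FD counts as writable */
};

/* Called when a write is queued. Returns 0 and charges the bytes,
 * or EAGAIN once more than `limit` bytes are already outstanding. */
static int acct_submit(struct write_acct *a, size_t len)
{
    if (a->outstanding > a->limit)
        return EAGAIN;
    a->outstanding += len;
    return 0;
}

/* What select() would report for the write side. */
static bool acct_writable(const struct write_acct *a)
{
    return a->outstanding < a->lowater;
}

/* Called when an overlapped write completes. Returns true when the
 * counter has just dropped below the low-water mark, i.e. when any
 * select() waiting on this FD should be woken. */
static bool acct_complete(struct write_acct *a, size_t len)
{
    assert(len <= a->outstanding);
    bool was_writable = acct_writable(a);
    a->outstanding -= len;
    return !was_writable && acct_writable(a);
}
```

A blocking write() would sit on top of this by sleeping until acct_complete() signals a wakeup, which is also the point where a signal could interrupt the sleep.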



Again, thank you for your response. The main problem with this issue is that not many people know much about its history and the technical reasons.

Darryl


