I ran into an issue with stunnel (verified with the latest version) where under certain conditions, client connections will hang in the "Waiting for a libwrap process" state.
The problem is a race condition in src/libwrap.c:
    --num_busy; /* the child process has been released */
    if(num_busy==num_processes-1) { /* need to wake up a thread */
        retval=pthread_cond_signal(&cond); /* signal waiting threads */
        if(retval) {
            errno=retval;
            ioerror("pthread_cond_signal");
            longjmp(c->err, 1);
        }
    }
Removing the statement "if (num_busy==num_processes-1)" is one way to fix the problem.
The race occurs under the following conditions:
- All libwrap processes are busy, and multiple threads are waiting [num_busy == num_processes].
- One thread releases its libwrap process (this signals one of the waiting threads).
- Another thread releases its libwrap process before the signaled thread has a chance to acquire one and increment num_busy. Because num_busy is now num_processes-2, this second release does not signal anyone.
Once that happens, all queued threads are stuck waiting on the condition variable, until stunnel is hit with enough concurrent connections to max out num_busy again. (Depending on traffic patterns, this may not happen for a long time.)
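To illustrate the proposed fix, here is a minimal standalone sketch of the same acquire/release pattern with the signal sent unconditionally. It is not the stunnel source: NUM_PROCESSES, pool_acquire, pool_release, worker and run_workers are hypothetical names, and the sketch assumes that num_busy is protected by a mutex and that waiters re-check the counter in a loop after waking up.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_PROCESSES 2 /* hypothetical pool size, for illustration only */
#define MAX_THREADS 64  /* hypothetical cap on worker threads */

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int num_busy = 0; /* slots currently in use */

/* Claim a slot, blocking while every slot is busy. */
static void pool_acquire(void) {
    pthread_mutex_lock(&mutex);
    while (num_busy == NUM_PROCESSES) /* re-check after every wakeup */
        pthread_cond_wait(&cond, &mutex);
    ++num_busy;
    assert(num_busy <= NUM_PROCESSES);
    pthread_mutex_unlock(&mutex);
}

/* Release a slot and signal on every release, not only when
 * num_busy drops to NUM_PROCESSES-1. */
static void pool_release(void) {
    pthread_mutex_lock(&mutex);
    --num_busy;
    pthread_cond_signal(&cond); /* harmless if nobody is waiting */
    pthread_mutex_unlock(&mutex);
}

/* Each worker repeatedly claims and releases a slot. */
static void *worker(void *arg) {
    int iterations = *(int *)arg;
    for (int i = 0; i < iterations; ++i) {
        pool_acquire();
        pool_release();
    }
    return NULL;
}

/* Start up to MAX_THREADS workers, wait for all of them, and
 * return the final value of num_busy (0 if every slot was released). */
static int run_workers(int n_threads, int iterations) {
    pthread_t tids[MAX_THREADS];
    int created = 0;
    if (n_threads > MAX_THREADS)
        n_threads = MAX_THREADS;
    for (int i = 0; i < n_threads; ++i) {
        if (pthread_create(&tids[created], NULL, worker, &iterations) == 0)
            ++created;
    }
    for (int i = 0; i < created; ++i)
        pthread_join(tids[i], NULL);
    return num_busy;
}
```

With the unconditional signal, every release wakes at most one waiter, so two back-to-back releases wake two waiters instead of one. Calling run_workers(8, 2000) with more threads than slots should return 0 rather than hang.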
Andy Skalski
Andrew Skalski wrote:
> Removing the statement "if (num_busy==num_processes-1)" is one way to fix the problem.
Yes, this is probably the best way to deal with this problem.
Thank you for reporting and investigating this issue.
Mike