Under heavy load, stunnel-4.20 on AIX 5.3 enters a busy loop with high CPU usage.
Test setup:
    client ------------> gateway ------------> server
    ApacheBench          stunnel-4.20          Apache 1.3
    2.0.40-dev           in server mode
                         OpenSSL 0.9.8d

    stunnel 4.20 on powerpc-ibm-aix5.3.0.0 with OpenSSL 0.9.8d 28 Sep 2006
    Threading:PTHREAD SSL:ENGINE Sockets:POLL,IPv4

    ; stunnel configuration file, server mode
    cert = /etc/stunnel.pem
    setuid = stunnel
    setgid = stunnel
    pid = /tmp/stunnel.pid
    debug = local3.notice
    output = /tmp/stunnel.log

    [https]
    accept = gateway:4433
    connect = server:80

The first client requests are processed without error, then some connections fail:
    % ab -n 1000 -c 70 https://gateway:4433/
    This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
    [...]
    Test aborted after 10 failures
    Apr_socket_connect(): Invalid argument (22)
    Total of 705 requests completed

One or more connections stay in state CLOSE_WAIT, and stunnel consumes more than 99% CPU.
    % lsof -a -i -c stunnel
    COMMAND    PID    USER  FD TYPE             DEVICE SIZE/OFF NODE NAME
    stunnel 196856 stunnel  6u IPv4 0xf100020002feeb98      0t0  TCP gateway:4433 (LISTEN)
    stunnel 196856 stunnel 46u IPv4 0xf100020002e98398      0t0  TCP gateway:4433->server:50271 (CLOSE_WAIT)

In this state, attaching a debugger shows that stunnel is stuck in a loop in init_ssl() within client.c:
    +321     while(1) {
    +322         if(c->opt->option.client)
    +323             i=SSL_connect(c->ssl);
    +324         else
    +325             i=SSL_accept(c->ssl);
    +326         err=SSL_get_error(c->ssl, i); /* err==5 */
             /* ... */
    +349         if(err==SSL_ERROR_SYSCALL) {
    +350             switch(get_last_socket_error()) {
    +351             case EINTR:
    +353             case EAGAIN: /* loop continues */
    +354                 continue;
    +355             }
    +356         }

SSL_accept() returns with an error, and SSL_get_error() returns 5 (SSL_ERROR_SYSCALL). get_last_socket_error() returns EAGAIN, so the loop continues without end. Nevertheless, new connection requests still get processed.
Even before CPU usage goes up, SSL_accept() fails with SSL_ERROR_SYSCALL between 20 and 1800 times for every new connection before data is finally exchanged.
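
For illustration only: the loop quoted from client.c retries immediately on EAGAIN, so it spins if the socket never becomes ready. A minimal sketch of the alternative pattern, sleeping in poll() and giving up after a bounded number of waits, might look like the following. It uses plain read() on a non-blocking descriptor instead of SSL_accept(), and the helper names (set_nonblocking, wait_readable, read_retry) are hypothetical and not part of stunnel or OpenSSL. Whether this matches the actual cause on AIX is not confirmed.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (not from stunnel): switch fd to non-blocking mode.
 * Returns -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: sleep until fd is readable (or hung up) or until
 * timeout_ms elapses. Returns 1 if ready, 0 on timeout, -1 on error. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}

/* Hypothetical helper: read from a non-blocking fd. On EAGAIN it waits in
 * poll() instead of retrying at once, and it fails with ETIMEDOUT after
 * max_waits waits, so it can neither spin nor loop forever. */
static ssize_t read_retry(int fd, void *buf, size_t len,
                          int timeout_ms, int max_waits)
{
    int waits = 0;

    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;                 /* data, or 0 at end of file */
        if (errno == EINTR)
            continue;                 /* interrupted: retry at once */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                /* real error */
        if (waits++ >= max_waits) {
            errno = ETIMEDOUT;        /* bounded number of waits */
            return -1;
        }
        if (wait_readable(fd, timeout_ms) < 0)
            return -1;
    }
}
```

With a descriptor prepared by set_nonblocking(), read_retry(fd, buf, sizeof buf, 100, 50) would wait at most about five seconds in total before failing with ETIMEDOUT, and it returns 0 as soon as the peer has closed its end.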
stunnel enters this endless loop on AIX both with OpenSSL 0.9.8d and OpenSSL 0.9.8e. If I replace the AIX gateway machine with one running stunnel-4.20 on Solaris, I do not experience any problems.
What might trigger this loop on AIX 5.3? How can it be avoided?
-- Peter Heimann