Hi,
I'm running stunnel (ubuntu 4.42-1, confirmed with 4.54 that has been built from source) against a backend I've only limited control over which was running fine for a long time.
At some point in the last few days, the clients stopped being able to make *concurrent* HTTP requests (by different threads) over the stunnel. The client applications using the tunnel enounter a 'Connection refused' error. The issue has been isolated to curl requests (instead of the original client application).
Running a tcpdump on both, loopback and uplink, showed that every additional request results in an ACK from the stunnel to both the client and the remote side and a few ms later a RST, ACK is sent in the *same* direction (again from the stunnel to both client and remote end).
The stunnel log shows an error about a broken pipe whenever these failures occur:
Feb 18 11:09:24 x2 stunnel: LOG7[555:139702117717760]: Socket closed on read Feb 18 11:09:24 x2 stunnel: LOG7[555:139702117717760]: Sending SSL write shutdown Feb 18 11:09:24 x2 stunnel: LOG7[555:139702117717760]: SSL alert (write): warning: close notify Feb 18 11:09:24 x2 stunnel: LOG6[555:139702117717760]: SSL_shutdown successfully sent close_notify Feb 18 11:09:24 x2 stunnel: LOG5[555:139702117717760]: Error detected on socket (read) file descriptor: Broken pipe (32) Feb 18 11:09:24 x2 stunnel: LOG5[555:139702117717760]: Connection reset: 1320 bytes sent to SSL, 48625 bytes sent to socket Feb 18 11:09:24 x2 stunnel: LOG7[555:139702117717760]: Service xtunnel finished (4 left)
An strace on the stunnel shows:
6425 <... setsockopt resumed> ) = 0 6425 close(18 <unfinished ...> 6465 poll([{fd=15, events=POLLOUT}, {fd=22, events=POLLIN}], 2, 43200000 <unfinished ...> 6425 <... close resumed> ) = 0 6465 <... poll resumed> ) = 2 ([{fd=15, revents=POLLOUT|POLLERR|POLLHUP}, {fd=22, revents=POLLIN}]) 6425 sendto(13, "<31>Feb 18 11:34:35 stunnel: LOG7[6395:140601236756224]: Service xtunnel finished (4 left)", 99, MSG_NOSIGNAL, NULL, 0 <unf inished ...> 6465 getsockopt(15, SOL_SOCKET, SO_ERROR, [32], [4]) = 0 6425 <... sendto resumed> ) = 99 6465 futex(0x7fe045ec3754, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> 6425 futex(0x7fe045ec3754, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> 6465 <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) 6425 <... futex resumed> ) = 0 6465 sendto(13, "<29>Feb 18 11:34:35 stunnel: LOG5[6395:140601236686592]: Error detected on socket (read) file descriptor: Broken pipe (32)", 122, MSG_NOSIGNAL, NULL, 0 <unfinished ...> 6425 write(2, "2013.02.15 11:34:35 LOG7[6395:140601236756224]: Service xtunnel finished (4 left)\n", 91 <unfinished ...> 6465 <... sendto resumed> ) = 122 6425 <... write resumed> ) = 91 6465 futex(0x7fe045ec3754, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> 6425 sendto(13, "<31>Feb 18 11:34:35 stunnel: LOG7[6395:140601236756224]: str_stats: 0 block(s), 0 byte(s)", 89, MSG_NOSIGNAL, NULL, 0 <unfinished ...> 6465 <... futex resumed> ) = 0 6465 write(2, "2013.02.15 11:34:35 LOG5[6395:140601236686592]: Error detected on socket (read) file descriptor: Broken pipe (32)\n", 114 <unfinished ...>
$ ps auxww|grep stunnel4 shows: root 28587 0.1 0.0 437844 7780 ? Ss 09:37 0:11 /usr/bin/stunnel4 /etc/xtunnel/stunnel.conf root 29170 0.0 0.0 43876 1180 ? S 09:37 0:00 /usr/bin/stunnel4 /etc/xtunnel/stunnel.conf root 29171 0.0 0.0 43876 1180 ? S 09:37 0:00 /usr/bin/stunnel4 /etc/xtunnel/stunnel.conf root 29172 0.0 0.0 43876 1180 ? S 09:37 0:00 /usr/bin/stunnel4 /etc/xtunnel/stunnel.conf root 29173 0.0 0.0 43876 1180 ? S 09:37 0:00 /usr/bin/stunnel4 /etc/xtunnel/stunnel.conf root 29174 0.0 0.0 43876 1180 ? S 09:37 0:00 /usr/bin/stunnel4 /etc/xtunnel/stunnel.conf
The stunnel config looks like this:
foreground=yes debug=7 pid=/var/run/xtunnel.pid client=yes
CAfile=/etc/ssl/certs/xtunnel-chains.pem verify=3
engine=dynamic engineCtrl=SO_PATH:/usr/lib/engines/engine_pkcs11.so engineCtrl=ID:pkcs11 engineCtrl=LIST_ADD:1 engineCtrl=LOAD engineCtrl=MODULE_PATH:libopencryptoki.so.0 engineCtrl=INIT engineCtrl=PIN:XXXXXX
[xtunnel] engineNum=1 key=id_4 cert=/etc/xtunnel/client-cert.pem accept=127.0.0.1:9999 connect=192.168.200.1:8443
Does anyone have an idea what could be going wrong here or how to troubleshoot this any further?
Thanks! Jan