Hi,
I'm running stunnel (ubuntu 4.42-1, confirmed with 4.54 that has been
built from source) against a backend I've only limited control over
which was running fine for a long time.
At some point in the last few days, the clients stopped being able to
make *concurrent* HTTP requests (by different threads) over the
stunnel. The client applications using the tunnel enounter a
'Connection refused' error. The issue has been isolated to curl
requests (instead of the original client application).
Running a tcpdump on both, loopback and uplink, showed that every
additional request results in an ACK from the stunnel to both the
client and the remote side and a few ms later a RST, ACK is sent in
the *same* direction (again from the stunnel to both client and remote
end).
The stunnel log shows an error about a broken pipe whenever these
failures occur:
Feb 18 11:09:24 x2 stunnel: LOG7[555:139702117717760]: Socket closed on read
Feb 18 11:09:24 x2 stunnel: LOG7[555:139702117717760]: Sending SSL
write shutdown
Feb 18 11:09:24 x2 stunnel: LOG7[555:139702117717760]: SSL alert
(write): warning: close notify
Feb 18 11:09:24 x2 stunnel: LOG6[555:139702117717760]: SSL_shutdown
successfully sent close_notify
Feb 18 11:09:24 x2 stunnel: LOG5[555:139702117717760]: Error detected
on socket (read) file descriptor: Broken pipe (32)
Feb 18 11:09:24 x2 stunnel: LOG5[555:139702117717760]: Connection
reset: 1320 bytes sent to SSL, 48625 bytes sent to socket
Feb 18 11:09:24 x2 stunnel: LOG7[555:139702117717760]: Service xtunnel
finished (4 left)
An strace on the stunnel shows:
6425 <... setsockopt resumed> ) = 0
6425 close(18 <unfinished ...>
6465 poll([{fd=15, events=POLLOUT}, {fd=22, events=POLLIN}], 2,
43200000 <unfinished ...>
6425 <... close resumed> ) = 0
6465 <... poll resumed> ) = 2 ([{fd=15,
revents=POLLOUT|POLLERR|POLLHUP}, {fd=22, revents=POLLIN}])
6425 sendto(13, "<31>Feb 18 11:34:35 stunnel:
LOG7[6395:140601236756224]: Service xtunnel finished (4 left)", 99,
MSG_NOSIGNAL, NULL, 0 <unf
inished ...>
6465 getsockopt(15, SOL_SOCKET, SO_ERROR, [32], [4]) = 0
6425 <... sendto resumed> ) = 99
6465 futex(0x7fe045ec3754, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
6425 futex(0x7fe045ec3754, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
6465 <... futex resumed> ) = -1 EAGAIN (Resource
temporarily unavailable)
6425 <... futex resumed> ) = 0
6465 sendto(13, "<29>Feb 18 11:34:35 stunnel:
LOG5[6395:140601236686592]: Error detected on socket (read) file
descriptor: Broken pipe (32)", 122, MSG_NOSIGNAL, NULL, 0 <unfinished
...>
6425 write(2, "2013.02.15 11:34:35 LOG7[6395:140601236756224]:
Service xtunnel finished (4 left)\n", 91 <unfinished ...>
6465 <... sendto resumed> ) = 122
6425 <... write resumed> ) = 91
6465 futex(0x7fe045ec3754, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
6425 sendto(13, "<31>Feb 18 11:34:35 stunnel:
LOG7[6395:140601236756224]: str_stats: 0 block(s), 0 byte(s)", 89,
MSG_NOSIGNAL, NULL, 0 <unfinished ...>
6465 <... futex resumed> ) = 0
6465 write(2, "2013.02.15 11:34:35 LOG5[6395:140601236686592]: Error
detected on socket (read) file descriptor: Broken pipe (32)\n", 114
<unfinished ...>
$ ps auxww|grep stunnel4 shows:
root 28587 0.1 0.0 437844 7780 ? Ss 09:37 0:11
/usr/bin/stunnel4 /etc/xtunnel/stunnel.conf
root 29170 0.0 0.0 43876 1180 ? S 09:37 0:00
/usr/bin/stunnel4 /etc/xtunnel/stunnel.conf
root 29171 0.0 0.0 43876 1180 ? S 09:37 0:00
/usr/bin/stunnel4 /etc/xtunnel/stunnel.conf
root 29172 0.0 0.0 43876 1180 ? S 09:37 0:00
/usr/bin/stunnel4 /etc/xtunnel/stunnel.conf
root 29173 0.0 0.0 43876 1180 ? S 09:37 0:00
/usr/bin/stunnel4 /etc/xtunnel/stunnel.conf
root 29174 0.0 0.0 43876 1180 ? S 09:37 0:00
/usr/bin/stunnel4 /etc/xtunnel/stunnel.conf
The stunnel config looks like this:
foreground=yes
debug=7
pid=/var/run/xtunnel.pid
client=yes
CAfile=/etc/ssl/certs/xtunnel-chains.pem
verify=3
engine=dynamic
engineCtrl=SO_PATH:/usr/lib/engines/engine_pkcs11.so
engineCtrl=ID:pkcs11
engineCtrl=LIST_ADD:1
engineCtrl=LOAD
engineCtrl=MODULE_PATH:libopencryptoki.so.0
engineCtrl=INIT
engineCtrl=PIN:XXXXXX
[xtunnel]
engineNum=1
key=id_4
cert=/etc/xtunnel/client-cert.pem
accept=127.0.0.1:9999
connect=192.168.200.1:8443
Does anyone have an idea what could be going wrong here or how to
troubleshoot this any further?
Thanks!
Jan