Hi,
I've spent the better part of the day trying to find answers on this,
and haven't had much luck.
I have two systems that talk to each other via stunnel. In the last week
or so, I have noticed a number of errors in the application using the
tunnel (a messaging application), requiring multiple re-sends, etc.
This is my setup.
Both server and client run CentOS 6.2, with the RPM version of stunnel.
Though it isn't the latest version, I believe Red Hat does release fixes
for any critical bugs in their updated RPMs. I also went through the
ChangeLog but couldn't find anything implemented since 4.29 that seemed
to deal with what I'm seeing.
stunnel 4.29 on x86_64-unknown-linux-gnu with OpenSSL 1.0.0-fips 29 Mar 2010
Threading:PTHREAD SSL:ENGINE Sockets:POLL,IPv6 Auth:LIBWRAP
Global options
debug = 5
pid = /var/run/stunnel.pid
RNDbytes = 64
RNDfile = /dev/urandom
RNDoverwrite = yes
Service-level options
cert = /etc/stunnel/stunnel.pem
ciphers = ALL:!aNULL:!eNULL:!SSLv2
key = /etc/stunnel/stunnel.pem
session = 300 seconds
stack = 65536 bytes
sslVersion = SSLv3 for client, all for server
TIMEOUTbusy = 300 seconds
TIMEOUTclose = 60 seconds
TIMEOUTconnect = 10 seconds
TIMEOUTidle = 43200 seconds
verify = none
Socket option defaults:
Option           Accept  Local   Remote  OS default
SO_DEBUG         --      --      --      0
SO_DONTROUTE     --      --      --      0
SO_KEEPALIVE     --      --      --      0
SO_LINGER        --      --      --      0:0
SO_OOBINLINE     --      --      --      0
SO_RCVBUF        --      --      --      87380
SO_SNDBUF        --      --      --      16384
SO_RCVLOWAT      --      --      --      1
SO_SNDLOWAT      --      --      --      1
SO_RCVTIMEO      --      --      --      0:0
SO_SNDTIMEO      --      --      --      0:0
SO_REUSEADDR     1       --      --      0
SO_BINDTODEVICE  --      --      --      --
TCP_KEEPCNT      --      --      --      9
TCP_KEEPIDLE     --      --      --      7200
TCP_KEEPINTVL    --      --      --      75
IP_TOS           --      --      --      0
IP_TTL           --      --      --      64
TCP_NODELAY      --      --      --      0
Common config (basically the default file shipped by Red Hat, with certs
and ports configured):
----------------
; Protocol version (all, SSLv2, SSLv3, TLSv1)
sslVersion = SSLv3
; Some security enhancements for UNIX systems - comment them out on Win32
chroot = /var/run/stunnel/
setuid = stunnel
setgid = stunnel
; PID is created inside the chroot jail
pid = /stunnel.pid
; Some debugging stuff useful for troubleshooting
debug = 7
output = /var/log/stunnel.log
-------------------
When the connection breaks, the logs (debug=7) on both sides are:
Client:
2012.03.21 19:58:14 LOG7[30685:140185543952128]: SSL state (connect):
before/connect initialization
2012.03.21 19:58:14 LOG7[30685:140185543952128]: SSL state (connect):
SSLv3 write client hello A
2012.03.21 19:58:14 LOG3[30685:140185543952128]: SSL_connect: Peer
suddenly disconnected
2012.03.21 19:58:14 LOG5[30685:140185543952128]: Connection reset: 0
bytes sent to SSL, 0 bytes sent to socket
Server:
2012.03.21 19:58:14 LOG3[22230:140462283343616]: SSL_accept: Peer
suddenly disconnected
2012.03.21 19:58:14 LOG5[22230:140462283343616]: Connection reset: 0
bytes sent to SSL, 0 bytes sent to socket
As I mentioned, it happens intermittently: roughly 50% of the
connections work just fine, and the rest are disconnected. It ALWAYS
seems to happen just after the client logs 'write client hello A', as
opposed to later in the SSL handshake.
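To back up the "always after client hello" claim with more than eyeballing, here is a rough sketch of how the client log could be tallied. It assumes each log entry has been joined back onto a single line (my mail client wrapped them above) and that the thread id in brackets identifies one connection at a time; the function name is just something I made up.

```python
import re
from collections import Counter

# Pulls the "pid:thread" tag and the message out of an stunnel log line.
LOG_LINE = re.compile(r"LOG\d\[(?P<thread>\d+:\d+)\]: (?P<msg>.*)$")
STATE_PREFIX = "SSL state (connect): "

def states_before_disconnect(lines):
    """Count, per handshake state, how often a disconnect followed it."""
    last_state = {}      # thread tag -> most recent SSL state seen
    counts = Counter()   # SSL state -> number of disconnects right after it
    for line in lines:
        m = LOG_LINE.search(line)
        if not m:
            continue
        thread, msg = m.group("thread"), m.group("msg").strip()
        if msg.startswith(STATE_PREFIX):
            last_state[thread] = msg[len(STATE_PREFIX):]
        elif "Peer suddenly disconnected" in msg:
            counts[last_state.pop(thread, "unknown")] += 1
    return counts

# The client-side entries from the failed connection, one entry per line.
sample = [
    "2012.03.21 19:58:14 LOG7[30685:140185543952128]: SSL state (connect): before/connect initialization",
    "2012.03.21 19:58:14 LOG7[30685:140185543952128]: SSL state (connect): SSLv3 write client hello A",
    "2012.03.21 19:58:14 LOG3[30685:140185543952128]: SSL_connect: Peer suddenly disconnected",
    "2012.03.21 19:58:14 LOG5[30685:140185543952128]: Connection reset: 0 bytes sent to SSL, 0 bytes sent to socket",
]
result = states_before_disconnect(sample)
print(result)
```

Run over the whole of /var/log/stunnel.log, a single dominant key would confirm that the failures all land at the same point in the handshake.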
I ran tcpdump on both sides; the output is below. Note that both the
client and the server are NATed behind firewalls. On the server side,
the port stunnel listens on (31112) is opened through the firewall.
CLIENT (192.168.22.120, NATd externally as 66.66.66.66):
18:21:57.009887 IP 192.168.22.120.55747 > 77.77.77.77.31112: Flags [S],
seq 2556598106, win 14600, options [mss 1460,sackOK,TS val 88488311 ecr
0,nop,wscale 7], length 0
18:21:57.076130 IP 77.77.77.77.31112 > 192.168.22.120.55747: Flags [S.],
seq 351052925, ack 2556598107, win 14480, options [mss 1460,sackOK,TS
val 646044220 ecr 88488311,nop,wscale 7], length 0
18:21:57.076195 IP 192.168.22.120.55747 > 77.77.77.77.31112: Flags [.],
ack 1, win 115, options [nop,nop,TS val 88488377 ecr 646044220], length 0
18:21:57.077234 IP 192.168.22.120.55747 > 77.77.77.77.31112: Flags [P.],
seq 1:140, ack 1, win 115, options [nop,nop,TS val 88488378 ecr
646044220], length 139
18:21:57.143582 IP 77.77.77.77.31112 > 192.168.22.120.55747: Flags [.],
ack 140, win 122, options [nop,nop,TS val 646044288 ecr 88488378], length 0
18:21:57.143982 IP 77.77.77.77.31112 > 192.168.22.120.55747: Flags
[RP.], seq 1, ack 140, win 122, length 0
SERVER (10.65.0.130, NATd externally as 77.77.77.77):
18:21:57.042122 IP 66.66.66.66.7760 > 10.65.0.130.http: Flags [S], seq
4074124673, win 14600, options [mss 1460,sackOK,TS val 88488311 ecr
0,nop,wscale 7], length 0
18:21:57.042161 IP 10.65.0.130.http > 66.66.66.66.7760: Flags [S.], seq
458906148, ack 4074124674, win 14480, options [mss 1460,sackOK,TS val
646044220 ecr 88488311,nop,wscale 7], length 0
18:21:57.108325 IP 66.66.66.66.7760 > 10.65.0.130.http: Flags [.], ack
1, win 115, options [nop,nop,TS val 88488377 ecr 646044220], length 0
18:21:57.109507 IP 66.66.66.66.7760 > 10.65.0.130.http: Flags [P.], seq
1:140, ack 1, win 115, options [nop,nop,TS val 88488378 ecr 646044220],
length 139
18:21:57.109532 IP 10.65.0.130.http > 66.66.66.66.7760: Flags [.], ack
140, win 122, options [nop,nop,TS val 646044288 ecr 88488378], length 0
18:21:57.110092 IP 10.65.0.130.http > 66.66.66.66.7760: Flags [P.], seq
1:178, ack 140, win 122, options [nop,nop,TS val 646044288 ecr
88488378], length 177
18:21:57.175518 IP 66.66.66.66.7760 > 10.65.0.130.http: Flags [R.], seq
140, ack 1, win 115, length 0
Based on what I am seeing, a mysterious RST packet gets received by both
sides, which causes them to terminate the session. Neither side
originates any RST packets in its own capture. Also, the server's
177-byte reply never shows up in the client capture; the client sees the
RST instead. When the connection is successful, the tcpdump looks a lot
more normal, with an appropriate FIN exchange closing the session once
the data has been sent.
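For anyone who wants to check my reading of the captures, this is a small sketch of how I would pick the RSTs out of tcpdump's text output and label which address they claim to come from. It assumes each packet is on one line (the captures above are wrapped by my mail client), and the function name and the "local"/"remote" labels are my own invention.

```python
import re

# Matches the start of a one-line tcpdump record:
# "time IP src > dst: Flags [..], ..."
PACKET = re.compile(
    r"^(?P<time>\S+) IP (?P<src>\S+) > (?P<dst>\S+): Flags \[(?P<flags>[^\]]+)\]"
)

def find_resets(lines, local_ip):
    """Return (time, src, dst, origin) for every packet with RST set."""
    resets = []
    for line in lines:
        m = PACKET.match(line.strip())
        if not m or "R" not in m.group("flags"):
            continue
        # src looks like "ip.port"; strip the port to compare addresses.
        src_ip = m.group("src").rsplit(".", 1)[0]
        origin = "local" if src_ip == local_ip else "remote"
        resets.append((m.group("time"), m.group("src"), m.group("dst"), origin))
    return resets

# The last two packets of each capture above, unwrapped.
client_capture = [
    "18:21:57.143582 IP 77.77.77.77.31112 > 192.168.22.120.55747: Flags [.], ack 140, win 122, options [nop,nop,TS val 646044288 ecr 88488378], length 0",
    "18:21:57.143982 IP 77.77.77.77.31112 > 192.168.22.120.55747: Flags [RP.], seq 1, ack 140, win 122, length 0",
]
server_capture = [
    "18:21:57.110092 IP 10.65.0.130.http > 66.66.66.66.7760: Flags [P.], seq 1:178, ack 140, win 122, options [nop,nop,TS val 646044288 ecr 88488378], length 177",
    "18:21:57.175518 IP 66.66.66.66.7760 > 10.65.0.130.http: Flags [R.], seq 140, ack 1, win 115, length 0",
]
client_resets = find_resets(client_capture, "192.168.22.120")
server_resets = find_resets(server_capture, "10.65.0.130")
print(client_resets)
print(server_resets)
```

If both lists only ever contain "remote" entries, then each host is receiving a RST carrying the peer's address while neither host's own capture shows one leaving, which is what makes me suspect something in between.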
My questions are: am I missing something obvious? Am I correct in
reading the above as some firewall or router between the two servers
breaking the TCP connection by sending TCP RSTs to both endpoints?
Any help is appreciated.
jordan