[stunnel-users] stunnel service leaves stale connections to rsync on the server side

Polyack, Steve Steve.Polyack at intermedix.com
Wed Oct 22 20:25:52 CEST 2014


This turned out to be an issue with rsync-3.1.0 that is fixed in rsync-3.1.1 (a rough sketch of what the fix means in practice follows the quoted entry):
    - Fixed a bug in the early daemon protocol where a timeout failed to be
      honored (e.g. if the remote side fails to send us the initial protocol
      greeting).
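
In other words, the 3.1.0 daemon could keep waiting on the client's initial greeting without enforcing the timeout. Purely as an illustration (this is not the actual rsync change), the fixed behavior amounts to bounding the wait and treating a timeout or EOF as fatal rather than retrying forever:

#include <sys/select.h>
#include <unistd.h>

/* Illustrative sketch only, not rsync source: bound the wait for the
 * client's initial greeting with select(), and give up on a timeout
 * instead of blocking or spinning indefinitely. */
static ssize_t read_greeting_byte(int fd, char *buf, int timeout_secs)
{
    fd_set rfds;
    struct timeval tv = { .tv_sec = timeout_secs, .tv_usec = 0 };

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    if (select(fd + 1, &rfds, NULL, NULL, &tv) <= 0)
        return -1;              /* timed out or select() failed: give up */

    return read(fd, buf, 1);    /* 0 here means EOF and must also end the session */
}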

Thanks,

Steve


From: stunnel-users [mailto:stunnel-users-bounces at stunnel.org] On Behalf Of Polyack, Steve
Sent: Wednesday, October 22, 2014 12:36 PM
To: stunnel-users at stunnel.org
Cc: Saathoff, Graham
Subject: [stunnel-users] stunnel service leaves stale connections to rsync on the server side

Hi,

We've been running an rsync service over stunnel for some time now.  After upgrading our environment (moving from an old Linux release to FreeBSD 9.3, and from stunnel-4.29 to stunnel-5.06), a problem has popped up.
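
For context, the server side is simply stunnel terminating SSL in front of the local rsync daemon. A minimal sketch of that kind of configuration looks roughly like the following; the certificate paths and external accept port are assumptions, and only the [rsync] service name and the 127.0.0.1:873 backend come from the logs further down:

; server-side stunnel.conf sketch (illustrative values)
cert = /usr/local/etc/stunnel/rsync.pem
key = /usr/local/etc/stunnel/rsync.key

[rsync]
; external TLS port (assumed)
accept = 874
; local rsync daemon, as seen in the stunnel logs below
connect = 127.0.0.1:873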

Ever since switching over, rsync daemon processes slowly build up on our server, each consuming 100% CPU.  The vast majority of our clients work just fine; the problematic rsync processes seem to correspond to dead/aborted client connections (the clients use wireless aircards, so connectivity can be spotty).

A quick 'truss' on one of the processes shows it doing nothing but calling read() in a tight loop, with every call returning 0 (a minimal sketch of this failure mode follows the output):
# truss -p 14689 | head
read(6,0x7fffffff8240,1)                                                = 0 (0x0)
read(6,0x7fffffff8240,1)                                                = 0 (0x0)
read(6,0x7fffffff8240,1)                                                = 0 (0x0)
read(6,0x7fffffff8240,1)                                                = 0 (0x0)
read(6,0x7fffffff8240,1)                                                = 0 (0x0)
read(6,0x7fffffff8240,1)                                                = 0 (0x0)
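
To be clear about why this burns CPU, here is a minimal illustration of the suspected failure mode (not rsync source): read() returning 0 means the peer closed the connection (EOF), not "nothing yet, try again", so a loop that only treats -1 as terminal will spin forever once the socket is closed.

#include <unistd.h>

/* Illustrative sketch of the suspected busy loop (not rsync source). */
static int read_one_byte(int fd, char *buf)
{
    for (;;) {
        ssize_t n = read(fd, buf, 1);
        if (n == 1)
            return 0;       /* got a byte */
        if (n < 0)
            return -1;      /* real error */
        /* n == 0 (EOF) falls through and read() is called again forever,
         * which matches the truss output above */
    }
}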

Furthermore, the connection appears to be closed:
# sockstat -4 | grep 14689
root     rsync      14689 3  tcp4   127.0.0.1:873     127.0.0.1:27724
# netstat -an | grep 27724
netstat: kvm not available: /dev/mem: No such file or directory
tcp4       0      0 127.0.0.1.873      127.0.0.1.27724    CLOSED

The stunnel logs suggest that hardly any data was ever passed, only 56 bytes in each direction (perhaps the client gave up?):
2014.10.22 10:45:52 LOG7[34384941056]: Service [rsync] started
2014.10.22 10:45:52 LOG5[34384941056]: Service [rsync] accepted connection from 1.1.1.1:63095
2014.10.22 10:45:52 LOG7[34384941056]: SSL state (accept): before/accept initialization
2014.10.22 10:45:52 LOG7[34384941056]: SSL state (accept): SSLv3 read client hello A
2014.10.22 10:45:52 LOG7[34384941056]: SSL state (accept): SSLv3 write server hello A
2014.10.22 10:45:52 LOG7[34384941056]: SSL state (accept): SSLv3 write certificate A
2014.10.22 10:45:52 LOG7[34384941056]: SSL state (accept): SSLv3 write server done A
2014.10.22 10:45:52 LOG7[34384941056]: SSL state (accept): SSLv3 flush data
2014.10.22 10:45:52 LOG7[34384941056]: SSL state (accept): SSLv3 read client key exchange A
2014.10.22 10:45:52 LOG7[34384941056]: SSL state (accept): SSLv3 read finished A
2014.10.22 10:45:52 LOG7[34384941056]: SSL state (accept): SSLv3 write change cipher spec A
2014.10.22 10:45:52 LOG7[34384941056]: SSL state (accept): SSLv3 write finished A
2014.10.22 10:45:52 LOG7[34384941056]: SSL state (accept): SSLv3 flush data
2014.10.22 10:45:52 LOG7[34384941056]:   41 items in the session cache
2014.10.22 10:45:52 LOG7[34384941056]:    0 client connects (SSL_connect())
2014.10.22 10:45:52 LOG7[34384941056]:    0 client connects that finished
2014.10.22 10:45:52 LOG7[34384941056]:    0 client renegotiations requested
2014.10.22 10:45:52 LOG7[34384941056]:  110 server connects (SSL_accept())
2014.10.22 10:45:52 LOG7[34384941056]:  107 server connects that finished
2014.10.22 10:45:52 LOG7[34384941056]:    0 server renegotiations requested
2014.10.22 10:45:52 LOG7[34384941056]:   66 session cache hits
2014.10.22 10:45:52 LOG7[34384941056]:    0 external session cache hits
2014.10.22 10:45:52 LOG7[34384941056]:   32 session cache misses
2014.10.22 10:45:52 LOG7[34384941056]:    0 session cache timeouts
2014.10.22 10:45:52 LOG6[34384941056]: No peer certificate received
2014.10.22 10:45:52 LOG6[34384941056]: SSL accepted: new session negotiated
2014.10.22 10:45:52 LOG6[34384941056]: Negotiated SSLv3 ciphersuite AES256-SHA (256-bit encryption)
2014.10.22 10:45:52 LOG6[34384941056]: Compression: null, expansion: null
2014.10.22 10:45:52 LOG6[34384941056]: s_connect: connecting 127.0.0.1:873
2014.10.22 10:45:52 LOG5[34384941056]: s_connect: connected 127.0.0.1:873
2014.10.22 10:45:52 LOG5[34384941056]: Service [rsync] connected remote server from 127.0.0.1:27724
2014.10.22 10:45:52 LOG7[34384941056]: Remote socket (FD=11) initialized
2014.10.22 11:01:12 LOG3[34384941056]: SSL_read: Connection reset by peer (54)
2014.10.22 11:01:12 LOG5[34384941056]: Connection reset: 56 byte(s) sent to SSL, 56 byte(s) sent to socket
2014.10.22 11:01:12 LOG7[34384941056]: Remote socket (FD=11) closed
2014.10.22 11:01:12 LOG7[34384941056]: Local socket (FD=10) closed
2014.10.22 11:01:12 LOG7[34384941056]: Service [rsync] finished (4 left)

The clients are generally field units running stunnel versions in the 4.x range.  Unfortunately, updating them is not easy and is not really under my control.  Either way, this looks like a server-side issue, and possibly an issue with rsync (we're running 3.1.0).

Has anyone run into anything like this before?  Does anyone have any suggestions for additional debugging we can do to try to pinpoint the problem?  We are going to try rolling back to stunnel-4.29 to see if the problem still occurs there on the new system.

Thanks,

Steve