[stunnel-users] stunnel randomly crashing
Chris Knipe
savage at savage.za.org
Thu Feb 2 15:10:38 CET 2017
Hi All,
Let me first get the formalities out of the way:
stunnel 5.40 on x86_64-unknown-linux-gnu platform
Compiled/running with OpenSSL 1.0.1f 6 Jan 2014
Threading:PTHREAD Sockets:POLL,IPv6 TLS:ENGINE,FIPS,OCSP,PSK,SNI
I compiled the latest version of stunnel today, and yes, I know I am running an old/insecure version of OpenSSL.
stunnel.cnf
debug = debug
pid = /var/run/stunnel.pid
socket = l:TCP_NODELAY=1
socket = r:TCP_NODELAY=1
ciphers = ALL
options = NO_SSLv2
fips = no
[my.service]
accept = *:501
CAfile = /etc/stunnel/my.service.ca
cert = /etc/stunnel/my.service.pem
exec = /path/to/my/server
TIMEOUTclose = 0
Here’s what’s been happening. I’ve been running stunnel with the above service for almost 4 or 5 years now, BUT, always under xinetd. I never had a single issue; stunnel served me well, and I was rather happy. Lately, however, the number of connections (and, I guess more importantly, the rate of incoming connections) has been increasing steadily on the server. After lots of debugging, we determined the load was due to xinetd constantly firing up new stunnel processes. And by a lot, I am talking about 20-30+ connections/sec.
So today, I took the time and changed our entire cluster of 17 servers…. All servers were upgraded to the latest version (we were on 5.24 previously), and instead of using xinetd I amended the configurations so that stunnel now runs in daemon mode (under root). For the most part, it works absolutely fine. As you can see, the configuration is in maximum debugging mode, so I am getting as much info as I can out of the logs. The logs show absolutely nothing as to what is happening and why ☹ I’m more than happy to provide the logs to someone to look at, but there are THOUSANDS of connections and debug messages, so it’s large, very large.
After a seemingly random amount of time (from a few minutes to a few hours), and after successfully accepting THOUSANDS of connections, stunnel just dies. Nothing abnormal is logged, nothing in dmesg, no crash dump. The process simply dies. By default stunnel accepts 500 connections (which is a bit low), but I have also confirmed that it is not running out of file descriptors. When it does run out, stunnel logs the appropriate connection refused messages and continues to run (i.e. it does not crash; we’ve specifically tested that).
# cat /proc/7095/limits
Limit                     Soft Limit    Hard Limit    Units
Max cpu time              unlimited     unlimited     seconds
Max file size             unlimited     unlimited     bytes
Max data size             unlimited     unlimited     bytes
Max stack size            8388608       unlimited     bytes
Max core file size        0             unlimited     bytes
Max resident set          unlimited     unlimited     bytes
Max processes             257585        257585        processes
Max open files            1024          4096          files
Max locked memory         65536         65536         bytes
Max address space         unlimited     unlimited     bytes
Max file locks            unlimited     unlimited     locks
Max pending signals       257585        257585        signals
Max msgqueue size         819200        819200        bytes
Max nice priority         0             0
Max realtime priority     0             0
Max realtime timeout      unlimited     unlimited     us
# ls /proc/7095/fd/ | wc -l
956
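For anyone wanting to reproduce the check above, this is roughly how I compare the descriptor count against the soft/hard "Max open files" limits (the PID here is a placeholder; it was 7095 in my case):

```shell
# Compare a process's open-descriptor count against its soft/hard
# "Max open files" limits; PID is a placeholder (7095 above)
PID=$$
ls "/proc/$PID/fd" | wc -l
awk '/Max open files/ {print "soft:", $4, "hard:", $5}' "/proc/$PID/limits"
```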
We are very far from the limits (from what I understand, at least; since we are running as root, the hard limit should apply?).
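One thing the limits dump does show is a soft core-file limit of 0, which by itself would explain why a dying process leaves nothing behind. A hedged sketch for capturing a core next time, assuming stunnel is restarted from the same shell (the core_pattern path is just an example):

```shell
# The soft core-file limit above is 0, so a crash leaves no core dump.
# Raise it for this shell, then restart stunnel from here so the daemon
# inherits the new limit.
ulimit -c unlimited
ulimit -c
# Optionally direct cores to a known path (requires root; path is an example):
# echo '/var/tmp/core.%e.%p' > /proc/sys/kernel/core_pattern
```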
Except for the fact that the servers are very busy in terms of incoming connections/sec (although +- 20/sec surely can’t be that much?), is there anything else I could possibly look at? After moving from xinetd to daemon mode, the load on the server dropped by > 60%, so the saving is significant and I don’t want to go back to xinetd mode if I can avoid it. It also means the machine is no longer under any significant strain, so load shouldn’t be a factor affecting stunnel.
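In case the 500-connection default ceiling is tied to the open-file limit the daemon inherits (an assumption on my side), one low-risk experiment is to raise the soft limit to the hard limit in the script that launches stunnel. The wrapper and binary path below are hypothetical:

```shell
#!/bin/sh
# Hypothetical launch wrapper: raise the open-file soft limit to the hard
# limit so the stunnel daemon inherits the higher ceiling.
ulimit -n "$(ulimit -Hn)"
ulimit -n
# exec /usr/local/bin/stunnel /etc/stunnel/stunnel.cnf   # example path
```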
Thnx,
Chris.