Hi there,
I had been running stunnel successfully for many years (4.26 with an older OpenSSL), wrapping HTTP and IMAP, and upgraded for security reasons to OpenSSL 1.1.1 with stunnel 5.59.
The platform is somewhat exotic: SPARC Solaris 10 using gcc 3.4.6.
stunnel works for IMAP and HTTPS without problems for some time, but after heavy load conditions it starts consuming 100% CPU, although it keeps working.
In this case it is not the DH computation, as no ciphers depending on DH are in use:
2022.02.06 11:56:33 LOG6[ui]: DH initialization skipped: no DH ciphersuites
On the live machine it happens e.g. after some script-based port-scan attacks, and looking for a way to trigger the phenomenon I found that running a cipher enumeration, for example, puts stunnel into the "high-load" state. With nmap 7.80, the command doing the job is
nmap --script ssl-enum-ciphers -p PORT SERVER-IP
Checking the log file (level 7), the end of the log after such a burst of activity, with stunnel ending up at 100% CPU load, looks like this:
2022.02.06 14:23:17 LOG7[39]: Deallocating application specific data for session connect address
2022.02.06 14:23:17 LOG7[39]: linger (local): Invalid argument (22)
2022.02.06 14:23:17 LOG7[39]: Local descriptor (FD=22) closed
2022.02.06 14:23:17 LOG7[39]: Service [imaps] finished (6 left)
2022.02.06 14:23:14 LOG7[main]: FD=4 events=0x1 revents=0x1
2022.02.06 14:23:17 LOG7[main]: FD=11 events=0x1 revents=0x0
2022.02.06 14:23:17 LOG7[main]: FD=12 events=0x1 revents=0x0
2022.02.06 14:23:17 LOG7[main]: FD=13 events=0x1 revents=0x0
2022.02.06 14:23:17 LOG7[main]: FD=14 events=0x1 revents=0x0
2022.02.06 14:23:17 LOG7[main]: Dispatching a signal from the signal pipe
2022.02.06 14:23:17 LOG7[main]: Processing SIGCHLD
2022.02.06 14:23:17 LOG7[main]: Retrieving pid statuses with waitpid()
2022.02.06 14:23:17 LOG7[main]: Retrieving pid statuses with waitpid() - DONE
Remarkable is that some [imaps] connections (we named the wrapped IMAP service "imaps") still seem to remain active (6 left). The last entry is always the waitpid() one, and to rule out the while loop around waitpid() in stunnel.c, I added an extra "DONE" log entry to the source that is written whenever the routine is left, so it is not that easy.
The current workaround is a script that monitors stunnel's CPU usage and kills/restarts it if it stays above 80% for an extended period, but that can only be a temporary solution.
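For completeness, a sketch of what that watchdog looks like, meant to be run from cron every minute (the pgrep/ps invocations, paths, threshold, and sample limit here are simplified assumptions, not my exact production script):

```shell
#!/bin/sh
# One-shot watchdog sketch: restart stunnel after its %CPU stays above
# THRESHOLD for LIMIT consecutive cron runs. All names/paths are assumptions.
THRESHOLD=80                          # %CPU considered "spinning"
LIMIT=3                               # consecutive bad samples before restart
STATE=/var/tmp/stunnel_watchdog.count # counter persisted between runs

pid=`pgrep -x stunnel | head -1`
count=`cat "$STATE" 2>/dev/null || echo 0`

if [ -n "$pid" ]; then
    # integer part of %CPU as reported by ps
    cpu=`ps -o pcpu= -p "$pid" | cut -d. -f1 | tr -d ' '`
    if [ "${cpu:-0}" -gt "$THRESHOLD" ]; then
        count=`expr $count + 1`
    else
        count=0
    fi
    if [ "$count" -ge "$LIMIT" ]; then
        kill "$pid"
        sleep 2
        /usr/local/bin/stunnel /usr/local/etc/stunnel/stunnel.conf
        count=0
    fi
else
    count=0
fi
echo "$count" > "$STATE"
```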
Any ideas what to try, or how to debug and locate the reason for the problem?
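If it helps, I can capture stack samples the next time it spins, along these lines (a sketch, assuming the Solaris pstack utility and that the main stunnel process is the one spinning):

```shell
#!/bin/sh
# Sample the spinning process a few times so the hot code path shows up
# repeatedly in the output. Process name and output path are assumptions.
pid=`pgrep -x stunnel | head -1`
out=/var/tmp/stunnel_pstack.log
: > "$out"
if [ -n "$pid" ] && command -v pstack >/dev/null 2>&1; then
    i=0
    while [ "$i" -lt 5 ]; do
        pstack "$pid" >> "$out"
        echo "---" >> "$out"
        sleep 1
        i=`expr $i + 1`
    done
fi
```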
Thanks,
Erik.
P.S. Experimenting with the timeouts does not alter the phenomenon.
P.P.S. Here are the active lines of the config file (comments and unused lines removed to reduce clutter):
cert = /usr/local/etc/stunnel/stunnel.pem
sslVersion = all

chroot = /var/lib/stunnel/
setuid = nobody
setgid = nogroup

pid = /stunnel.pid

socket = l:TCP_NODELAY=1
socket = r:TCP_NODELAY=1

[imaps]
accept = 14310
connect = 143
local = 127.0.0.1