Hi there,
I had been running stunnel successfully for many years (4.26 with an older OpenSSL), wrapping HTTP and IMAP, and upgraded for security reasons to OpenSSL 1.1.1 with stunnel 5.59.
The platform is somewhat exotic: SPARC Solaris 10 using gcc 3.4.6.
stunnel works for IMAP and HTTPS without problems for some time, but after heavy load conditions it starts consuming 100% CPU, although it keeps working.
In this case it is not the DH computation, as no ciphers depending on DH are in use:
2022.02.06 11:56:33 LOG6[ui]: DH initialization skipped: no DH ciphersuites
On the live machine it happens e.g. after some script-based port-scan attacks, and looking for a way to trigger the phenomenon I found that running a cipher enumeration, for example, puts stunnel into the "high-load" state. With nmap 7.80, the command doing the job is
nmap --script ssl-enum-ciphers -p PORT SERVER-IP
Checking the log file (level 7), the end of the log after such a burst of activity, with stunnel ending up at 100% CPU load, looks like this:
2022.02.06 14:23:17 LOG7[39]: Deallocating application specific data for session connect address
2022.02.06 14:23:17 LOG7[39]: linger (local): Invalid argument (22)
2022.02.06 14:23:17 LOG7[39]: Local descriptor (FD=22) closed
2022.02.06 14:23:17 LOG7[39]: Service [imaps] finished (6 left)
2022.02.06 14:23:14 LOG7[main]: FD=4 events=0x1 revents=0x1
2022.02.06 14:23:17 LOG7[main]: FD=11 events=0x1 revents=0x0
2022.02.06 14:23:17 LOG7[main]: FD=12 events=0x1 revents=0x0
2022.02.06 14:23:17 LOG7[main]: FD=13 events=0x1 revents=0x0
2022.02.06 14:23:17 LOG7[main]: FD=14 events=0x1 revents=0x0
2022.02.06 14:23:17 LOG7[main]: Dispatching a signal from the signal pipe
2022.02.06 14:23:17 LOG7[main]: Processing SIGCHLD
2022.02.06 14:23:17 LOG7[main]: Retrieving pid statuses with waitpid()
2022.02.06 14:23:17 LOG7[main]: Retrieving pid statuses with waitpid() - DONE
Remarkable is that some [imaps] connections (we named the wrapped IMAP service "imaps") still seem to remain active (6 left). The last entry is always the waitpid() one, and to rule out the while loop around waitpid() in stunnel.c, I added an extra "DONE" log entry to the source that is written whenever the routine is left, so it is not that easy.
The current workaround is a script that monitors stunnel's CPU usage and kills/restarts it if it stays above 80% for an extended period, but that can only be a temporary solution.
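For completeness, a sketch of what that watchdog looks like, meant to be run from cron every minute (the pgrep/ps invocations, paths, threshold, and sample limit here are simplified assumptions, not my exact production script):

```shell
#!/bin/sh
# One-shot watchdog sketch: restart stunnel after its %CPU stays above
# THRESHOLD for LIMIT consecutive cron runs. All names/paths are assumptions.
THRESHOLD=80                          # %CPU considered "spinning"
LIMIT=3                               # consecutive bad samples before restart
STATE=/var/tmp/stunnel_watchdog.count # counter persisted between runs

pid=`pgrep -x stunnel | head -1`
count=`cat "$STATE" 2>/dev/null || echo 0`

if [ -n "$pid" ]; then
    # integer part of %CPU as reported by ps
    cpu=`ps -o pcpu= -p "$pid" | cut -d. -f1 | tr -d ' '`
    if [ "${cpu:-0}" -gt "$THRESHOLD" ]; then
        count=`expr $count + 1`
    else
        count=0
    fi
    if [ "$count" -ge "$LIMIT" ]; then
        kill "$pid"
        sleep 2
        /usr/local/bin/stunnel /usr/local/etc/stunnel/stunnel.conf
        count=0
    fi
else
    count=0
fi
echo "$count" > "$STATE"
```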
Any ideas what to try, or how to debug and locate the reason for the problem?
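If it helps, I can capture stack samples the next time it spins, along these lines (a sketch, assuming the Solaris pstack utility and that the main stunnel process is the one spinning):

```shell
#!/bin/sh
# Sample the spinning process a few times so the hot code path shows up
# repeatedly in the output. Process name and output path are assumptions.
pid=`pgrep -x stunnel | head -1`
out=/var/tmp/stunnel_pstack.log
: > "$out"
if [ -n "$pid" ] && command -v pstack >/dev/null 2>&1; then
    i=0
    while [ "$i" -lt 5 ]; do
        pstack "$pid" >> "$out"
        echo "---" >> "$out"
        sleep 1
        i=`expr $i + 1`
    done
fi
```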
Thanks,
Erik.
P.S. Experimenting with the timeouts does not alter the phenomenon.
P.P.S. Here are the active lines of the config file (comments and unused lines removed to reduce clutter):
cert = /usr/local/etc/stunnel/stunnel.pem
sslVersion = all

chroot = /var/lib/stunnel/
setuid = nobody
setgid = nogroup

pid = /stunnel.pid

socket = l:TCP_NODELAY=1
socket = r:TCP_NODELAY=1

[imaps]
accept = 14310
connect = 143
local = 127.0.0.1