Hello,
We are using stunnel 5.71 on Solaris 10 to tunnel NFS on more than 50 systems.
In the past we saw stunnel dumping cores without any errors or hints in the stunnel log file.
It was believed to be an OpenSSL 1.x problem.
We got an stunnel version with SSL 2.x and 3.x but we had to go back to 1.x because of other problems.
We noticed that the dumps happened at regular times and found a script that made NFS requests every 10 minutes.
After changing the script so that it makes fewer NFS requests, stunnel was stable. It looked as if the problem accumulated over time and a counter or something similar reached its limit, which then resulted in the core dump.
Then the core dumps started again. They still seem to be related to the number of requests, because the problem occurs on the two NFS servers with the highest number of connections.
This time, we see errors in the stunnel log. The error varies slightly:
2023.10.12 22:05:43 LOG3[10745]: SSL_read: s3_pkt.c:534: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac
// There were also a few cases with an additional "internal error", e.g.:
2023.10.16 00:23:53 LOG5[2968]: Service [tls-nfs-srv] connected remote server from 127.0.0.1:38270
2023.10.16 00:40:19 LOG3[2968]: SSL_read: s3_pkt.c:534: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac
2023.10.16 00:40:19 LOG3[2967]: SSL_read: s3_pkt.c:534: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac
2023.10.16 00:40:19 LOG5[2968]: Connection reset: 2476 byte(s) sent to TLS, 3324 byte(s) sent to socket
2023.10.16 00:40:19 LOG5[2967]: Connection reset: 2500 byte(s) sent to TLS, 3100 byte(s) sent to socket
INTERNAL ERROR: Bad magic at ssl.c, line 192
2023.10.16 00:40:20 LOG6[ui]: Initializing inetd mode configuration
// or
2023.10.23 07:33:53 LOG5[62]: Service [tls-nfs-srv] connected remote server from 127.0.0.1:50626
2023.10.23 07:38:37 LOG3[62]: SSL_read: s3_pkt.c:534: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac
2023.10.23 07:38:37 LOG5[62]: Connection reset: 364 byte(s) sent to TLS, 336 byte(s) sent to socket
...
2023.10.23 07:44:37 LOG5[61]: Connection reset: 1704 byte(s) sent to TLS, 2360 byte(s) sent to socket
INTERNAL ERROR: Bad magic at OpenSSL, line 0
// or
2023.10.24 08:20:59 LOG5[52]: Service [tls-nfs-srv] connected remote server from 127.0.0.1:43966
2023.10.24 08:56:13 LOG6[52]: Read socket closed (readsocket)
...
2023.10.24 09:00:01 LOG5[53]: Service [tls-nfs-srv] accepted connection from 1.2.3.4:59973
2023.10.24 09:00:01 LOG6[53]: Peer certificate not required
2023.10.24 09:00:01 LOG3[49]: SSL_read: s3_pkt.c:534: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac
2023.10.24 09:00:01 LOG5[49]: Connection reset: 7112 byte(s) sent to TLS, 9188 byte(s) sent to socket
INTERNAL ERROR: Dead canary at OpenSSL, line 0
I have attached a file with more details.
Although this looks like an OpenSSL problem at first glance, we think it could still be a problem with stunnel.
Can you make more out of this data?
Do you know a way to log more information about the connections?
For example, not all requests were accepted. What happened here?
2023.11.24 02:50:13 LOG7[359]: 354 server accept(s) requested
2023.11.24 02:50:13 LOG7[359]: 351 server accept(s) succeeded
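To make the gap between requested and succeeded accepts easier to track over time, here is a minimal Python sketch that pulls these two counters out of the log and reports the difference. It assumes the log line format shown above; the function name `failed_accepts` and the regular expression are only an illustration and are not part of stunnel.

```python
import re

# Matches stunnel log lines such as:
# 2023.11.24 02:50:13 LOG7[359]: 354 server accept(s) requested
# (pattern is an assumption based on the two sample lines above)
ACCEPT_RE = re.compile(
    r"^(\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}) LOG\d\[(\w+)\]: "
    r"(\d+) server accept\(s\) (requested|succeeded)$"
)


def failed_accepts(lines):
    """Return {(timestamp, thread_id): requested - succeeded}."""
    counters = {}
    for line in lines:
        m = ACCEPT_RE.match(line.strip())
        if m is None:
            continue  # not an accept-counter line
        ts, tid, count, kind = m.groups()
        counters.setdefault((ts, tid), {})[kind] = int(count)
    # Only report entries where both counters were logged
    return {
        key: c["requested"] - c["succeeded"]
        for key, c in counters.items()
        if "requested" in c and "succeeded" in c
    }


sample = [
    "2023.11.24 02:50:13 LOG7[359]: 354 server accept(s) requested",
    "2023.11.24 02:50:13 LOG7[359]: 351 server accept(s) succeeded",
]
result = failed_accepts(sample)
print(result)
```

For the two lines above this reports a difference of 3 accepts that did not succeed.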
Thank you in advance.
Best regards
Sasha