I have the following kind of test environment: SSL clients call a public IP address, from which the calls are forwarded to a Linux server running Stunnel. The Linux server is in a private network. Stunnel decrypts the data and sends it forward.
I have been testing this with a browser, monitoring the traffic on the server to see that Stunnel forwards the calls. Some strange things happen there that I can't explain. First of all: sometimes the calls go through the server as expected, and sometimes the server doesn't respond to the client in any way. If I have two terminal windows open, one with tcpdump and another with tail -f stunnel.log, nothing shows up in the log in spite of the incoming connection attempts.
Then on other occasions when the calls come to the server, it forwards them beautifully to the address and port set in the configuration file.
Does anyone have any clue what this could be due to? I haven't been able to explain why it sometimes works and sometimes doesn't.
Another thing that bothers me is that sometimes there are TCP frames with an incorrect checksum. I've monitored with Ethereal and tcpdump. Both show incorrect frames, and they are always from the Stunnel end of the connection. What could be the cause of those broken frames?
Tommi Nieminen
--------------------------------------------------------
My stunnel.conf looks like this: (any constructive criticism would be welcome :-))

CAfile = /home/tommi/cert/7/demoCA/cacert.pem
cert = /home/tommi/cert/7/newcert.pem
key = /home/tommi/cert/7/newkey.pem

socket = l:TCP_NODELAY=1
socket = r:TCP_NODELAY=1

output = /var/log/stunnel/stunnel.log
pid = /var/run/stunnel/stunnel.pid
debug = 7
client = no

[https]
accept = 443
connect = 192.168.10.17:5010
TIMEOUTclose = 0
--------------------------------------------------------
A successful connection from stunnel.log:
2006.10.09 18:01:43 LOG7[11889:3083744960]: https accepted FD=7 from 131.177.254.92:2689
2006.10.09 18:01:43 LOG7[11889:3083742128]: https started
2006.10.09 18:01:43 LOG7[11889:3083742128]: FD 7 in non-blocking mode
2006.10.09 18:01:43 LOG7[11889:3083742128]: TCP_NODELAY option set on local socket
2006.10.09 18:01:43 LOG5[11889:3083742128]: https connected from 131.177.254.92:2689
2006.10.09 18:01:43 LOG7[11889:3083742128]: SSL state (accept): before/accept initialization
2006.10.09 18:01:43 LOG7[11889:3083742128]: SSL state (accept): SSLv3 read client hello A
2006.10.09 18:01:43 LOG7[11889:3083742128]: SSL state (accept): SSLv3 write server hello A
2006.10.09 18:01:43 LOG7[11889:3083742128]: SSL state (accept): SSLv3 write certificate A
2006.10.09 18:01:43 LOG7[11889:3083742128]: SSL state (accept): SSLv3 write server done A
2006.10.09 18:01:43 LOG7[11889:3083742128]: SSL state (accept): SSLv3 flush data
2006.10.09 18:01:43 LOG7[11889:3083742128]: SSL state (accept): SSLv3 read client key exchange A
2006.10.09 18:01:43 LOG7[11889:3083742128]: SSL state (accept): SSLv3 read finished A
2006.10.09 18:01:43 LOG7[11889:3083742128]: SSL state (accept): SSLv3 write change cipher spec A
2006.10.09 18:01:43 LOG7[11889:3083742128]: SSL state (accept): SSLv3 write finished A
2006.10.09 18:01:43 LOG7[11889:3083742128]: SSL state (accept): SSLv3 flush data
2006.10.09 18:01:43 LOG7[11889:3083742128]: 2 items in the session cache
2006.10.09 18:01:43 LOG7[11889:3083742128]: 0 client connects (SSL_connect())
2006.10.09 18:01:43 LOG7[11889:3083742128]: 0 client connects that finished
2006.10.09 18:01:43 LOG7[11889:3083742128]: 0 client renegotiations requested
2006.10.09 18:01:43 LOG7[11889:3083742128]: 8 server connects (SSL_accept())
2006.10.09 18:01:43 LOG7[11889:3083742128]: 7 server connects that finished
2006.10.09 18:01:43 LOG7[11889:3083742128]: 0 server renegotiations requested
2006.10.09 18:01:43 LOG7[11889:3083742128]: 4 session cache hits
2006.10.09 18:01:43 LOG7[11889:3083742128]: 1 session cache misses
2006.10.09 18:01:43 LOG7[11889:3083742128]: 1 session cache timeouts
2006.10.09 18:01:43 LOG6[11889:3083742128]: SSL accepted: new session negotiated
2006.10.09 18:01:43 LOG6[11889:3083742128]: Negotiated ciphers: AES256-SHA SSLv3 Kx=RSA Au=RSA Enc=AES(256) Mac=SHA1
2006.10.09 18:01:43 LOG7[11889:3083742128]: FD 8 in non-blocking mode
2006.10.09 18:01:43 LOG7[11889:3083742128]: https connecting 192.168.10.17:5010
2006.10.09 18:01:43 LOG7[11889:3083742128]: connect_wait: waiting 10 seconds
2006.10.09 18:01:43 LOG7[11889:3083742128]: connect_wait: connected
2006.10.09 18:01:43 LOG7[11889:3083742128]: Remote FD=8 initialized
2006.10.09 18:01:43 LOG7[11889:3083742128]: TCP_NODELAY option set on remote socket
2006.10.09 18:02:45 LOG7[11889:3083742128]: Socket closed on read
2006.10.09 18:02:45 LOG7[11889:3083742128]: SSL write shutdown
2006.10.09 18:02:45 LOG7[11889:3083742128]: SSL alert (write): warning: close notify
2006.10.09 18:02:45 LOG7[11889:3083742128]: SSL_shutdown retrying
2006.10.09 18:02:45 LOG7[11889:3083742128]: SSL doesn't need to read or write
2006.10.09 18:02:45 LOG6[11889:3083742128]: s_poll_wait timeout: connection close
2006.10.09 18:02:45 LOG5[11889:3083742128]: Connection closed: 0 bytes sent to SSL, 405 bytes sent to socket
2006.10.09 18:02:45 LOG7[11889:3083742128]: https finished (0 left)
(The failed connections leave no mark in the log at all, but tcpdump sees the attempts on the server.)
--------------------------------------------------------
A sample of tcpdump's output with incorrect checksum:
18:02:45.150656 IP (tos 0x0, ttl 64, id 11218, offset 0, flags [DF], proto: TCP (6), length: 77) 192.168.20.18.https > 131.177.254.92.2689: P, cksum 0x5708 (incorrect (-> 0xf1bf), 1035:1072(37) ack 756 win 7504
Another thing that bothers me is that sometimes there are TCP frames with an incorrect checksum. I've monitored with Ethereal and tcpdump. Both show incorrect frames, and they are always from the Stunnel end of the connection. What could be the cause of those broken frames?
I was talking to a friend of mine about these checksum errors. He had seen them too, but only if Ethereal was running on the same server as the SSL software (he had used different SSL software, not Stunnel). If he used an external PC for monitoring, then Ethereal showed no checksum errors. So it seems that, for some reason, Ethereal should not be run on the same server where the SSL software is running. I'll test this myself once I get a chance.
This probably answers one of my questions. The other one remains: sometimes the server doesn't respond to connections. Any ideas?
Tommi
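A note on the checksum question: a commonly cited cause of "incorrect" checksums that show up only on outgoing packets captured on the sending host is checksum offloading, where the network card fills in the checksum after the point at which the capture tool sees the packet. Whether that applies to this server is an assumption; the thread does not confirm it. For reference, here is a minimal Python sketch of the 16-bit ones'-complement sum that TCP checksums are built on. The function name and the sample bytes are illustrative only, and a real TCP checksum also covers a pseudo-header with the IP addresses, which is omitted here.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones'-complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into 16 bits
    return ~total & 0xFFFF


# Illustrative bytes only, not taken from the capture in this thread.
segment = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(segment)

# A receiver sums the data together with the transmitted checksum;
# for an intact segment the result is 0.
residue = internet_checksum(segment + checksum.to_bytes(2, "big"))
```

A capture tool performs the same kind of verification, so a packet seen before the hardware has written the final checksum would be reported as incorrect even though it leaves the wire intact.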
What does the tcpdump indicate? Are the failed connections getting dropped or reset on the computer that's forwarding, or are they actually arriving at the stunnel server? If they make it to the stunnel server, what does tcpdump indicate at that connection point?

Pete

----- Original Message -----
From: "Tommi Nieminen" ttn@mbnet.fi
To: stunnel-users@mirt.net
Sent: Monday, October 09, 2006 9:40 AM
Subject: [stunnel-users] Connection problems and TCP frame checksum errors
stunnel-users mailing list stunnel-users@mirt.net http://stunnel.mirt.net/mailman/listinfo/stunnel-users
What does the tcpdump indicate? Are the failed connections getting dropped or reset on the computer that's forwarding or are they actually arriving at the stunnel server? If they make it to the stunnel server what does tcpdump indicate at that connection point.
The connections are actually forwarded by a router, not a computer.
The connections arrive at the stunnel server. The following is the tcpdump from the stunnel server. All the traffic of a failed connection is there. After about 20 seconds Seamonkey gives up saying "Network Error".
I've added empty lines to make the text a bit more legible.
----------------------------------------------------------------
14:57:07.990693 IP (tos 0x20, ttl 116, id 62395, offset 0, flags [DF], proto: TCP (6), length: 48) 131.177.254.92.3792 > 192.168.20.18.https: S, cksum 0x5509 (correct), 1333491727:1333491727(0) win 65535 <mss 1260,nop,nop,sackOK>

14:57:10.906554 IP (tos 0x20, ttl 116, id 62429, offset 0, flags [DF], proto: TCP (6), length: 48) 131.177.254.92.3792 > 192.168.20.18.https: S, cksum 0x5509 (correct), 1333491727:1333491727(0) win 65535 <mss 1260,nop,nop,sackOK>

14:57:16.916385 IP (tos 0x20, ttl 116, id 62499, offset 0, flags [DF], proto: TCP (6), length: 48) 131.177.254.92.3792 > 192.168.20.18.https: S, cksum 0x5509 (correct), 1333491727:1333491727(0) win 65535 <mss 1260,nop,nop,sackOK>
----------------------------------------------------------------
As you can see, there is nothing coming back from the server. And since tcpdump saw the incoming call, stunnel should see it too. They are on the same machine.
It's so strange: at one time I connect to the server, and it forwards the traffic just the way it should. Then, quite inexplicably, it just won't do it... and then it forwards it again. I have no clue what makes it stop working and then start working again. I don't need to restart the server, and I'm not changing anything. It's as if there were some kind of internal timer, but that doesn't make any sense. And there has been only one connection attempt at a time, so it can't be excess traffic either.
Tommi
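The three SYN lines in the capture carry the same sequence number (1333491727), which suggests the client is retransmitting a SYN that was never answered. A small Python sketch, using only the timestamps copied from the capture, computes the gaps between the attempts; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps of the three SYN packets, copied from the tcpdump output above.
syn_times = ["14:57:07.990693", "14:57:10.906554", "14:57:16.916385"]

parsed = [datetime.strptime(t, "%H:%M:%S.%f") for t in syn_times]

# Seconds elapsed between each SYN and the one before it.
gaps = [(later - earlier).total_seconds() for earlier, later in zip(parsed, parsed[1:])]

for gap in gaps:
    print(f"{gap:.3f} s since previous SYN")
```

The gaps come out at roughly 3 and 6 seconds, a doubling pattern that would be consistent with a client TCP stack retrying an unanswered SYN, although the capture alone does not prove where the reply was lost.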
The SYN packets look ok to me.
A few things to do if you have not done so:
1. "netstat -an" - to make sure stunnel is listening on the correct interface and port
2. Does "lastcomm stunnel" show anything useful? If you don't use threads, a new stunnel process starts with each connection.
3. Just a guess, but remove the socket entries in the config file; maybe they are causing a problem. I don't use them, but maybe there is a good reason to use them.
4. Try connecting directly to the stunnel box (no router). Does that always work?
5. maybe the NIC card is flaky
6. run "stunnel -version" to verify all is configured as you think.
That's all I can think of at the moment.
pete
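For suggestion 4, an intermittent failure is easier to judge with repeated attempts than with a single browser request. The following is a minimal Python sketch of such a check, not something used in this thread; the `probe` helper and its parameters are hypothetical. It only tests whether the TCP handshake completes and does not speak SSL.

```python
import socket


def probe(host: str, port: int, attempts: int = 10, timeout: float = 3.0):
    """Try to open a TCP connection `attempts` times; return (successes, failures)."""
    ok = failed = 0
    for _ in range(attempts):
        try:
            # create_connection completes the TCP handshake or raises OSError
            # (refused, unreachable, or timed out).
            with socket.create_connection((host, port), timeout=timeout):
                ok += 1
        except OSError:
            failed += 1
    return ok, failed
```

Calling `probe("192.168.20.18", 443)` once from behind the router and once from the same subnet as the stunnel box would give two success counts to compare, assuming 192.168.20.18 (the address seen in the capture) is the stunnel server.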
----- Original Message -----
From: "Tommi Nieminen" ttn@mbnet.fi
To: "Peter" pslists@warren-selbert.com
Cc: stunnel-users@mirt.net
Sent: Wednesday, October 11, 2006 5:55 AM
Subject: Re: [stunnel-users] Connection problems and TCP frame checksum errors
Hi Peter,
thanks for all your suggestions. They were really helpful in bringing me to the solution of the problem.
- "netstat -an" - to make sure stunnel is listening on the correct interface and port
This was OK.
- does "lastcomm stunnel" show anything useful? If you don't use threads a new stunnel process starts with each connection.
This showed nothing useful.
- just a guess but remove the socket entries in the config file - maybe they are causing a problem. I don't use them but maybe there is a good reason to use them.
The socket entries were there because they were in the original config file which I edited for my purposes. They seemed ok to me so I left them in my config when I began experimenting with stunnel. Commenting them out didn't make any difference for this problem.
- try connecting directly to the stunnel box (no router). does that always work
Maybe not always, but remarkably better!!!
- maybe the NIC card is flaky
The card had worked just fine until then, so I didn't really believe in this. I thought I'd save this for the last.
- run "stunnel -version" to verify all is configured as you think.
Seems all right.
So what the heck could the problem be? It took me a long time to figure out the answer. The fact that almost all connection attempts succeeded when the router was left out of the picture would suggest a problem with the router configuration. But no, the router was correctly configured. Instead, the routing tables of the Linux workstation were not right! That's a problem I've hardly ever had to deal with (and therefore a subject I don't understand well enough), so it took some experimenting to get the routing tables right. Now it looks good. I still can't explain why the original routing tables sometimes worked and sometimes didn't, but I'll study the subject :-)
Tommi
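The thread does not say what exactly was wrong with the routing tables, so the following is only a sketch of the general mechanism, using a hypothetical table. For each reply, the kernel picks the most specific route that matches the destination address; if that route points at the wrong gateway, the reply may never reach the client even though the SYN arrived, which would look like the silent failures described earlier. The gateway addresses and the `select_route` helper are assumptions for illustration.

```python
import ipaddress

# Hypothetical routing table as (destination network, gateway) pairs.
# None of these entries come from the thread.
ROUTES = [
    ("0.0.0.0/0", "192.168.20.1"),          # default route via the router
    ("192.168.10.0/24", "192.168.20.254"),  # backend network via another gateway
    ("192.168.20.0/24", None),              # directly connected network
]


def select_route(destination: str, routes=ROUTES):
    """Return the (network, gateway) entry with the longest matching prefix."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(net), gw)
        for net, gw in routes
        if addr in ipaddress.ip_network(net)
    ]
    if not matches:
        return None  # no route to the destination at all
    return max(matches, key=lambda entry: entry[0].prefixlen)
```

With this table, a reply to the client at 131.177.254.92 matches only the default route, so a wrong or missing default gateway would affect exactly the connections that arrive through the router while leaving same-subnet tests untouched.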
That's great news that you're up and running. Good detective work on your part. Glad I was able to offer some help.
Pete
----- Original Message -----
From: "Tommi Nieminen" ttn@mbnet.fi
To: "Peter" pslists@warren-selbert.com
Cc: stunnel-users@mirt.net
Sent: Saturday, October 21, 2006 2:11 AM
Subject: Re: [stunnel-users] Connection problems and TCP frame checksum errors