I'm setting up a load-balanced service that requires stunnel with SSL on its front end. Two hosts behind the load balancer run stunnel. The service behind stunnel does not speak SSL. Every few minutes the load balancer checks whether those stunnels are still alive by opening a TCP connection to the stunnel listening port. The problem is that stunnel closes those test connections with RST, and the load balancer takes that to mean the host is dead. This is without the "client = yes" option, because the service does not speak SSL:
load-balancer -> stunnel-host TCP D=1234 S=33007 Syn
stunnel-host -> load-balancer TCP D=33007 S=1234 Syn Ack
load-balancer -> stunnel-host TCP D=1234 S=33007 Ack
load-balancer -> stunnel-host TCP D=1234 S=33007 Fin Ack
stunnel-host -> load-balancer TCP D=33007 S=1234 Ack
stunnel-host -> load-balancer TCP D=33007 S=1234 Rst
On the other hand, with "client = yes" everything works fine:
load-balancer -> stunnel-host TCP D=123 S=33010 Syn
stunnel-host -> load-balancer TCP D=33010 S=123 Syn Ack
load-balancer -> stunnel-host TCP D=123 S=33010 Ack
load-balancer -> stunnel-host TCP D=123 S=33010 Fin Ack
stunnel-host -> load-balancer TCP D=33010 S=123 Fin Ack
load-balancer -> stunnel-host TCP D=123 S=33010 Ack
Is there any way to make stunnel without "client = yes" close the connection the normal way, with FIN instead of RST?
stunnel is the latest version; the load balancer is some older F5 BigIP.
Thanks,
sergei
On 2/7/06, Michal Trojnara Michal.Trojnara@mobi-com.net wrote:
sergei wrote:
Is there any way to make stunnel without "client = yes" close the connection the normal way, with FIN instead of RST?
Stunnel resets connections for a reason. Probably it was reset by the other peer. Check your stunnel log files for details.
One reason I can think of is that the load balancer does not speak SSL and just tries to monitor the SSL-speaking stunnel by opening a TCP connection. It's just as if you telnet to the SSL-speaking end of stunnel and immediately close the connection. After receiving the FIN from you, stunnel will send an RST back. Telnet does not care, but this F5 BigIP does, and takes it as a failure, never mind that it was actually able to open the connection. On the other hand, Apache with mod_ssl, say, does not behave like that.
2006.02.07 11:03:15 LOG7[12097:0]: CONTEXT 1, FD=4, (IN)->()
2006.02.07 11:03:15 LOG7[12097:0]: CONTEXT 1, FD=6, (IN)->()
2006.02.07 11:03:15 LOG7[12097:0]: CONTEXT 1, FD=7, (IN)->(IN)
2006.02.07 11:03:15 LOG7[12097:1]: Context set: 135 (dropped) -> 1
2006.02.07 11:03:15 LOG7[12097:1]: Current context: 1
2006.02.07 11:03:15 LOG7[12097:1]: Releasing context 135
2006.02.07 11:03:15 LOG7[12097:1]: a_service accepted FD=0 from load_balancer:61681
2006.02.07 11:03:15 LOG7[12097:1]: Creating a new context
2006.02.07 11:03:15 LOG7[12097:1]: Context 136 created
2006.02.07 11:03:15 LOG7[12097:136]: Context swap: 1 -> 136
2006.02.07 11:03:15 LOG7[12097:136]: a_service started
2006.02.07 11:03:15 LOG7[12097:136]: FD 0 in non-blocking mode
2006.02.07 11:03:15 LOG5[12097:136]: a_service connected from load_balancer:61681
2006.02.07 11:03:15 LOG7[12097:136]: SSL state (accept): before/accept initialization
2006.02.07 11:03:15 LOG3[12097:136]: SSL_accept: Peer suddenly disconnected
2006.02.07 11:03:15 LOG7[12097:136]: a_service finished (0 left)
2006.02.07 11:03:15 LOG5[12097:136]: stack_info: size=65536, current=4348 (6%), maximum=10472 (15%)
2006.02.07 11:03:15 LOG7[12097:136]: Context 136 closed
2006.02.07 11:03:15 LOG7[12097:0]: Waiting -1 second(s) for 3 file descriptor(s)
2006.02.07 11:03:15 LOG7[12097:0]: CONTEXT 1, FD=4, (IN)->()
2006.02.07 11:03:15 LOG7[12097:0]: CONTEXT 1, FD=6, (IN)->(IN)
2006.02.07 11:03:15 LOG7[12097:0]: CONTEXT 1, FD=7, (IN)->()
2006.02.07 11:03:15 LOG7[12097:1]: Context set: 136 (dropped) -> 1
2006.02.07 11:03:15 LOG7[12097:1]: Current context: 1
2006.02.07 11:03:15 LOG7[12097:1]: Releasing context 136
2006.02.07 11:03:15 LOG7[12097:1]: snapws accepted FD=0 from load_balancer:61683
2006.02.07 11:03:15 LOG7[12097:1]: Creating a new context
2006.02.07 11:03:15 LOG7[12097:1]: Context 137 created
2006.02.07 11:03:15 LOG7[12097:137]: Context swap: 1 -> 137
2006.02.07 11:03:15 LOG7[12097:137]: snapws started
2006.02.07 11:03:15 LOG7[12097:137]: FD 0 in non-blocking mode
2006.02.07 11:03:15 LOG5[12097:137]: snapws connected from load_balancer:61683
2006.02.07 11:03:15 LOG7[12097:137]: SSL state (accept): before/accept initialization
2006.02.07 11:03:15 LOG3[12097:137]: SSL_accept: Peer suddenly disconnected
2006.02.07 11:03:15 LOG7[12097:137]: snapws finished (0 left)
2006.02.07 11:03:15 LOG5[12097:137]: stack_info: size=65536, current=4348 (6%), maximum=10472 (15%)
2006.02.07 11:03:15 LOG7[12097:137]: Context 137 closed
2006.02.07 11:03:15 LOG7[12097:0]: Waiting -1 second(s) for 3 file descriptor(s)
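The kind of check described above (open a TCP connection, send nothing, close, then see how the other side closes) can be sketched in C roughly as follows. This is only an illustration and not BIG-IP's actual monitor: the name `tcp_probe`, the `probe_result` enum, and the 5-second receive timeout are assumptions made for this example.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum probe_result {
    PROBE_CONNECT_FAILED = -1, /* bad address, socket() or connect() failed */
    PROBE_CLOSED_FIN = 0,      /* peer closed in an orderly way (FIN) */
    PROBE_CLOSED_RST = 1,      /* peer reset the connection (RST) */
    PROBE_OTHER = 2            /* timeout or some other read error */
};

/* Connect to ipv4:port, send our FIN without writing any data,
 * then report how the peer ends the connection. */
enum probe_result tcp_probe(const char *ipv4, unsigned short port)
{
    struct sockaddr_in addr;
    struct timeval tv;
    enum probe_result result;
    char buf[256];
    ssize_t n;
    int fd;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1)
        return PROBE_CONNECT_FAILED;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return PROBE_CONNECT_FAILED;
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return PROBE_CONNECT_FAILED;
    }

    /* Assumed timeout so the probe cannot block forever. */
    tv.tv_sec = 5;
    tv.tv_usec = 0;
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    /* Half-close: this sends our FIN, like the "Fin Ack" in the traces. */
    shutdown(fd, SHUT_WR);

    /* Discard anything the peer sends until it closes or resets. */
    do {
        n = read(fd, buf, sizeof(buf));
    } while (n > 0 || (n < 0 && errno == EINTR));

    if (n == 0)
        result = PROBE_CLOSED_FIN;
    else if (errno == ECONNRESET)
        result = PROBE_CLOSED_RST;
    else
        result = PROBE_OTHER;

    close(fd);
    return result;
}
```

Going by the traces earlier in the thread, a call such as `tcp_probe("2.2.2.140", 8843)` against an SSL-accepting stunnel port would be expected to return `PROBE_CLOSED_RST`, while a monitor that only cares whether the connection opened would treat any result other than `PROBE_CONNECT_FAILED` as healthy.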
sergei wrote:
SSL_accept: Peer suddenly disconnected
That's it. The client that was connecting to stunnel did not negotiate SSL, but closed the connection instead. Stunnel sent an RST packet to let its peers know about this problem.
In your case BIG-IP incorrectly assumed that a TCP RST on an established connection indicates a server problem. It is an obvious bug in the BIG-IP software.
Here is the workaround for you:
*** client.c.orig 2006-02-08 08:53:02.000782136 +0100
--- client.c 2006-02-08 08:53:12.000737865 +0100
***************
*** 1041,1053 ****
  }
  
  static void reset(int fd, char *txt) {
-     /* Set lingering on a socket if needed*/
-     struct linger l;
- 
-     l.l_onoff=1;
-     l.l_linger=0;
-     if(setsockopt(fd, SOL_SOCKET, SO_LINGER, (void *)&l, sizeof(l)))
-         log_error(LOG_DEBUG, get_last_socket_error(), txt);
  }
  
  /* End of client.c */
--- 1041,1046 ----
Best regards,
Mike
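The effect of the lines removed by this patch can be reproduced outside stunnel. The sketch below is a self-contained loopback experiment, not stunnel code: the function name `linger_demo` and its return convention are made up for this example. It closes an accepted socket either normally or with SO_LINGER set to on with a zero timeout, as the original `reset()` did, and reports what the connecting side observes. The behaviour described in the comments is what Linux does; other TCP stacks are expected to behave similarly, but that is an assumption.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 if the client side saw an orderly close (FIN),
 * 1 if it saw a reset (RST), and -1 on any setup error.
 * abortive != 0 applies the same SO_LINGER setting that the
 * patch above removes from reset(). */
int linger_demo(int abortive)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd = -1, cfd = -1, sfd = -1;
    int result = -1;
    char buf[1];
    ssize_t n;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */

    /* Listener standing in for the stunnel accept port. */
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0) goto out;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto out;
    if (listen(lfd, 1) < 0) goto out;
    if (getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) goto out;

    /* Client standing in for the load balancer's health check. */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0) goto out;
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto out;
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0) goto out;

    /* The client sends its FIN without writing any data. */
    if (shutdown(cfd, SHUT_WR) < 0) goto out;

    if (abortive) {
        struct linger l;
        l.l_onoff = 1;  /* lingering enabled ... */
        l.l_linger = 0; /* ... with zero timeout: close() aborts with RST */
        if (setsockopt(sfd, SOL_SOCKET, SO_LINGER, &l, sizeof(l)) < 0)
            goto out;
    }
    close(sfd); /* server side closes: FIN normally, RST if abortive */
    sfd = -1;

    /* What does the client see? */
    n = read(cfd, buf, sizeof(buf));
    if (n == 0)
        result = 0; /* end of file: orderly FIN */
    else if (n < 0 && errno == ECONNRESET)
        result = 1; /* connection reset: RST */

out:
    if (sfd >= 0) close(sfd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return result;
}
```

Under these assumptions, `linger_demo(1)` should return 1, matching the first packet trace in the thread, and `linger_demo(0)` should return 0, matching the behaviour after the patch is applied.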
On 2/7/06, Michal Trojnara Michal.Trojnara@mobi-com.net wrote:
sergei wrote:
SSL_accept: Peer suddenly disconnected
That's it. The client that was connecting to stunnel did not negotiate SSL, but closed the connection instead. Stunnel sent an RST packet to let its peers know about this problem.
In your case BIG-IP incorrectly assumed that a TCP RST on an established connection indicates a server problem. It is an obvious bug in the BIG-IP software.
Here is the workaround for you:
Thanks, Michal. That took care of those RSTs. I found some more quirks with that load balancer later, but it all works fine now.
sergei
I'm setting up load-balanced service requiring stunnel with SSL on its front end. There are two hosts behind load balancer running stunnel . Service behind stunnel does not speak SSL. Every few minutes load balancer checks if
...
If you send me the stunnel configuration as well as the relevant bigip.conf sections I can probably figure it out.
Is there any way to make stunnel without "client = yes" close the connection the normal way, with FIN instead of RST?
That's not Stunnel, it's the OS (Linux, etc.). Stunnel doesn't have an entire TCP/IP stack built into it; it relies on the OS to handle all that for it. It uses the accept/bind/close/connect system calls, and that's all it needs.
stunnel is the latest version; the load balancer is some older F5 BigIP.
Which version, specifically?
Brian
As I said, you can reproduce this with stunnel "client = no": telnet to the "accept" port and watch with tcpdump. As soon as you hit ^] and type "q" to close the connection, you will see an RST coming from stunnel.
I understand that TCP/IP is not part of stunnel. There's got to be some way to close() a socket and have the OS send RST.
It's a very old BigIP version, 3.3.1.
===== bigip.conf =====
pool appgen_1.1.1.69.8843 {
   lb_method least_conn
   member 2.2.2.140:8843 ratio 1 priority 1
   member 2.2.2.150:8843 ratio 1 priority 1
}
pool appgen_1.1.1.69.8844 {
   lb_method least_conn
   member 2.2.2.140:8844 ratio 1 priority 1
   member 2.2.2.150:8844 ratio 1 priority 1
}

vip 1.1.1.69:8843 unit 1 {
   netmask 255.255.255.0 broadcast 1.1.1.255
   use pool appgen_1.1.1.69.8843
}
vip 1.1.1.69:8844 unit 1 {
   netmask 255.255.255.0 broadcast 1.1.1.255
   use pool appgen_1.1.1.69.8844
}
=========== stunnel.conf ============
setuid = nobody
setgid = nogroup

CApath = /usr/local/etc/stunnel/certs
cert = /usr/local/etc/stunnel/cacert.pem
key = /usr/local/etc/stunnel/privkey-nopass.pem

debug = 2
output = /var/log/stunnel.log

client = no
verify = 1
delay = yes

[something1]
accept = 8843
connect = 127.0.0.1:11111
TIMEOUTclose = 0

[something2]
accept = 8844
connect = 127.0.0.1:22222
TIMEOUTclose = 0
As I said, you can reproduce this with stunnel "client = no": telnet to the "accept" port and watch with tcpdump. As soon as you hit ^] and type "q" to close the connection, you will see an RST coming from stunnel.
I understand that TCP/IP is not part of stunnel. There's got to be some way to close() a socket and have the OS send RST.
It's a very old BigIP version, 3.3.1.
Wow. Yep, that's old.
Version 9 has been out for more than a year. I have no doubt that there are bugs in 3.3.1.