I'm using Stunnel in our environment to tunnel our ZooKeeper connections... it works reasonably well as long as ZooKeeper is running on our ZooKeeper servers properly. Here's an example of our server and client configs:
Server:
cert = /etc/stunnel/zookeeper.pem
key = /etc/stunnel/zookeeper.key
CAfile = /etc/stunnel/zookeeper_ca.pem
verify = 2
sslVersion = TLSv1
setuid = stunnel4
setgid = stunnel4
pid = /var/lib/stunnel4/zookeeper.stunnel4.pid
socket = l:TCP_NODELAY=1
socket = r:TCP_NODELAY=1
TIMEOUTconnect = 2
debug = 4

[zookeeper]
accept = 0.0.0.0:2182
failover = rr
connect = 127.0.0.1:2181
Client:
cert = /etc/stunnel/zookeeper.pem
key = /etc/stunnel/zookeeper.key
CAfile = /etc/stunnel/zookeeper_ca.pem
verify = 2
sslVersion = TLSv1
client = yes
setuid = stunnel4
setgid = stunnel4
pid = /var/lib/stunnel4/zookeeper.stunnel4.pid
socket = l:TCP_NODELAY=1
socket = r:TCP_NODELAY=1
TIMEOUTconnect = 2
debug = 4

[zookeeper]
accept = 127.0.0.1:2182
failover = rr
connect = zookeeper1:2182
connect = zookeeper2:2182
connect = zookeeper3:2182
Now the intended behavior is that if 'zookeeper1', 'zookeeper2', or 'zookeeper3' goes down, the client will re-establish a connection with one of the other servers. However, that doesn't seem to be happening. Instead, when the connection dies, Stunnel seems to go into a tight loop trying to reconnect to the original server:
Feb 22 21:41:24 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
What's missing in my config to make the failover work properly?
—Matt
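For readers following the thread, the failover behavior Matt expects can be sketched roughly as below. This is only an illustration of the round-robin idea, not stunnel's actual implementation; the function name `connect_round_robin` and the `try_connect` callback are hypothetical names introduced for this example.

```python
from typing import Callable, Optional, Sequence


def connect_round_robin(
    targets: Sequence[str],
    start: int,
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Try each target once, beginning at index `start` and wrapping around.

    Returns the first target that accepts the connection, or None if all fail.
    (Hypothetical helper for illustration; not part of stunnel.)
    """
    for offset in range(len(targets)):
        target = targets[(start + offset) % len(targets)]
        if try_connect(target):
            return target
    return None


# Simulate zookeeper1 being down: the attempt should move on to zookeeper2.
targets = ["zookeeper1:2182", "zookeeper2:2182", "zookeeper3:2182"]
down = {"zookeeper1:2182"}
chosen = connect_round_robin(targets, 0, lambda t: t not in down)
print(chosen)  # zookeeper2:2182
```

The log excerpt above shows the opposite pattern: every attempt goes to the same address, 123.123.123.123:2182, rather than moving on to the next `connect` target.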
Any thoughts on this? I'm still trying to figure out the behavior by reading the source code, but I'm not a big C programmer.. :/

On Feb 22, 2012, at 1:46 PM, Matt Wise wrote:
I'm using Stunnel in our environment to tunnel our ZooKeeper connections...
[...]
What's missing in my config to make the failover work properly?
Matt Wise wrote:
Any thoughts on this? I'm still trying to figure out the behavior by reading the source code, but I'm not a big C programmer.. :/
Feb 22 21:41:24 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
I tried really hard, but I couldn't reproduce your issue in my lab. For me it works as it is supposed to.
Please include the output of "stunnel -version". Maybe there is something specific to your OS or configuration.
Mike
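Since the issue could not be reproduced, one way to narrow it down independently of stunnel might be to check which of the `connect` targets accept plain TCP connections from the client host. The sketch below is an assumption about what could be useful here, not something suggested in the thread; `probe_targets` is a hypothetical helper, and the 2-second default simply mirrors `TIMEOUTconnect = 2` from the config.

```python
import socket
from typing import Dict, Iterable, Tuple


def probe_targets(
    targets: Iterable[Tuple[str, int]], timeout: float = 2.0
) -> Dict[Tuple[str, int], bool]:
    """Return, for each (host, port), whether a TCP connection could be opened.

    Only the TCP handshake is checked; no TLS negotiation is attempted.
    (Hypothetical helper for illustration.)
    """
    results: Dict[Tuple[str, int], bool] = {}
    for host, port in targets:
        try:
            # create_connection resolves the name and connects with a timeout.
            with socket.create_connection((host, port), timeout=timeout):
                results[(host, port)] = True
        except OSError:
            # Covers refused connections, timeouts, and name-resolution errors.
            results[(host, port)] = False
    return results
```

Calling `probe_targets([("zookeeper1", 2182), ("zookeeper2", 2182), ("zookeeper3", 2182)])` on the client machine would show whether the other two servers are reachable at the moment the log reports "Connection refused" for the first.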