I'm using Stunnel in our environment to tunnel our ZooKeeper connections. It works reasonably well as long as ZooKeeper is running properly on our ZooKeeper servers. Here are our server and client configs:
Server:
cert = /etc/stunnel/zookeeper.pem
key = /etc/stunnel/zookeeper.key
CAfile = /etc/stunnel/zookeeper_ca.pem
verify = 2
sslVersion = TLSv1
setuid = stunnel4
setgid = stunnel4
pid = /var/lib/stunnel4/zookeeper.stunnel4.pid
socket = l:TCP_NODELAY=1
socket = r:TCP_NODELAY=1
TIMEOUTconnect = 2
debug = 4

[zookeeper]
accept = 0.0.0.0:2182
failover = rr
connect = 127.0.0.1:2181
Client:
cert = /etc/stunnel/zookeeper.pem
key = /etc/stunnel/zookeeper.key
CAfile = /etc/stunnel/zookeeper_ca.pem
verify = 2
sslVersion = TLSv1
client = yes
setuid = stunnel4
setgid = stunnel4
pid = /var/lib/stunnel4/zookeeper.stunnel4.pid
socket = l:TCP_NODELAY=1
socket = r:TCP_NODELAY=1
TIMEOUTconnect = 2
debug = 4

[zookeeper]
accept = 127.0.0.1:2182
failover = rr
connect = zookeeper1:2182
connect = zookeeper2:2182
connect = zookeeper3:2182
The intended behavior is that if 'zookeeper1', 'zookeeper2', or 'zookeeper3' goes down, the client re-establishes a connection with one of the other servers. However, that doesn't seem to be happening. Instead, when the connection dies, Stunnel seems to go into a loop, repeatedly trying to reconnect to the original server:
Feb 22 21:41:24 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
Feb 22 21:41:25 staging-i-123.xxx stunnel: LOG3[10717:3073788784]: connect_blocking: getsockopt 123.123.123.123:2182: Connection refused (111)
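To make the expectation concrete, here is a minimal Python sketch of the round-robin failover I had in mind: on a refused connection, move on to the next server in the list instead of retrying the same one. It only illustrates the behavior I'm expecting and is not a claim about how Stunnel implements `failover = rr`. The function name `connect_with_failover` and its parameters are made up for this example.

```python
import socket


def connect_with_failover(backends, start=0, timeout=2.0,
                          connect=socket.create_connection):
    """Try each backend once, round-robin from `start`.

    Hypothetical helper for illustration only. Returns (connection, index)
    for the first backend that accepts, or raises ConnectionError if every
    backend fails.
    """
    last_error = None
    for offset in range(len(backends)):
        index = (start + offset) % len(backends)
        try:
            # A refused or timed-out connect raises an OSError subclass.
            return connect(backends[index], timeout), index
        except OSError as exc:
            last_error = exc  # remember the failure and try the next backend
    raise ConnectionError(f"no backend accepted the connection: {last_error}")
```

With the client config above, this would be called with `[("zookeeper1", 2182), ("zookeeper2", 2182), ("zookeeper3", 2182)]`. The `connect` parameter exists only so the function can be exercised without real servers.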
What's missing in my config to make the failover work properly?
—Matt