First-time post (though long-time satisfied occasional user). Searched the archives but if there is anything relevant, I've obviously missed it.
I really cannot decide if this is an stunnel problem or an OpenSSL problem.
Client is remote and running stunnel 4.12 on i686-pc-linux-gnu UCONTEXT+POLL+IPv4 with OpenSSL 0.9.6i
Server is local, I've been through 4.12 + 0.9.6m 4.13 + 0.9.6m and now trying stunnel 4.13 on sparc-sun-solaris2.8 UCONTEXT+POLL+IPv4 with OpenSSL 0.9.7i (I've resisted the urge to try OpenSSL 0.9.8* as people seem to be experiencing 'issues')
Small amounts of data are fine. But for larger amounts the connection goes to sleep. The way I read it, the server end stops trying to write to the SSL side. The relevant parts of
# truss -w2 -v poll -dl /usr/local/stunnel-4.13/sbin/stunnel
look like this: (apologies for doing it like this, but it seemed the easiest way to combine the debug output and truss output)
(fd=7 is the SSL side, fd=8 the local side, TIMEOUTidle is set to 35 seconds for demo purposes)
/1: 15.5722 read(8, " A B\0\n03\007\t\t\r\n\t".., 8178) = 8178 /1: poll(0x00049DD8, 3, 35000) (sleeping...) /1: fd=3 ev=POLLIN rev=0 /1: fd=6 ev=POLLIN rev=0 /1: fd=7 ev=POLLIN|POLLOUT rev=POLLIN /1: 17.3166 poll(0x00049DD8, 3, 35000) = 1 /1: fd=3 ev=POLLIN rev=0 /1: fd=6 ev=POLLIN rev=0 /1: fd=7 ev=POLLIN|POLLOUT rev=POLLOUT /1: 17.3229 write(7, "1703\0\0188CBEC9 : O\t u".., 16442) = 16442 /1: 17.3234 poll(0x00049DD8, 4, 35000) = 1 /1: fd=3 ev=POLLIN rev=0 /1: fd=6 ev=POLLIN rev=0 /1: fd=8 ev=POLLIN rev=POLLIN /1: fd=7 ev=POLLIN rev=0
So far, so good.
/1: 17.3238 read(8, "\t\t\t\r\n\t\t\t\r\n\t\t".., 16384) = 16384 /1: 17.3240 poll(0x00049DD8, 3, 35000) = 1 /1: fd=3 ev=POLLIN rev=0 /1: fd=6 ev=POLLIN rev=0 /1: fd=7 ev=POLLIN|POLLOUT rev=POLLOUT /1: 17.3299 write(7, "1703\0\018ED 1AD91A888 i".., 16442) = 14480
For whatever reason, the write() was only partial. I'm guessing n/w issues, as the client is a loooong way away ;-)
/1: 17.3303 write(7, " fE3DDA8FE ! i8DD8\fEBEC".., 1962) Err#11 EAGAIN /1: 17.3308 write(2, 0x00071D60, 74) = 74 /1: 2 0 0 5 . 1 0 . 2 4 1 4 : 5 0 : 3 0 L O G 7 [ 2 0 6 4 7 : 2 /1: ] : S S L _ w r i t e r e t u r n e d W A N T _ W R I T E /1: : r e t r y i n g /1: 17.3312 write(2, "\n", 1) = 1 /1: 17.3315 read(7, 0x000778D8, 5) Err#11 EAGAIN /1: 17.3318 write(2, 0x00071D60, 72) = 72 /1: 2 0 0 5 . 1 0 . 2 4 1 4 : 5 0 : 3 0 L O G 7 [ 2 0 6 4 7 : 2 /1: ] : S S L _ r e a d r e t u r n e d W A N T _ R E A D : /1: r e t r y i n g /1: 17.3321 write(2, "\n", 1) = 1
So we've now done an SSL_write, which yielded SSL_ERROR_WANT_WRITE, _and_ an SSL_read, which yielded SSL_ERROR_WANT_READ, (and which I think removes the SSL_ERROR_WANT_WRITE condition, as we can only have one or the other.)
This means that, the next time round the loop (in client.c:transfer()), want_rd is TRUE, want_wr is FALSE, so we don't poll the SSL connection for output, only input:
/1: poll(0x00049DD8, 3, 35000) (sleeping...) /1: fd=3 ev=POLLIN rev=0 /1: fd=6 ev=POLLIN rev=0 /1: fd=7 ev=POLLIN rev=POLLOUT
But there won't be any input (trust me on that!), so we time out ...
/1: 52.3387 poll(0x00049DD8, 3, 35000) = 0 /1: fd=3 ev=POLLIN rev=0 /1: fd=6 ev=POLLIN rev=0 /1: fd=7 ev=POLLIN rev=0 /1: 52.3394 write(2, 0x00071D60, 72) = 72 /1: 2 0 0 5 . 1 0 . 2 4 1 4 : 5 1 : 0 5 L O G 6 [ 2 0 6 4 7 : 2 /1: ] : s _ p o l l _ w a i t t i m e o u t : c o n n e c t i /1: o n r e s e t /1: 52.3399 write(2, "\n", 1) = 1 /1: 52.3403 write(2, 0x00071DD0, 102) = 102 /1: 2 0 0 5 . 1 0 . 2 4 1 4 : 5 1 : 0 5 L O G 5 [ 2 0 6 4 7 : 2 /1: ] : C o n n e c t i o n r e s e t : 5 7 6 1 7 b y t e s /1: s e n t t o S S L , 1 3 1 b y t e s s e n t t o /1: s o c k e t
Now I have never before needed (nor wanted!) to delve into the inner workings of either stunnel or OpenSSL, so I may be in completely the wrong ballpark here. But I cannot understand the logic of the following bit of code in transfer() :-
if((c->sock_ptr && !want_rd) || /* socket input buffer not empty */ ssl_closing==CL_INIT /* need to send close_notify */ || ((c->ssl_ptr<BUFFSIZE || ssl_closing==CL_RETRY) && want_wr)) /* want to SSL_read or SSL_shutdown but write to the * underlying socket needed for the SSL protocol */ s_poll_add(&c->fds, c->ssl_wfd->fd, 0, 1);
In particular, if I change the first line to
if(c->sock_ptr || /* socket input buffer not empty */
then everything starts working (correctly as far as I can see). The OpenSSL man pages seem (to me) to say that it is perfectly permissible to call SSL_write when we have SSL_ERROR_WANT_READ.
Help?
TIA, Richard Hall