Richard.Hall wrote:
> First-time post (though long-time satisfied occasional user). Searched the
> archives but if there is anything relevant, I've obviously missed it.
>
> I really cannot decide if this is an stunnel problem or an OpenSSL
> problem.
>
> Client is remote and running
>   stunnel 4.12 on i686-pc-linux-gnu UCONTEXT+POLL+IPv4 with OpenSSL 0.9.6i
>
> Server is local, I've been through
>   4.12 + 0.9.6m
>   4.13 + 0.9.6m
> and now trying
>   stunnel 4.13 on sparc-sun-solaris2.8 UCONTEXT+POLL+IPv4 with OpenSSL 0.9.7i
> (I've resisted the urge to try OpenSSL 0.9.8* as people seem to be
> experiencing 'issues')
>
> Small amounts of data are fine. But for larger amounts the connection goes
> to sleep. The way I read it, the server end stops trying to write to the
> SSL side. The relevant parts of
>
> # truss -w2 -v poll -dl /usr/local/stunnel-4.13/sbin/stunnel
>
> look like this: (apologies for doing it like this, but it seemed the
> easiest way to combine the debug output and truss output)
>
>   (fd=7 is the SSL side, fd=8 the local side,
>    TIMEOUTidle is set to 35 seconds for demo purposes)
>
> /1:     15.5722 read(8, " A B\0\n03\007\t\t\r\n\t".., 8178)     = 8178
> /1:     poll(0x00049DD8, 3, 35000)      (sleeping...)
> /1:             fd=3  ev=POLLIN rev=0
> /1:             fd=6  ev=POLLIN rev=0
> /1:             fd=7  ev=POLLIN|POLLOUT rev=POLLIN
> /1:     17.3166 poll(0x00049DD8, 3, 35000)                      = 1
> /1:             fd=3  ev=POLLIN rev=0
> /1:             fd=6  ev=POLLIN rev=0
> /1:             fd=7  ev=POLLIN|POLLOUT rev=POLLOUT
> /1:     17.3229 write(7, "1703\0\0188CBEC9 : O\t u".., 16442)   = 16442
> /1:     17.3234 poll(0x00049DD8, 4, 35000)                      = 1
> /1:             fd=3  ev=POLLIN rev=0
> /1:             fd=6  ev=POLLIN rev=0
> /1:             fd=8  ev=POLLIN rev=POLLIN
> /1:             fd=7  ev=POLLIN rev=0
>
> So far, so good.
>
> /1:     17.3238 read(8, "\t\t\t\r\n\t\t\t\r\n\t\t".., 16384)    = 16384
> /1:     17.3240 poll(0x00049DD8, 3, 35000)                      = 1
> /1:             fd=3  ev=POLLIN rev=0
> /1:             fd=6  ev=POLLIN rev=0
> /1:             fd=7  ev=POLLIN|POLLOUT rev=POLLOUT
> /1:     17.3299 write(7, "1703\0\018ED 1AD91A888 i".., 16442)   = 14480
>
> For whatever reason, the write() was only partial. I'm guessing n/w
> issues, as the client is a loooong way away ;-)
>
> /1:     17.3303 write(7, " fE3DDA8FE ! i8DD8\fEBEC".., 1962)    Err#11 EAGAIN
> /1:     17.3308 write(2, 0x00071D60, 74)                        = 74
> /1:        2 0 0 5 . 1 0 . 2 4   1 4 : 5 0 : 3 0   L O G 7 [ 2 0 6 4 7 : 2
> /1:        ] :   S S L _ w r i t e   r e t u r n e d   W A N T _ W R I T E
> /1:        :   r e t r y i n g
> /1:     17.3312 write(2, "\n", 1)                               = 1
> /1:     17.3315 read(7, 0x000778D8, 5)                          Err#11 EAGAIN
> /1:     17.3318 write(2, 0x00071D60, 72)                        = 72
> /1:        2 0 0 5 . 1 0 . 2 4   1 4 : 5 0 : 3 0   L O G 7 [ 2 0 6 4 7 : 2
> /1:        ] :   S S L _ r e a d   r e t u r n e d   W A N T _ R E A D :
> /1:        r e t r y i n g
> /1:     17.3321 write(2, "\n", 1)                               = 1
>
> So we've now done an SSL_write, which yielded SSL_ERROR_WANT_WRITE,
>             _and_ an SSL_read,  which yielded SSL_ERROR_WANT_READ,
> (and which I think removes the SSL_ERROR_WANT_WRITE condition, as we can
> only have one or the other.)
>
> This means that, the next time round the loop (in client.c:transfer()),
> want_rd is TRUE, want_wr is FALSE, so we don't poll the SSL connection for
> output, only input:
>
> /1:     poll(0x00049DD8, 3, 35000)      (sleeping...)
> /1:             fd=3  ev=POLLIN rev=0
> /1:             fd=6  ev=POLLIN rev=0
> /1:             fd=7  ev=POLLIN rev=POLLOUT
>
> But there won't be any input (trust me on that!), so we time out ...
>
> /1:     52.3387 poll(0x00049DD8, 3, 35000)                      = 0
> /1:             fd=3  ev=POLLIN rev=0
> /1:             fd=6  ev=POLLIN rev=0
> /1:             fd=7  ev=POLLIN rev=0
> /1:     52.3394 write(2, 0x00071D60, 72)                        = 72
> /1:        2 0 0 5 . 1 0 . 2 4   1 4 : 5 1 : 0 5   L O G 6 [ 2 0 6 4 7 : 2
> /1:        ] :   s _ p o l l _ w a i t   t i m e o u t :   c o n n e c t i
> /1:        o n   r e s e t
> /1:     52.3399 write(2, "\n", 1)                               = 1
> /1:     52.3403 write(2, 0x00071DD0, 102)                       = 102
> /1:        2 0 0 5 . 1 0 . 2 4   1 4 : 5 1 : 0 5   L O G 5 [ 2 0 6 4 7 : 2
> /1:        ] :   C o n n e c t i o n   r e s e t :   5 7 6 1 7   b y t e s
> /1:          s e n t   t o   S S L ,   1 3 1   b y t e s   s e n t   t o
> /1:        s o c k e t
>
>
> Now I have never before needed (nor wanted!) to delve into the inner
> workings of either stunnel or OpenSSL, so I may be in completely the wrong
> ballpark here. But I cannot understand the logic of the following bit of
> code in transfer() :-
>
>         if((c->sock_ptr && !want_rd) || /* socket input buffer not empty */
>                 ssl_closing==CL_INIT /* need to send close_notify */ ||
>                 ((c->ssl_ptr<BUFFSIZE || ssl_closing==CL_RETRY) && want_wr))
>                 /* want to SSL_read or SSL_shutdown but write to the
>                  * underlying socket needed for the SSL protocol */
>             s_poll_add(&c->fds, c->ssl_wfd->fd, 0, 1);
>
> In particular, if I change the first line to
>
>         if(c->sock_ptr || /* socket input buffer not empty */
>
> then everything starts working (correctly as far as I can see). The
> OpenSSL man pages seem (to me) to say that it is perfectly permissible to
> call SSL_write when we have SSL_ERROR_WANT_READ.
>
> Help?

I actually had started noticing the exact same behavior but hadn't had
time to look into it. Mine seemed only to happen on images, leaving HTML
pages alone but my HTML isn't all that big anyway comparatively.

has notified the sender that this message has been received.