On Thu, 2008-04-03 at 14:47 +1100, PK wrote:
> Thanks for the excellent response. It does seem to be a nagle/buffering issue of some sort. I set up stunnel as you suggested (so that all conns are localhost -> remotehost) and the problem disappeared. However, after again explicitly turning TCP_NODELAY off completely in samba and stunnel, restarting, etc, the problem remains in the localhost -> localhost configuration.
Yeah, I'm guessing here that stunnel itself is getting data directly from samba and forming outbound SSL records right away - so this is where you want the coalescing, but you're not getting it. The fact that the outbound SSL records then aren't buffered into larger TCP packets is probably just bad luck: the most likely reason you're not fast enough to beat the ACK packets (which is what Nagle waits on) is that you're too busy pulling in loads and loads of really small clear-text inputs and producing too many SSL records... (rinse and repeat). It's a "push-back" architecture problem that you can't fix in stunnel. Disabling TCP_NODELAY is unlikely to change anything, as it should usually be off by default anyway (ie. Nagle turned on). The problem is that although Nagle is on, it isn't helping on your localhost connection between samba and stunnel.
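(For what it's worth, "Nagle on" just means the socket's TCP_NODELAY option is 0. If you want to verify what samba or stunnel actually ended up with, it's nothing more than the standard getsockopt/setsockopt calls - the sketch below is generic sockets code, not anything from either code base:

    #include <stdio.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>

    /* "fd" is whatever connected TCP socket you care about */
    static void report_and_clear_nodelay(int fd)
    {
        int flag = 0;
        socklen_t len = sizeof(flag);

        /* flag == 0 means TCP_NODELAY is off, ie. Nagle is on (the default) */
        if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) == 0)
            printf("TCP_NODELAY=%d (%s)\n", flag,
                   flag ? "Nagle off" : "Nagle on");

        /* force Nagle back on, in case something enabled TCP_NODELAY */
        flag = 0;
        setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
    }

If that reports Nagle already on, then no amount of socket-option twiddling is going to help - which is what I suspect.)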
Hmmm ... this may not work, but have you considered using an external interface address rather than loopback/localhost/127.0.0.1? Eg. rather than the stunnel server connecting to localhost:446, try having it connect to 192.168.0.1:446 (or whatever your NIC address is). If the reason you're getting no buffering is that the connection is too fast, this won't help unless the external interface is much slower than loopback (even when bending packets back to itself). But if the reason is that buffering is inherently ignored on the loopback device, then you could get lucky by using the external device address instead.
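(In stunnel.conf terms that's just a one-line change to the connect setting of whichever service section you're using - the section name below is made up, only the connect address is the point:

    [smb]
    ; accept stays however you have it now, only connect changes:
    ;connect = localhost:446
    connect = 192.168.0.1:446

Same idea on the client side if its local hop is also over loopback.)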
> It'd be nice to fix it, but none of the socket options seem to change anything so will probably have to find a work around, such as keeping the number of files in the directory low or pushing harder to use cifs instead of stunnel.
If you get *really* stuck, you could always insert a nasty hack into the child-process routine within stunnel, eg. something like:

    [...]
    static int buffering = 0;
    [...]

    [...]
    /* find wherever we read the clear-text socket - the buffer/length
     * arguments below just stand in for whatever stunnel already passes */
    ret = read(sock, buf, sizeof(buf));
    if (ret < 0) {
        /* existing error-handling */
    }
    /* NASTY HACK - THIS IS THE NEW CODE */
    if (!buffering && (ret > 0) && (ret < 512))
        buffering = 5;
    if (buffering) {
        usleep(50 * 1000);   /* ie. sleep 50ms; usleep() needs <unistd.h> */
        buffering--;
    }
    [...]
It's hideous, would never get accepted upstream (in any modified form), and would cause any purist to expunge generously into the nearest porcelain ... which is often true of practical solutions for people who have better things to do. ;-)
The idea is that if you get a small read, you're "too fast" and will probably continue to get small reads from the samba server (or the samba client, on the stunnel client side). With this trick, it'll sleep 1/20th of a second after each read for the next 5 reads. That makes your reads less frequent, so each one should pick up more data - and after 5 such delays we stop sleeping, so stunnel can process inbound data at full speed again; by then the socket has hopefully accumulated a back-log of data, so you'll get maximally-sized chunks from your reads until you "catch up". Eventually you will catch up, get a small read, and drop back into this throttling mechanism. This would be extremely unwise for latency-critical/interactive protocols, but could help significantly for "download or upload"-style protocols, where letting a back-log build up can improve your efficiency. If you're listing 5000 files and it's taking 23 seconds, this seems like a good candidate.
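To put rough numbers on it: each time the hack triggers it sleeps at most 5 x 50ms = 250ms, so even a few dozen such bursts only add a handful of seconds - and the bet is that they buy back much more than that by replacing a stream of tiny SSL records with a far smaller number of full-sized ones.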
Anyway, might be worth trying if you run out of options ... please let me know how you get on.
Cheers, Geoff