Hi,
We built stunnel-4.20 on Solaris, HP-UX, AIX, Tru64, IRIX and various
Linux hosts, and it works fine everywhere except HP-UX 10.20 and AIX.
stunnel crashes with similar backtraces on AIX 4.3, 5.1, 5.2 and 5.3,
not on the first client connect (using stunnel client=yes, connecting
to a secure daytime server on another host, because everyone needs a
secure daytime server :-p) but on the second connection. We have not
tried the server side yet; I assume the results are similar.
The backtrace looks like this:
GDB:
Program terminated with signal 11, Segmentation fault.
#0 0xd0126b1c in _sigsetmask () from
/usr/lib/libpthreads.a(shr_xpg5.o)
(gdb) bt
#0 0xd0126b1c in _sigsetmask () from
/usr/lib/libpthreads.a(shr_xpg5.o)
#1 0xd0127e5c in _p_sigaction () from
/usr/lib/libpthreads.a(shr_xpg5.o)
#2 0xd0315544 in sigaction () from /usr/lib/libc.a(shr.o)
#3 0xd038beac in signal () from /usr/lib/libc.a(shr.o)
#4 0x10006200 in sigchld_handler (sig=0) at network.c:350
#5 0x00004378 in ?? ()
Previous frame inner to this frame (corrupt stack?)
dbx:
sigchld_handler(sig = 0), line 350 in "network.c"
_sigsetmask(??, ??, ??) at 0xd0126b1c
_p_sigaction(??, ??, ??) at 0xd0127e58
raise.sigaction(??, ??, ??) at 0xd0315540
signal(??, ??) at 0xd038bea8
sigchld_handler(sig = 0), line 350 in "network.c"
_sigsetmask(??, ??, ??) at 0xd0126b1c
_p_sigaction(??, ??, ??) at 0xd0127e58
raise.sigaction(??, ??, ??) at 0xd0315540
signal(??, ??) at 0xd038bea8
sigchld_handler(sig = 0), line 350 in "network.c"
_sigsetmask(??, ??, ??) at 0xd0126b1c
_p_sigaction(??, ??, ??) at 0xd0127e58
raise.sigaction(??, ??, ??) at 0xd0315540
signal(??, ??) at 0xd038bea8
repeated forever.
Line 350 of network.c says:
signal(SIGCHLD, sigchld_handler);
That call is inside the signal handler itself. It seems like a normal
thing to do, and I have no idea why it crashes on AIX. Putting an
#ifndef _AIX
#endif
around the signal() call "fixes" the crash; alternatively, setting
SIGCHLD to SIG_IGN on AIX "fixes" it, but I don't think that either
solution is correct. I don't believe the signal handler will get
called again, as we do not reset the signal.
(tests that assumption and finds answer to riddle)
Okay, problem identified. sigchld_handler is being called
repeatedly, causing an infinite loop that can only end in a crash.
According to the AIX signal man page:
The signal in libc.a does not set the SA_RESTART flag. It sets the signal mask
to the signal whose action is being specified, and sets flags to SA_OLDSTYLE.
The Berkeley Software Distribution (BSD) version of signal sets the SA_RESTART
flag and preserves the current settings of the signal mask and flags. The BSD
version can be used by compiling with the Berkeley Compatibility Library
(libbsd.a).
And indeed, no patches are required if I add -lbsd. Another solution
would probably be to use sigaction if it is available.
Meanwhile, please find attached a patch to add -lbsd on AIX.
Thanks for bearing with me while I figured this out :-)
Peter