Hi, We built stunnel-4.20 on Solaris, HP-UX, AIX, Tru64, IRIX and various Linux hosts, and it works fine everywhere except HP-UX 10.20 and AIX.
stunnel crashes with similar backtraces on AIX 4.3, 5.1, 5.2 and 5.3. It does not crash on the first client connect (using stunnel with client=yes, connecting to a secure daytime server on another host, because everyone needs a secure daytime server :-p) but on the second connection. We have not tried the server side yet; I assume the results are similar.
The backtrace looks like this:

GDB:
Program terminated with signal 11, Segmentation fault.
#0  0xd0126b1c in _sigsetmask () from /usr/lib/libpthreads.a(shr_xpg5.o)
(gdb) bt
#0  0xd0126b1c in _sigsetmask () from /usr/lib/libpthreads.a(shr_xpg5.o)
#1  0xd0127e5c in _p_sigaction () from /usr/lib/libpthreads.a(shr_xpg5.o)
#2  0xd0315544 in sigaction () from /usr/lib/libc.a(shr.o)
#3  0xd038beac in signal () from /usr/lib/libc.a(shr.o)
#4  0x10006200 in sigchld_handler (sig=0) at network.c:350
#5  0x00004378 in ?? ()
Previous frame inner to this frame (corrupt stack?)
dbx:
sigchld_handler(sig = 0), line 350 in "network.c"
_sigsetmask(??, ??, ??) at 0xd0126b1c
_p_sigaction(??, ??, ??) at 0xd0127e58
raise.sigaction(??, ??, ??) at 0xd0315540
signal(??, ??) at 0xd038bea8
sigchld_handler(sig = 0), line 350 in "network.c"
_sigsetmask(??, ??, ??) at 0xd0126b1c
_p_sigaction(??, ??, ??) at 0xd0127e58
raise.sigaction(??, ??, ??) at 0xd0315540
signal(??, ??) at 0xd038bea8
sigchld_handler(sig = 0), line 350 in "network.c"
_sigsetmask(??, ??, ??) at 0xd0126b1c
_p_sigaction(??, ??, ??) at 0xd0127e58
raise.sigaction(??, ??, ??) at 0xd0315540
signal(??, ??) at 0xd038bea8
repeated forever.
Line 350 of network.c says: signal(SIGCHLD, sigchld_handler);
That call is inside the signal handler itself. It seems like a normal thing to do, and I have no idea why it crashes on AIX. Putting an #ifndef _AIX ... #endif around the signal() call "fixes" the crash; alternatively, setting SIGCHLD to SIG_IGN on AIX "fixes" it, but I don't think that either solution is correct. I don't believe the signal handler will get called again as we do not reset the signal.
(tests that assumption and finds answer to riddle)
Okay, problem identified. The signal handler is being called repeatedly, causing an infinite loop that can only end in a crash. According to the AIX signal man page:

    The signal in libc.a does not set the SA_RESTART flag. It sets the signal mask to the signal whose action is being specified, and sets flags to SA_OLDSTYLE. The Berkeley Software Distribution (BSD) version of signal sets the SA_RESTART flag and preserves the current settings of the signal mask and flags. The BSD version can be used by compiling with the Berkeley Compatibility Library (libbsd.a).
And indeed, no patches are required if I add -lbsd. Another solution would probably be to use sigaction if it is available.
Meanwhile, please find attached a patch to add -lbsd on AIX.
Thanks for bearing with me while I figured this out :-)
Peter