There is also a '--with-threads=pthread' option that may help to lower the number of processes.
However, on modern operating systems, fork()ing is not as expensive as it looks. In most cases, the text segment is shared between the two processes, and the pages in the data/BSS segments are not copied until they are modified ('copy-on-write'). I don't know whether the multi-threaded or the multi-process (fork) approach uses more OS resources.
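To make the copy-on-write point concrete, here is a minimal sketch in C, assuming a POSIX system. It is not taken from stunnel; the function name child_write_is_private and the variable counter are made up for illustration. It only shows the visible effect (a write in the child never reaches the parent). The lazy page copying itself happens inside the kernel and cannot be observed from a program like this.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical example variable; it lives in the data segment. */
static int counter = 42;

/*
 * Forks a child that modifies 'counter', then checks the parent's copy.
 * Returns 1 if the parent still sees the old value, 0 if it does not,
 * and -1 if fork() or waitpid() failed.
 */
int child_write_is_private(void)
{
    int status;
    pid_t pid;

    counter = 42;
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: until this write, the page is typically shared with
         * the parent; the write gives the child its own private copy. */
        counter = 99;
        _exit(0);
    }

    /* Parent: wait for the child, then look at our own copy. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return counter == 42;
}
```

Calling child_write_is_private() from a small main() should return 1 on such a system: the child's write landed in its own copy of the page, and everything the child never touched was not duplicated.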
Perhaps I am confused, but are you saying that when stunnel is started twice with two separate configuration files, the two instances share resources?
Sorry about not posting to the list earlier.
Terry