I'll give you two pieces of advice that almost everyone on the list won't agree with :)
1) Make a static openssl and static stunnel and locate them someplace apart from the standard locations (/usr/local/company_name/lib is what I use). This means if anyone messes with openssl or stunnel you won't be affected - and it will always work as it is static - and part of your application - the user does not even need your libraries. I have a personal distaste for dynamic linking ... mostly because I have a lot of customers that update a lot of things (including openssl - a lot - and stunnel sometimes) ... and then wonder why things stop working. A 10 year old openssl and stunnel - all static - will still run and work fine past all updates and user messing around. I choose when I want to update openssl and stunnel (meaning I look to see if there is something new I need or want). As a result I missed the keep alive and poodle bugs - I did not update until after both were fixed.
2) Forget hardware implementation - geez - modern computers are so darn fast that I cannot imagine you really need that level of "speed up" versus the grief you are handling. I have customers that exchange millions (4+) XML documents a day, all through stunnel, all through inetd (also not efficient supposedly - just reliable and always works and needs no management) - and have no problems. I am using IBM p Series (AIX) and these machines even at the low level are fast ... but I also use some SCO and Linux and certainly with lesser volume they are fine as well.
3) OK - 3 is really - use inetd, so much easier and always works (assuming you have Unix). If inetd crashes Unix crashes so ... see number 2 for reasons :)
Of course, these ideas won't help much if you don't have a Unix variation or if you are really that tight on performance (although if you are I'd suggest hardware upgrades!).
Good luck with your project,
Eric
From: stunnel-users [mailto:stunnel-users-bounces@stunnel.org] On Behalf Of Tamar Pedersen Sent: Wednesday, January 16, 2019 1:06 PM To: stunnel-users@stunnel.org Subject: [stunnel-users] How can stunnel use openssl HW cryptodev encryption
Hello,
I am evaluating stunnel, to see if it is a viable solution for providing encryption in a system that contains an Atmel processor which includes a HW accelerated encryption block. I am just ramping up on stunnel, and figured I should capture what I have done so far. My questions will come towards the end of my email.
My research indicates that stunnel incorporates openssl. I have been able to use openssl independently, to access the cryptodev HW encryption engine, in the Linux kernel module located in /lib/modules/4.14.79/extra/cryptodev.ko. When openssl is run without accessing the cryptodev engine (cryptodev module not loaded), I get the pure SW encryption implementation provided by default in openssl. When I run benchmark speed tests using openssl, using SW encryption, I see the following results:
# time -v openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 1689887 aes-128-cbc's in 2.95s
Doing aes-128-cbc for 3s on 64 size blocks: 568389 aes-128-cbc's in 2.95s
Doing aes-128-cbc for 3s on 256 size blocks: 151550 aes-128-cbc's in 2.96s
Doing aes-128-cbc for 3s on 1024 size blocks: 38599 aes-128-cbc's in 2.96s
Doing aes-128-cbc for 3s on 8192 size blocks: 4845 aes-128-cbc's in 2.95s
OpenSSL 1.0.2p-fips 14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: arm-laird-linux-gnueabi-gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O3 -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -I/home/sii/wb50n_space2_legacy_6.0.0.x/wb/buildroot/output/wb50n_space2_legacy/host/arm-buildroot-linux-gnueabi/sysroot/usr/local/ssl/fips-2.0/include -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-128-cbc 9165.49k 12331.15k 13107.03k 13353.17k 13454.32k
Command being timed: "openssl speed -evp aes-128-cbc"
User time (seconds): 14.81
System time (seconds): 0.10
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 15.06s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 13376
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 145
Voluntary context switches: 0
Involuntary context switches: 721
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
#
When I load the cryptodev module, and take advantage of the accelerated hardware encryption the benchmark tests are significantly faster. Here is what those results look like.
# modprobe cryptodev
# time -v openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 44163 aes-128-cbc's in 0.12s
Doing aes-128-cbc for 3s on 64 size blocks: 31345 aes-128-cbc's in 0.15s
Doing aes-128-cbc for 3s on 256 size blocks: 18923 aes-128-cbc's in 0.11s
Doing aes-128-cbc for 3s on 1024 size blocks: 13847 aes-128-cbc's in 0.13s
Doing aes-128-cbc for 3s on 8192 size blocks: 8427 aes-128-cbc's in 0.06s
OpenSSL 1.0.2p-fips 14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: arm-laird-linux-gnueabi-gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O3 -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -I/home/sii/wb50n_space2_legacy_6.0.0.x/wb/buildroot/output/wb50n_space2_legacy/host/arm-buildroot-linux-gnueabi/sysroot/usr/local/ssl/fips-2.0/include -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-128-cbc 5888.40k 13373.87k 44038.98k 109071.75k 1150566.40k
Command being timed: "openssl speed -evp aes-128-cbc"
User time (seconds): 0.59
System time (seconds): 8.72
Percent of CPU this job got: 61%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 15.11s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 13792
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 144
Voluntary context switches: 41154
Involuntary context switches: 3321
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
#
As can be seen in the results, the time openssl reports for each aes-128-cbc run dropped from around 2.95 s to about 0.10 s. Also of interest: the context switch counts are significantly higher when running hardware encryption, because of the interrupts and overhead involved in using the hardware engine. I can also look at /proc/interrupts and see significant increases in atmel-aes interrupt counts when using the cryptodev HW acceleration encryption engine. This gives a good indication that the cryptodev module is in use, and is doing encryption.
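A note on reading these figures, with a small sketch of the arithmetic. Unless it is given -elapsed, openssl speed times each run by user CPU time, which is why the per-run times above add up to roughly the "User time" line in each transcript (about 14.8 s for software, about 0.6 s with cryptodev) while both jobs took about 15 s of wall clock. The Python below rebuilds the 8192-byte table entries from the "Doing ..." lines and, purely as an assumption, also divides by the nominal 3 s per run to get a rough wall-clock estimate. The function and variable names are illustrative only.

```python
# Rebuild two entries of the openssl speed table from the "Doing ..." lines.

def throughput_k(count, block_bytes, seconds):
    """Throughput in 1000s of bytes per second, the unit openssl speed prints."""
    return count * block_bytes / seconds / 1000.0

NOMINAL_SECONDS = 3.0  # assumption: each run really lasted the requested 3 s

# 8192-byte runs: (operations completed, block size in bytes, reported seconds)
software = (4845, 8192, 2.95)
cryptodev = (8427, 8192, 0.06)

results = {}
for label, (count, size, reported) in (("software", software), ("cryptodev", cryptodev)):
    results[label] = (
        throughput_k(count, size, reported),         # the figure the table shows
        throughput_k(count, size, NOMINAL_SECONDS),  # rough wall-clock estimate
    )
    print(f"{label:9s} table: {results[label][0]:.2f}k   per 3 s: {results[label][1]:.2f}k")
```

The first figure of each pair matches the table (13454.32k and 1150566.40k). Under the 3 s assumption the 8192-byte gain is closer to 1.7x (8427 operations against 4845) than the roughly 85x the table suggests, so rerunning with -elapsed would probably give a fairer comparison.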
I would like to try to figure out how to allow stunnel to take advantage of the cryptodev HW acceleration encryption engine available in openssl. I have made some attempts, but so far, I have not been able to determine if stunnel is successfully using the cryptodev engine. Here is what I have done with stunnel. I already have a client and server successfully communicating with each other using stunnel. To verify this I used the "nc" utility to send characters back and forth between two different machines. The stunnel.conf file, on the server, is out of the box. I'm interested in encrypting on the client side. Here is my current client.conf file, in /etc/stunnel:
# cat client.conf
debug = 7
output = /tmp/stunnel-server.log
pid = /tmp/stunnel.pid
engine = cryptodev
[test]
verify = 1
client = yes
accept = 127.0.0.1:2000
connect = 192.168.0.220:30000
CAfile = /etc/stunnel/certificate.crt
engineNum = 1
#
I am attempting to set up the cryptodev to be the configured engine for the client. I am able to start stunnel, using client.conf, as follows:
# stunnel /etc/stunnel/client.conf
#
If I do a "ps" command to display processes, I can see that stunnel is running in the background. At this point, I can use "nc" to send data, as follows:
# nc 127.0.0.1 2000 < /tmp/long_file.txt
I am able to see the text from long_file.txt on the server, which is also running nc. The problem is that I don't see interrupts increasing in /proc/interrupts, which leaves me wondering if I have not configured stunnel correctly to use the cryptodev engine. If I try to remove the cryptodev module at this point, while stunnel is running, I receive a message that it is in use, as follows:
# modprobe -r cryptodev
modprobe: FATAL: Module cryptodev is in use.
#
If I kill the stunnel process, I am able to successfully remove the cryptodev module, which seems to suggest stunnel is the process using the cryptodev module. Also, once I have removed the cryptodev module, I can't restart stunnel. Instead, I get the following errors back:
# stunnel /etc/stunnel/client.conf
[.] stunnel 5.44 on arm-buildroot-linux-gnueabi platform
[.] Compiled/running with OpenSSL 1.0.2p-fips 14 Aug 2018
[.] Threading:FORK Sockets:POLL,IPv6 TLS:ENGINE,FIPS,OCSP,PSK,SNI
[ ] errno: (*__errno_location ())
[.] Reading configuration from file /etc/stunnel/client.conf
[.] UTF-8 byte order mark not detected
[ ] Enabling support for engine "cryptodev"
[!] error queue: 2606A074: error:2606A074:engine routines:ENGINE_by_id:no such engine
[!] error queue: 260B6084: error:260B6084:engine routines:DYNAMIC_LOAD:dso not found
[!] error queue: 25070067: error:25070067:DSO support routines:DSO_load:could not load the shared library
[!] ENGINE_by_id: 25066067: error:25066067:DSO support routines:DLFCN_LOAD:could not load the shared library
[!] /etc/stunnel/client.conf:5: "engine = cryptodev": Failed to open the engine
#
Again, this suggests stunnel is trying to use cryptodev. I just don't know how to prove I am taking advantage of the HW encryption acceleration engine. I never see interrupts updating in /proc/interrupts when using nc, while stunnel is running.
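One way to make the /proc/interrupts check less error-prone is to snapshot the file before and after the nc transfer and compare the atmel-aes counters. The following is a minimal sketch, assuming Python is available on the target (it may not be on a small buildroot image). The sample snapshot text is hypothetical and only mimics the general layout of /proc/interrupts; the helper names are illustrative.

```python
def irq_count(interrupts_text, name):
    """Sum the per-CPU counters of every /proc/interrupts row that mentions name."""
    total = 0
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Data rows start with an IRQ label such as "31:"; skip the CPU header row.
        if len(fields) < 2 or not fields[0].endswith(":") or name not in line:
            continue
        for field in fields[1:]:
            if not field.isdigit():
                break  # counters end where the controller and device labels begin
            total += int(field)
    return total


def read_interrupts(path="/proc/interrupts"):
    """Return the current contents of /proc/interrupts as text."""
    with open(path) as handle:
        return handle.read()


# Hypothetical snapshots, standing in for read_interrupts() before and after a transfer.
before = """\
           CPU0
 16:      48213  atmel-aic   1 Level     at91_tick
 31:       1022  atmel-aic  31 Level     atmel-aes
"""
after = """\
           CPU0
 16:      48990  atmel-aic   1 Level     at91_tick
 31:       1022  atmel-aic  31 Level     atmel-aes
"""

delta = irq_count(after, "atmel-aes") - irq_count(before, "atmel-aes")
print("atmel-aes interrupts during transfer:", delta)
```

On the device itself, the two snapshots would come from read_interrupts() taken immediately before and after the nc command. A delta of zero, as in the sample, would suggest the AES block was not used for that traffic.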
So, here are my questions:
1.) Does it look like I have things set up correctly in client.conf, to use the cryptodev engine?
2.) If client.conf is correct, how can I prove that stunnel is using the cryptodev engine, since I don't see the expected interrupts?
One idea is that the cryptodev module might not support the type of encryption being requested by the certificate, so openssl falls back to the pure SW encryption implementation. I know the Atmel chip in question supports the following:
# openssl engine -t -c
(cryptodev) cryptodev engine
[RSA, DSA, DH, DES-CBC, DES-EDE3-CBC, AES-128-CBC, AES-192-CBC, AES-256-CBC, MD5, SHA1, SHA256, SHA384, SHA512]
[ available ]
(dynamic) Dynamic engine loading support
[ unavailable ]
#
I was able to decode the contents of the certificate, and it says it is sha256WithRSAEncryption. My engine supports SHA256 and RSA, but does it support combining, like SHA256WithRSA? I'm not sure. I'll keep chasing that one.
Thanks for any guidance on how to use the cryptodev in stunnel.
Regards,
Tamar
Thanks for the response. I was wondering if anyone was out there :)
In the end, I may need to do encryption without taking advantage of hardware acceleration, but I was hoping to do some benchmark tests with and without hardware acceleration to see what the difference in measured performance would actually be. Our project is in an embedded system, with a small ARM core running Linux, which has a lot of other jobs to do beyond encrypting data. The CPU utilization looked like it was at 99% when we ran pure software encryption using openssl standalone. We obviously are not running full throttle all of the time, so we may still be able to keep up. I also still need to characterize the block size of data that needs to be transmitted. The hardware acceleration doesn't buy much for small block sizes. In fact, because of all of the system overhead on context switches, it can actually be slower than pure software encryption for small block sizes. However, as block sizes increase, there is an inflection point, where hardware acceleration really starts to kick in. In the end, the value of hardware encryption will depend on block sizes, which will depend on customer needs that have not yet been well defined, but I'd love to have some headroom if we end up requiring large block sizes.
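The inflection point can be estimated from the operation counts in the openssl speed output from the original message. This is only a sketch: it assumes each run lasted the nominal 3 s of wall-clock time, so the comparison reduces to which configuration completed more operations at each block size. The names are illustrative.

```python
BLOCK_SIZES = [16, 64, 256, 1024, 8192]
SW_COUNTS = [1689887, 568389, 151550, 38599, 4845]  # cryptodev module not loaded
HW_COUNTS = [44163, 31345, 18923, 13847, 8427]      # cryptodev module loaded
RUN_SECONDS = 3.0  # assumption: nominal duration of every run


def rates_k(counts, sizes, seconds=RUN_SECONDS):
    """Estimated throughput in 1000s of bytes per second for each block size."""
    return [count * size / seconds / 1000.0 for count, size in zip(counts, sizes)]


def crossover(sizes, sw_rates, hw_rates):
    """Smallest tested block size at which hardware beats software, else None."""
    for size, sw, hw in zip(sizes, sw_rates, hw_rates):
        if hw > sw:
            return size
    return None


sw_rates = rates_k(SW_COUNTS, BLOCK_SIZES)
hw_rates = rates_k(HW_COUNTS, BLOCK_SIZES)
for size, sw, hw in zip(BLOCK_SIZES, sw_rates, hw_rates):
    print(f"{size:5d} bytes  software {sw:10.2f}k  cryptodev {hw:10.2f}k")
print("hardware first wins at block size:", crossover(BLOCK_SIZES, sw_rates, hw_rates))
```

With these counts the first tested size where cryptodev comes out ahead is 8192 bytes, so on this estimate the inflection point lies somewhere between 1024 and 8192 bytes. A finer sweep of block sizes would be needed to pin it down.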
I still have not given up on the idea of trying to get hardware encryption working. I know openssl has the capability to use the hardware encryption in the Atmel chip, when I use it alone. I just need to figure out the right hooks to get stunnel to configure openssl correctly. I'd love to hear from anyone who is using hardware encryption in stunnel, even if you only send the configuration file you are using. Just looking for some examples. I have only seen one example of a cryptography engine in stunnel, in the default stunnel.conf file, but it was for a Microsoft CryptoAPI engine. I'd love to see someone who used a Linux based cryptodev device. Has anyone out there tried this?
Thanks, Tamar
In reply to: Eric Eberhard, sent Friday, January 18, 2019 6:23 PM
After several days of digging around on the web, without success, I finally got something working. I'm not sure if anyone else is having the same problem with hardware acceleration working in stunnel, since I couldn't seem to find much on the topic. However, if I can save anyone the trouble I have been through, I figured I should put it out there as a public service.
I still don't have all the answers, but here is what worked for me. In my testing, I put together two configuration files, one for a client, client.conf, and one for a server, server.conf. Here they are:
# cat /etc/stunnel/client.conf
debug = 7
output = /tmp/stunnel-client.log
fips = no
pid = /var/run/stunnel_client.pid
[my_config]
verify = 1
client = yes
accept = 127.0.0.1:2000
connect = 192.168.0.127:30000
CAfile = /etc/stunnel/certificate.crt
# cat /etc/stunnel/server.conf
debug = 7
output = /tmp/stunnel-server.log
cert = /etc/stunnel/stunnel.pem
CAFile = /etc/stunnel/certificate.crt
ciphers = ECDHE-RSA-AES256-SHA384
options = NO_SSLv2
options = NO_SSLv3
fips = no
engine = auto
pid = /var/run/stunnel_server.pid
[test]
verify = 1
accept = 30000
connect = 127.0.0.1:60000
The thing that seemed to be the kicker was the line that specified the ciphers in server.conf. I didn't have a line specifying a cipher in my original server.conf file. I noticed in my stunnel-server.log file that the negotiated cipher suite was ECDHE-RSA-AES256-GCM-SHA384. I started some investigation to try to figure out where this came from. I still don't know why this cipher was selected, but I have made some observations. I was able to get a list of ciphers supported by openssl, by typing the following command:
# openssl ciphers -v
In general, there appear to be some ciphers that are supported by the hardware encryption engine, and others that are not. If the hardware encryption engine does not support the cipher, then it looks like you get the pure software encryption provided by openssl. In my case, the hardware acceleration engine is on an Atmel chip, and is accessed by the cryptodev engine. The default negotiated cipher I was getting, ECDHE-RSA-AES256-GCM-SHA384, doesn't seem to be supported by my hardware accelerator. However, if you take GCM out of the name, you get ECDHE-RSA-AES256-SHA384, which does seem to be supported in my system. I imagine the ciphers supported by any particular chip may vary. When I use this cipher, I start to see the atmel-aes interrupts incrementing, as I expected in /proc/interrupts. I believe this verifies the hardware acceleration engine is being used. I'm calling this success, but I am still left with some head scratchers.
I also set engine = auto, in my server.conf file. If I do this, I see lines in the stunnel-server.log that show:
Enabling automatic engine support
Engine #1 (cryptodev) registered
Engine #2 (dynamic) registered
Automatic engine support enabled
If I set engine = cryptodev, then I don't see these lines, but it still seems to work. Instead I get lines that say:
Enabling support for engine "cryptodev"
UI set for engine #1 (cryptodev)
Initializing engine #1 (cryptodev)
Engine #1 (cryptodev) initialized
The hardware acceleration in cryptodev seems to work whether engine is set to cryptodev or auto. Not sure what the difference is. Any experts know the difference?
I'm also trying to figure out how to know what ciphers I should expect to be supported on my hardware. If I type the command, openssl engine -t -c, I see the following:
# openssl engine -t -c
(cryptodev) cryptodev engine
[RSA, DSA, DH, DES-CBC, DES-EDE3-CBC, AES-128-CBC, AES-192-CBC, AES-256-CBC, MD5, SHA1, SHA256, SHA384, SHA512]
[ available ]
(dynamic) Dynamic engine loading support
[ unavailable ]
#
The cipher that works with cryptodev has a name that contains some of the items in this list, but not all. This is not a list of cipher suites, but it does contain strings that show up in supported cipher names. For instance, GCM is not in the list, and the cipher with GCM in the name defaulted to running software encryption. ECDHE is not in the list either, but the cipher that worked on hardware had this in the name. How do I know which ciphers should work, without just going manually through the list of all ciphers and trying each one, to see if it runs on my hardware or not?
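One possible heuristic, offered as a sketch and not as a definitive answer: a suite name bundles key exchange, authentication, bulk cipher and MAC, and only the bulk cipher is what the AES block would encrypt. In OpenSSL suite names, AES with no mode token means CBC, so ECDHE-RSA-AES256-SHA384 uses AES-256-CBC, which is in the engine list, while the GCM suite needs AES-256-GCM, which is not. The Python below maps the Enc= field of an "openssl ciphers -v" line to an engine-style name and checks it against the list printed by "openssl engine -t -c". The two sample lines are typed in to illustrate the -v format and were not captured from the system in this thread; the function names are hypothetical.

```python
import re

# Capabilities printed by "openssl engine -t -c" for the cryptodev engine above.
ENGINE_CAPS = {
    "RSA", "DSA", "DH", "DES-CBC", "DES-EDE3-CBC",
    "AES-128-CBC", "AES-192-CBC", "AES-256-CBC",
    "MD5", "SHA1", "SHA256", "SHA384", "SHA512",
}


def bulk_cipher(enc_field):
    """Map an Enc= value such as 'AES(256)' to an engine-style cipher name, or None."""
    match = re.fullmatch(r"(\w+)\((\d+)\)", enc_field)
    if not match:
        return None
    algo, bits = match.groups()
    if algo == "AES":
        return f"AES-{bits}-CBC"  # no mode token in the suite name means CBC
    if algo == "AESGCM":
        return f"AES-{bits}-GCM"
    if algo == "3DES":
        return "DES-EDE3-CBC"
    if algo == "DES":
        return "DES-CBC"
    return None  # anything else is left unclassified by this sketch


def offloadable(ciphers_v_line, caps=ENGINE_CAPS):
    """Return (suite name, bulk cipher, True if the engine lists that cipher)."""
    fields = ciphers_v_line.split()
    enc = next((f[4:] for f in fields if f.startswith("Enc=")), "")
    cipher = bulk_cipher(enc)
    return fields[0], cipher, cipher in caps


# Assumed sample lines in the "openssl ciphers -v" format.
samples = [
    "ECDHE-RSA-AES256-GCM-SHA384 TLSv1.2 Kx=ECDH Au=RSA Enc=AESGCM(256) Mac=AEAD",
    "ECDHE-RSA-AES256-SHA384 TLSv1.2 Kx=ECDH Au=RSA Enc=AES(256) Mac=SHA384",
]
for line in samples:
    name, cipher, ok = offloadable(line)
    print(f"{name}: bulk cipher {cipher}, listed by engine: {ok}")
```

This only predicts whether the bulk cipher is one the engine advertises. It says nothing about the key exchange (ECDHE would still run in software here) or about whether the driver actually offloads the MAC, so the /proc/interrupts check remains the real test.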
In summary, I got something to work, but I still have some work to do, to understand how everything plays together.
Regards, Tamar
From: Tamar Pedersen Sent: Monday, January 21, 2019 8:56 AM To: Eric Eberhard flash@vicsmba.com; stunnel-users@stunnel.org Subject: RE: [stunnel-users] How can stunnel use openssl HW cryptodev encryption
Thanks for the response. I was wondering if anyone was out there :)
In the end, I may need to do encryption without taking advantage of hardware acceleration, but I was hoping to do some benchmark tests with and without hardware acceleration to see what the difference in measured performance would actually be. Our project is in an embedded system, with a small ARM core running Linux, which has a lot of other jobs to do beyone encrypting data. The CPU utilization looked like it was at 99% when we ran pure software encryption using openssl stand alone. We obviously are not running full throttle all of the time, so we may still be able to keep up. I also still need to characterize the block size of data that needs to be transmitted. The hardware acceleration doesn't buy much for small block sizes. In fact, because of all of the system overhead on context switches, it can actually be slower than pure software encryption for small block sizes. However as block sizes increase, there is an inflection point, where hardware acceleration really starts to kick in. In the end, the value of hardware encryption will depend on block sizes, which will depend on customer needs that have not yet been well defined, but I'd love to have some headroom if we end up requiring large block sizes.
I still have not given up on the idea of trying to get hardware encryption working. I know openssl has the capability to use the hardware encryption in the Atmel chip, when I use it alone. I just need to figure out the right hooks to get stunnel to configure openssl correctly. I'd love to hear from anyone who is using hardware encryption in stunnel, even if you only send the configuration file you are using. Just looking for some examples. I have only seen one example of a cryptography engine in stunnel, in the default stunnel.conf file, but it was for a Microsoft CryptoAPI engine. I'd love to see someone who used a Linux based cryptodev device. Has anyone out there tried this?
Thanks, Tamar
From: Eric Eberhard [mailto:flash@vicsmba.com] Sent: Friday, January 18, 2019 6:23 PM To: Tamar Pedersen <tamar.pedersen@bbraunusa.com>; stunnel-users@stunnel.org Subject: RE: [stunnel-users] How can stunnel use openssl HW cryptodev encryption
I'll give you two pieces of advice that almost everyone on the list won't agree with :)
1) Make a static openssl and static stunnel and locate them someplace apart from the standard locations (/usr/local/company_name/lib is what I use). This means if anyone messes with openssl or stunnel you won't be affected - and it will always work as it is static - and part of your application - the user does not even need your libraries. I have a personal distaste for dynamic linking ... mostly because I have a lot of customers that update a lot of things (including openssl - a lot - and stunnel sometimes) ... and then wonder why things stop working. A 10 year old openssl and stunnel - all static - will still run and work fine past all updates and user messing around. I choose when I want to update openssl and stunnel (meaning I look to see if there is something new I need or want). As a result I missed the keep alive and poodle bugs - I did not update until after both were fixed.
2) Forget hardware implementation - geez - modern computers are so darn fast that I cannot imagine you really need that level of "speed up" versus the grief you are handling. I have customers that exchange millions (4+) XML documents a day, all through stunnel, all through inetd (also not efficient supposedly - just reliable and always works and needs no management) - and have no problems. I am using IBM p Series (AIX) and these machines even at the low level are fast ... but I also use some SCO and Linux and certainly with lesser volume they are fine as well.
3) OK - 3 is really - use inetd, so much easier and always works (assuming you have Unix). If inetd crashes Unix crashes so ... see number 2 for reasons :)
Of course, these ideas won't help much if you don't have a Unix variation or if you are really that tight on performance (although if you are I'd suggest hardware upgrades!).
Good luck with your project,
Eric
From: stunnel-users [mailto:stunnel-users-bounces@stunnel.org] On Behalf Of Tamar Pedersen Sent: Wednesday, January 16, 2019 1:06 PM To: stunnel-users@stunnel.org Subject: [stunnel-users] How can stunnel use openssl HW cryptodev encryption
Hello, I am evaluating stunnel, to see if it is a viable solution for providing encryption in a system that contains an Atmel processor which includes a HW accelerated encryption block. I am just ramping up on stunnel, and figured I should capture what I have done so far. My questions will come towards the end of my email.
My research indicates that stunnel incorporates openssl. I have been able to use openssl independently to access the cryptodev HW encryption engine, through the Linux kernel module located in /lib/modules/4.14.79/extra/cryptodev.ko. When openssl is run without accessing the cryptodev engine (cryptodev module not loaded), I get the pure SW encryption implementation provided by default in openssl. When I run benchmark speed tests using openssl with SW encryption, I see the following results:
# time -v openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 1689887 aes-128-cbc's in 2.95s
Doing aes-128-cbc for 3s on 64 size blocks: 568389 aes-128-cbc's in 2.95s
Doing aes-128-cbc for 3s on 256 size blocks: 151550 aes-128-cbc's in 2.96s
Doing aes-128-cbc for 3s on 1024 size blocks: 38599 aes-128-cbc's in 2.96s
Doing aes-128-cbc for 3s on 8192 size blocks: 4845 aes-128-cbc's in 2.95s
OpenSSL 1.0.2p-fips 14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: arm-laird-linux-gnueabi-gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O3 -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -I/home/sii/wb50n_space2_legacy_6.0.0.x/wb/buildroot/output/wb50n_space2_legacy/host/arm-buildroot-linux-gnueabi/sysroot/usr/local/ssl/fips-2.0/include -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc       9165.49k    12331.15k    13107.03k    13353.17k    13454.32k
Command being timed: "openssl speed -evp aes-128-cbc"
User time (seconds): 14.81
System time (seconds): 0.10
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 15.06s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 13376
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 145
Voluntary context switches: 0
Involuntary context switches: 721
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
#
When I load the cryptodev module, and take advantage of the accelerated hardware encryption the benchmark tests are significantly faster. Here is what those results look like.
# modprobe cryptodev
# time -v openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 44163 aes-128-cbc's in 0.12s
Doing aes-128-cbc for 3s on 64 size blocks: 31345 aes-128-cbc's in 0.15s
Doing aes-128-cbc for 3s on 256 size blocks: 18923 aes-128-cbc's in 0.11s
Doing aes-128-cbc for 3s on 1024 size blocks: 13847 aes-128-cbc's in 0.13s
Doing aes-128-cbc for 3s on 8192 size blocks: 8427 aes-128-cbc's in 0.06s
OpenSSL 1.0.2p-fips 14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: arm-laird-linux-gnueabi-gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O3 -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -I/home/sii/wb50n_space2_legacy_6.0.0.x/wb/buildroot/output/wb50n_space2_legacy/host/arm-buildroot-linux-gnueabi/sysroot/usr/local/ssl/fips-2.0/include -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc       5888.40k    13373.87k    44038.98k   109071.75k  1150566.40k
Command being timed: "openssl speed -evp aes-128-cbc"
User time (seconds): 0.59
System time (seconds): 8.72
Percent of CPU this job got: 61%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 15.11s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 13792
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 144
Voluntary context switches: 41154
Involuntary context switches: 3321
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
#
As can be seen in the results (highlighted in red), the time reported for each aes-128-cbc run dropped from around 2.95 s to around 0.10 s. Also of interest is that the context switches are significantly higher when running hardware encryption, because of interrupts and the overhead of using the hardware engine. I can also look at /proc/interrupts and see significant increases in atmel-aes interrupt counts when using the cryptodev HW acceleration encryption engine. This gives a good indication that the cryptodev module is in use, and is doing encryption.
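A note on reading these figures: by default `openssl speed` divides by CPU user time, and the `-elapsed` option switches it to wall-clock time. The arithmetic below is a small sketch that reproduces the 8192-byte entries of both tables from the block counts and the reported times, then recomputes the cryptodev figure under the assumption that the run really lasted its nominal 3 s of wall-clock time. The helper name `throughput_k` is made up for illustration.

```shell
#!/bin/sh
# throughput_k BLOCKS BLOCK_SIZE SECONDS
# Prints throughput in thousands of bytes per second, the unit used in the
# `openssl speed` summary table.
throughput_k() {
    awk -v n="$1" -v size="$2" -v secs="$3" \
        'BEGIN { printf "%.2f\n", n * size / secs / 1000 }'
}

throughput_k 4845 8192 2.95   # software run, reported time  -> 13454.32
throughput_k 8427 8192 0.06   # cryptodev run, CPU user time -> 1150566.40
throughput_k 8427 8192 3      # cryptodev run, assumed 3 s wall clock -> 23011.33
```

The first two results match the tables, so the very large cryptodev number comes from dividing by the 0.06 s of user time the process spent while the hardware did the work. Rerunning the benchmark as `openssl speed -elapsed -evp aes-128-cbc` should give figures that are directly comparable between the two cases.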
I would like to try to figure out how to allow stunnel to take advantage of the cryptodev HW acceleration encryption engine available in openssl. I have made some attempts, but so far, I have not been able to determine if stunnel is successfully using the cryptodev engine. Here is what I have done with stunnel. I already have a client and server successfully communicating with each other using stunnel. To verify this I used the "nc" utility to send characters back and forth between two different machines. The stunnel.conf file, on the server, is out of the box. I'm interested in encrypting on the client side. Here is my current client.conf file, in /etc/stunnel:
# cat client.conf
debug = 7
output = /tmp/stunnel-server.log
pid = /tmp/stunnel.pid

engine = cryptodev

[test]
verify = 1
client = yes
accept = 127.0.0.1:2000
connect = 192.168.0.220:30000
CAfile = /etc/stunnel/certificate.crt
engineNum = 1
#
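For comparison, here is a hedged variant of the configuration above, written out by a small shell snippet. It is a sketch under several assumptions that have not been tested on this hardware: that the installed stunnel accepts the `engineDefault` global option described in the stunnel manual, that `CIPHERS,DIGESTS` is an appropriate task list for this engine, and that pinning the service to one AES-CBC suite with the `ciphers` option is acceptable to the server. The output path and the chosen suite are illustrative only. `engineNum` is left out because the manual describes it as selecting the engine for reading private keys, and this client section loads no key.

```shell
#!/bin/sh
# Write a trial client configuration to a scratch file (hypothetical path).
conf="${TMPDIR:-/tmp}/client-cryptodev.conf"

cat > "$conf" <<'EOF'
debug = 7
output = /tmp/stunnel-server.log
pid = /tmp/stunnel.pid

engine = cryptodev
; assumption: ask OpenSSL to delegate bulk ciphers and digests to the engine
engineDefault = CIPHERS,DIGESTS

[test]
verify = 1
client = yes
accept = 127.0.0.1:2000
connect = 192.168.0.220:30000
CAfile = /etc/stunnel/certificate.crt
; assumption: force a suite whose bulk cipher is AES-128-CBC
ciphers = ECDHE-RSA-AES128-SHA
EOF

echo "wrote $conf"
```

If stunnel rejects `engineDefault`, removing that line leaves a configuration that differs from the original only in the `ciphers` restriction, which is still a useful experiment on its own.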
I am attempting to set up the cryptodev to be the configured engine for the client. I am able to start stunnel, using client.conf, as follows:
# stunnel /etc/stunnel/client.conf
#
If I do a "ps" command to display processes, I can see that stunnel is running in the background. At this point, I can use "nc" to send data, as follows:
# nc 127.0.0.1 2000 < /tmp/long_file.txt
I am able to see the text from long_file.txt on the server, which is also running nc. The problem is that I don't see interrupts increasing in /proc/interrupts, which leaves me wondering whether I have configured stunnel correctly to use the cryptodev engine. If I try to remove the cryptodev module at this point, while stunnel is running, I receive a message that it is in use, as follows:
# modprobe -r cryptodev
modprobe: FATAL: Module cryptodev is in use.
#
If I kill the stunnel process, I am able to successfully remove the cryptodev module, which seems to suggest stunnel is the process using the cryptodev module. Also, once I have removed the cryptodev module, I can't restart stunnel. Instead, I get the following errors back:
# stunnel /etc/stunnel/client.conf
[.] stunnel 5.44 on arm-buildroot-linux-gnueabi platform
[.] Compiled/running with OpenSSL 1.0.2p-fips 14 Aug 2018
[.] Threading:FORK Sockets:POLL,IPv6 TLS:ENGINE,FIPS,OCSP,PSK,SNI
[ ] errno: (*__errno_location ())
[.] Reading configuration from file /etc/stunnel/client.conf
[.] UTF-8 byte order mark not detected
[ ] Enabling support for engine "cryptodev"
[!] error queue: 2606A074: error:2606A074:engine routines:ENGINE_by_id:no such engine
[!] error queue: 260B6084: error:260B6084:engine routines:DYNAMIC_LOAD:dso not found
[!] error queue: 25070067: error:25070067:DSO support routines:DSO_load:could not load the shared library
[!] ENGINE_by_id: 25066067: error:25066067:DSO support routines:DLFCN_LOAD:could not load the shared library
[!] /etc/stunnel/client.conf:5: "engine = cryptodev": Failed to open the engine
#
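Since the engine cannot be opened once the module is gone, a start script could check for the device node first. The sketch below assumes the module exposes its interface as the character device /dev/crypto, which is what the OpenSSL 1.0.2 cryptodev engine opens; the function name `cryptodev_ready` is made up for illustration.

```shell
#!/bin/sh
# cryptodev_ready [DEVICE]
# Succeeds only if DEVICE (default /dev/crypto) exists and is a character
# device, i.e. the cryptodev kernel module appears to be loaded.
cryptodev_ready() {
    [ -c "${1:-/dev/crypto}" ]
}

if cryptodev_ready; then
    echo "cryptodev device present"
else
    echo "cryptodev device missing; try: modprobe cryptodev" >&2
fi
```

Calling this before `stunnel /etc/stunnel/client.conf` turns the engine error above into a clearer message.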
Again, this suggests stunnel is trying to use cryptodev. I just don't know how to prove I am taking advantage of the HW encryption acceleration engine. I never see interrupts updating in /proc/interrupts when using nc, while stunnel is running.
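To make the interrupt check less dependent on eyeballing /proc/interrupts, the counter can be sampled before and after a transfer. The following is a minimal sketch; the helper name `irq_count` is made up, and it assumes the usual /proc/interrupts layout in which the per-CPU counters follow the IRQ number and the device name (here atmel-aes) appears later on the same line.

```shell
#!/bin/sh
# irq_count NAME
# Reads /proc/interrupts-style text on stdin and prints the sum of the
# per-CPU counters on every line that mentions NAME (0 if none match).
irq_count() {
    awk -v name="$1" '
        index($0, name) > 0 {
            # Counters are the numeric fields right after the "NN:" label.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
        }
        END { print total + 0 }'
}

# Typical use on the target, with stunnel already running:
#   before=$(irq_count atmel-aes < /proc/interrupts)
#   nc 127.0.0.1 2000 < /tmp/long_file.txt
#   after=$(irq_count atmel-aes < /proc/interrupts)
#   echo "atmel-aes interrupts during transfer: $((after - before))"
```

A difference of zero after sending a large file would point to the bulk encryption running in software for the negotiated cipher suite.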
So, here are my questions:
1.) Does it look like I have things set up correctly in client.conf, to use the cryptodev engine?
2.) If client.conf is correct, how can I prove that stunnel is using the cryptodev engine, since I don't see the expected interrupts?
One idea is that the cryptodev module might not support the type of encryption being requested by the certificate, so openssl falls back to the pure SW encryption implementation. I know the Atmel chip in question supports the following:
# openssl engine -t -c
(cryptodev) cryptodev engine
 [RSA, DSA, DH, DES-CBC, DES-EDE3-CBC, AES-128-CBC, AES-192-CBC, AES-256-CBC, MD5, SHA1, SHA256, SHA384, SHA512]
     [ available ]
(dynamic) Dynamic engine loading support
     [ unavailable ]
#
I was able to decode the contents of the certificate, and it says it is sha256WithRSAEncryption. My engine supports SHA256 and RSA, but does it support combining, like SHA256WithRSA? I'm not sure. I'll keep chasing that one.
Thanks for any guidance on how to use the cryptodev in stunnel.
Regards, Tamar