How to tell if SPARC T4 crypto is being used?
- by danx
A question that often comes up when running applications on SPARC T4 systems is "How can I tell if hardware crypto acceleration is being used?"
To review, the SPARC T4 processor includes a crypto unit that supports several crypto instructions.
For hardware crypto these include 11 AES instructions, 4 xmul* instructions (for AES GCM carryless multiply), mont for Montgomery multiply (optimizes RSA and DSA), and 5 des_* instructions (for DES3).
For hardware hash algorithm optimization, the T4 has the md5, sha1, sha256, and sha512 instructions (the last two are also used for SHA-224 and SHA-384).
First off, it's easy to tell if the processor has the T4 crypto instructions: run the isainfo -v command and look for "sparcv9" and "aes" (and other hash and crypto algorithms) in the output:
$ isainfo -v
64-bit sparcv9 applications
crc32c cbcond pause mont mpmul sha512 sha256 sha1 md5 camellia kasumi
des aes ima hpc vis3 fmaf asi_blk_init vis2 vis popc
These instructions are non-privileged, so they are available for direct use in user-level applications and libraries (such as OpenSSL).
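For scripting this check, a small shell function can test the isainfo -v output for specific capability names. The following is a minimal sketch, not part of any Solaris tooling: the function name has_isa_caps is made up for illustration, and it assumes a grep that supports the -q and -w options (on older Solaris releases that may mean /usr/xpg4/bin/grep).

```shell
# has_isa_caps (hypothetical helper): reads `isainfo -v` output on stdin and
# succeeds only if every capability named in the arguments appears in it
# as a whole word.
has_isa_caps() {
    caps=$(cat)    # capture stdin once so each name can be checked against it
    for want in "$@"; do
        printf '%s\n' "$caps" | grep -qw -- "$want" || return 1
    done
    return 0
}

# Typical use on a SPARC system:
#   isainfo -v | has_isa_caps aes sha256 mont && echo "T4 crypto instructions present"
```

Matching whole words matters here: a plain substring search for "sha" or "vis" would succeed on "sha512" or "vis3" even if the exact capability were missing.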
Here is the "openssl speed -evp" command shown with the built-in t4 engine and with the pkcs11 engine.
Both run the T4 AES instructions, but the t4 engine is faster than the pkcs11 engine because it has less overhead (especially for smaller packet sizes):
t-4 $ /usr/bin/openssl version
OpenSSL 1.0.0j 10 May 2012
t-4 $ /usr/bin/openssl engine
(t4) SPARC T4 engine support
(dynamic) Dynamic engine loading support
(pkcs11) PKCS #11 engine support
t-4 $ /usr/bin/openssl speed -evp aes-128-cbc # t4 engine used by default
. . .
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-128-cbc 487777.10k 816822.21k 986012.59k 1017029.97k 1053543.08k
t-4 $ /usr/bin/openssl speed -engine pkcs11 -evp aes-128-cbc
engine "pkcs11" set.
. . .
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-128-cbc 31703.58k 116636.39k 350672.81k 696170.50k 993599.49k
Note: The "-evp" flag tells "openssl speed" to use the OpenSSL "EnVeloPe" API, which gives more accurate results.
That's because it is the same API that external programs use when calling OpenSSL libcrypto functions, evp(3openssl).
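To put a number on "less overhead", the two aes-128-cbc result rows can be divided column by column. This is only a sketch: speed_ratio is a made-up helper name, the rows are pasted from the runs above, and it assumes an awk that supports the ?: operator (on Solaris that may mean nawk or /usr/xpg4/bin/awk rather than the legacy /usr/bin/awk).

```shell
# speed_ratio (hypothetical helper): takes two result rows copied from
# `openssl speed` output and prints row1/row2 for each block-size column.
speed_ratio() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        # First row: remember each throughput; "+ 0" drops the trailing "k".
        NR == 1 { for (i = 2; i <= NF; i++) first[i] = $i + 0 }
        # Second row: print the per-column ratio, space separated.
        NR == 2 {
            for (i = 2; i <= NF; i++)
                printf "%s%.2f", (i > 2 ? " " : ""), first[i] / ($i + 0)
            print ""
        }'
}

# Rows taken from the t4 engine run and the pkcs11 engine run shown above.
t4_row='aes-128-cbc 487777.10k 816822.21k 986012.59k 1017029.97k 1053543.08k'
p11_row='aes-128-cbc 31703.58k 116636.39k 350672.81k 696170.50k 993599.49k'
speed_ratio "$t4_row" "$p11_row"
```

For these figures the ratios come out at roughly 15.39, 7.00, 2.81, 1.46, and 1.06 for 16, 64, 256, 1024, and 8192 bytes. In other words, the t4 engine is about 15 times faster on 16-byte blocks but only about 6% faster on 8192-byte blocks, where the per-call overhead is spread over much more data.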
DTrace Shows if T4 Crypto Functions Are Used
OK, good enough, the isainfo(1) command shows the instructions are present, but how does one know if they are being used?
Chi-Chang Lin, who works on Oracle Solaris performance, wrote a DTrace script to show if T4 instructions are being executed.
To show that the T4 instructions are being used, run the following DTrace script and look for functions with "t4" or "yf" in their names in the output. The OpenSSL t4 engine uses functions named "t4" and the PKCS#11 engine uses functions named "yf".
To demonstrate, I'll first run "openssl speed" with the built-in t4 engine, then with the pkcs11 engine. The performance numbers are not valid because the DTrace probes slow things down.
t-4 # dtrace -Z -n '
pid$target::*yf*:entry,pid$target::*t4_*:entry{ @[probemod, probefunc] = count();}' \
-c "/usr/bin/openssl speed -evp aes-128-cbc"
dtrace: description 'pid$target::*yf*:entry' matched 101 probes
. . .
dtrace: pid 2029 has exited
libcrypto.so.1.0.0 ENGINE_load_t4 1
libcrypto.so.1.0.0 t4_DH 1
libcrypto.so.1.0.0 t4_DSA 1
libcrypto.so.1.0.0 t4_RSA 1
libcrypto.so.1.0.0 t4_destroy 1
libcrypto.so.1.0.0 t4_free_aes_ctr_NIDs 1
libcrypto.so.1.0.0 t4_init 1
libcrypto.so.1.0.0 t4_add_NID 3
libcrypto.so.1.0.0 t4_aes_expand128 5
libcrypto.so.1.0.0 t4_cipher_init_aes 5
libcrypto.so.1.0.0 t4_get_all_ciphers 6
libcrypto.so.1.0.0 t4_get_all_digests 59
libcrypto.so.1.0.0 t4_digest_final_sha1 65
libcrypto.so.1.0.0 t4_digest_init_sha1 65
libcrypto.so.1.0.0 t4_sha1_multiblock 126
libcrypto.so.1.0.0 t4_digest_update_sha1 261
libcrypto.so.1.0.0 t4_aes128_cbc_encrypt 1432979
libcrypto.so.1.0.0 t4_aes128_load_keys_for_encrypt 1432979
libcrypto.so.1.0.0 t4_cipher_do_aes_128_cbc 1432979
t-4 # dtrace -Z -n '
pid$target::*yf*:entry,pid$target::*t4_*:entry{ @[probemod, probefunc] = count();}' \
-c "/usr/bin/openssl speed -engine pkcs11 -evp aes-128-cbc"
dtrace: description 'pid$target::*yf*:entry' matched 101 probes
engine "pkcs11" set.
. . .
dtrace: pid 2033 has exited
libcrypto.so.1.0.0 ENGINE_load_t4 1
libcrypto.so.1.0.0 t4_DH 1
libcrypto.so.1.0.0 t4_DSA 1
libcrypto.so.1.0.0 t4_RSA 1
libcrypto.so.1.0.0 t4_destroy 1
libcrypto.so.1.0.0 t4_free_aes_ctr_NIDs 1
libcrypto.so.1.0.0 t4_get_all_ciphers 1
libcrypto.so.1.0.0 t4_get_all_digests 1
libsoftcrypto.so.1 rijndael_key_setup_enc_yf 1
libsoftcrypto.so.1 yf_aes_expand128 1
libcrypto.so.1.0.0 t4_add_NID 3
libsoftcrypto.so.1 yf_aes128_cbc_encrypt 1542330
libsoftcrypto.so.1 yf_aes128_load_keys_for_encrypt 1542330
So, as shown above, the OpenSSL built-in t4 engine does its encryption in t4_* functions (which are hand-coded assembly executing the T4 AES instructions), and the OpenSSL pkcs11 engine does its encryption in *yf* functions from libsoftcrypto.
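When the aggregation output is long, it can help to pull out just the busiest function, since a handful of t4_* setup functions are called once or a few times no matter which engine does the encryption. A minimal sketch, assuming the aggregation lines have been saved to a file or are piped in; hottest_function is a made-up name, not part of DTrace:

```shell
# hottest_function (hypothetical helper): reads DTrace aggregation lines of the
# form "module function count" on stdin and prints the line with the highest
# count (the first one seen, if several tie).
hottest_function() {
    awk '
        # Only consider three-field lines whose last field is a plain integer.
        NF == 3 && $3 ~ /^[0-9]+$/ && $3 + 0 > max {
            max = $3 + 0; mod = $1; fn = $2
        }
        END { if (fn != "") print mod, fn, max }'
}

# Typical use (the file name is only an example):
#   hottest_function < dtrace-output.txt
```

For the two runs above, this would report t4_aes128_cbc_encrypt in libcrypto.so.1.0.0 for the t4 engine and yf_aes128_cbc_encrypt in libsoftcrypto.so.1 for the pkcs11 engine.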
Programmatic Use of OpenSSL T4 engine
The OpenSSL t4 engine is used automatically with the /usr/bin/openssl command line.
Chi-Chang Lin also points out that
if you're calling the OpenSSL API (libcrypto.so) from a program, you must call
ENGINE_load_builtin_engines(), otherwise the built-in t4 engine will not be loaded.
You do not need to call ENGINE_set_default().
This is also why the "openssl speed -evp" test uses the t4 engine: it calls ENGINE_load_builtin_engines()
even though the "-engine" option wasn't specified.
OpenSSL T4 engine Availability
The OpenSSL t4 engine is available with Solaris 11 and 11.1. For Solaris 10 08/11 (U10), you need to use the OpenSSL pkcs11 engine. The OpenSSL t4 engine is distributed only with the version of OpenSSL distributed with Solaris (and not third-party or self-compiled versions of OpenSSL).
The OpenSSL t4 engine implements the AES cipher for Solaris 11, released 11/2011.
For Solaris 11.1, released 11/2012, the OpenSSL t4 engine adds optimization for the MD5, SHA-1, and SHA-2 hash algorithms, and DES3.
Although the T4 processor has Camellia and Kasumi block cipher instructions, these are not implemented in the OpenSSL t4 engine.
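Because only the Solaris-supplied OpenSSL carries the t4 engine, a quick way to see what a particular openssl binary offers is to check its engine listing. The helper below is an illustrative sketch (engine_listed is a made-up name); it assumes the "(id) description" listing format shown earlier and an awk that accepts -v (on Solaris, possibly nawk or /usr/xpg4/bin/awk). A listed engine only shows that it is built in, not that a given program actually uses it.

```shell
# engine_listed (hypothetical helper): reads `openssl engine` output on stdin
# and succeeds if a line starts with the given engine id in parentheses.
engine_listed() {
    awk -v id="($1)" '$1 == id { found = 1 } END { exit !found }'
}

# Typical use:
#   /usr/bin/openssl engine | engine_listed t4 && echo "t4 engine available"
```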
The following charts may help view availability of optimizations.
The first chart shows what's available with Solaris CLIs and APIs, the second chart shows what's available in Solaris OpenSSL.
Native Solaris Optimization for SPARC T4
This table shows Solaris native CLI and API support.
All of the algorithms listed are therefore also available through the OpenSSL pkcs11 engine.
CLIs: "openssl -engine pkcs11", encrypt(1), decrypt(1), mac(1), digest(1), md5sum(1), sha1sum(1), sha224sum(1), sha256sum(1), sha384sum(1), sha512sum(1)
APIs: PKCS#11 library libpkcs11(3LIB) (includes OpenSSL pkcs11 engine), libmd(3LIB), and Solaris kernel modules
Algorithm                                                 | Solaris 10 08/11 (U10) | Solaris 11 | Solaris 11.1
AES-ECB, AES-CBC, AES-CTR, AES-CFB128                     | X                      | X          | X
DES3-ECB, DES3-CBC, DES2-ECB, DES2-CBC, DES-ECB, DES-CBC  | X                      | X          | X
bignum Montgomery multiply (RSA, DSA)                     | X                      | X          | X
MD5, SHA-1, SHA-256, SHA-384, SHA-512                     | X                      | X          | X
SHA-224                                                   | -                      | -          | X
ARCFOUR (RC4)                                             | -                      | -          | X
Solaris OpenSSL T4 Engine Optimization
This table is for the Solaris OpenSSL built-in t4 engine.
Algorithms listed above are also available through the OpenSSL pkcs11 engine.
CLI: openssl(1openssl)
APIs: openssl(5), engine(3openssl), evp(3openssl), libcrypto crypto(3openssl)
Algorithm                                | Solaris 11 | Solaris 11 SRU2 | Solaris 11.1
AES-ECB, AES-CBC, AES-CTR, AES-CFB128    | X          | X               | X
DES3-ECB, DES3-CBC, DES-ECB, DES-CBC     | -          | -               | X
bignum Montgomery multiply (RSA, DSA)    | -          | -               | X
MD5, SHA-1, SHA-256, SHA-384, SHA-512    | -          | X               | X
SHA-224                                  | -          | -               | X
Source Code Availability
Solaris
Most of the T4 assembly code that calls the new T4 crypto instructions was written by Ferenc Rákóczi of the Solaris Security group, with assistance from others.
You can download the Solaris source for this and other parts of Solaris as a few zip files at the Oracle Download website.
The relevant source files are generally under the directories usr/src/common/crypto/{aes,arcfour,des,md5,modes,sha1,sha2}/sun4v/ and usr/src/common/bignum/sun4v/.
The Solaris 11 binary is available from the Oracle Solaris 11 download website.
OpenSSL t4 engine
The source for the OpenSSL t4 engine,
which is based on the Solaris source above, is viewable through the
OpenGrok source code browser in directory
src/components/openssl/openssl-1.0.0/engines/t4 .
You can download the source from the same website or through Mercurial source code management, hg(1).
Conclusion
Oracle Solaris with SPARC T4 provides a rich set of accelerated cryptographic and hash algorithms.
Using the latest update, Solaris 11.1, provides the best set of optimized algorithms,
but alternatives are often available, sometimes slightly slower,
for releases back to Solaris 10 08/11 (U10).
Reference
See also these earlier blogs.
SPARC T4 OpenSSL Engine by me, Dan Anderson (2011), discusses the OpenSSL t4 engine and reviews the SPARC T4 processor for the Solaris 11 release.
Exciting Crypto Advances with the T4 processor and Oracle Solaris 11 by Valerie Fenwick (2011) discusses crypto algorithms that were optimized for the T4 processor with the Solaris 11 FCS (11/11) and Solaris 10 08/11 (U10) releases.
T4 Crypto Cheat Sheet by Stefan Hinker (2012) discusses how to make T4 crypto optimization available to various consumers (such as SSH, Java, OpenSSL, Apache, etc.).
High Performance Security For Oracle Database and Fusion Middleware Applications using SPARC T4 (PDF, 2012) discusses SPARC T4 and its usage to optimize application security.
Configuring Oracle iPlanet WebServer / Oracle Traffic Director to use crypto accelerators on T4-1 servers by Meena Vyas (2012)