Notes on engines of 2001-09-24
==============================

This "description" (if one chooses to call it that) needed some major updating
so here goes. This update addresses a change being made at the same time to
OpenSSL, and it pretty much completely restructures the underlying mechanics of
the "ENGINE" code. So it serves a double purpose of being an "ENGINE internals
for masochists" document *and* a rather extensive commit log message. (I'd get
lynched for sticking all this in CHANGES.md or the commit mails :-).

ENGINE_TABLE underlies this restructuring, as described in the internal header
"eng_local.h", implemented in eng_table.c, and used in each of the "class"
files: tb_rsa.c, tb_dsa.c, etc.

However, "EVP_CIPHER" underlies the motivation and design of ENGINE_TABLE so
I'll mention a bit about that first. EVP_CIPHER (and most of this applies
equally to EVP_MD for digests) is both a "method" and an algorithm/mode
identifier that, in the current API, "lingers". These cipher description +
implementation structures can be defined or obtained directly by applications,
or can be loaded "en masse" into EVP storage so that they can be catalogued and
searched in various ways. For example, two ways of encrypting with the
"des_cbc" algorithm/mode pair are:

    (i) directly;
         const EVP_CIPHER *cipher = EVP_des_cbc();
         EVP_EncryptInit(&ctx, cipher, key, iv);
         [ ... use EVP_EncryptUpdate() and EVP_EncryptFinal() ...]

    (ii) indirectly;
         OpenSSL_add_all_ciphers();
         cipher = EVP_get_cipherbyname("des-cbc"); /* name as registered */
         EVP_EncryptInit(&ctx, cipher, key, iv);
         [ ... etc ... ]

The latter is more generally used because it also allows ciphers/digests to be
looked up based on other identifiers, which can be useful for automatic cipher
selection, e.g. in SSL/TLS, or by user-controllable configuration.

The important point about this is that EVP_CIPHER definitions and structures are
passed around with impunity and there is no safe way, without requiring massive
rewrites of many applications, to assume that EVP_CIPHERs can be reference
counted. Once an EVP_CIPHER is exposed to the caller, neither it nor anything
it comes from can "safely" be destroyed, unless of course the way of getting to
such ciphers is via entirely distinct API calls that didn't exist before.
However, existing API usage cannot be made to understand when an EVP_CIPHER
pointer, that has been passed to the caller, is no longer being used.

The other problem with the existing API w.r.t. hooking EVP_CIPHER support
into ENGINE is storage - the OBJ_NAME-based storage used by EVP to register
ciphers simultaneously registers cipher *types* and cipher *implementations* -
they are effectively the same thing, an "EVP_CIPHER" pointer. The problem with
hooking in ENGINEs is that multiple ENGINEs may implement the same ciphers. The
solution is necessarily that ENGINE-provided ciphers simply are not registered,
stored, or exposed to the caller in the same manner as existing ciphers. This is
especially necessary considering the fact ENGINE uses reference counts to allow
for cleanup, modularity, and DSO support - yet EVP_CIPHERs, as exposed to
callers in the current API, support no such controls.

Another sticking point for integrating cipher support into ENGINE is linkage.
Already there is a problem with the way ENGINE supports RSA, DSA, etc whereby
they are available *because* they're part of a giant ENGINE called "openssl".
I.e. all implementations *have* to come from an ENGINE, but we get round that by
having a giant ENGINE with all the software support encapsulated. This creates
linker hassles if nothing else - linking a 1-line application that calls 2 basic
RSA functions (e.g. "RSA_free(RSA_new());") will result in large quantities of
ENGINE code being linked in *and* because of that DSA, DH, and RAND also. If we
continue with this approach for EVP_CIPHER support (even if it *was* possible)
we would lose our ability to link selectively by selectively loading certain
implementations of certain functionality. Touching any part of any kind of
crypto would result in massive static linkage of everything else. So the
solution is to change the way ENGINE feeds existing "classes", i.e. how the
hooking to ENGINE works from RSA, DSA, DH, RAND, as well as adding new hooking
for EVP_CIPHER and EVP_MD.

The way this is now being done is by mostly reverting to how things used to
work prior to ENGINE :-). I.e. RSA now has an "RSA_METHOD" pointer again - this
was previously replaced by an "ENGINE" pointer and all RSA code that required
the RSA_METHOD would call ENGINE_get_RSA() each time on its ENGINE handle to
temporarily get and use the ENGINE's RSA implementation. Apart from being more
efficient, switching back to each RSA having an RSA_METHOD pointer also allows
us to conceivably operate with *no* ENGINE. As we'll see, this removes any need
for a fallback ENGINE that encapsulates default implementations - we can simply
have our RSA structure pointing its RSA_METHOD pointer to the software
implementation and have its ENGINE pointer set to NULL.
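
To make that arrangement a bit more concrete, here is a toy model of it. It is
emphatically *not* the real RSA or RSA_METHOD definitions; every `toy_`/`TOY_`
name below is invented for illustration, and the "maths" is a placeholder. The
only point is that the key object carries its own method pointer and that a
NULL ENGINE pointer is a perfectly valid state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real OpenSSL types. */
typedef struct toy_engine_st TOY_ENGINE;   /* opaque, never dereferenced here */

typedef struct toy_rsa_method_st {
    const char *name;
    int (*pub_op)(int input);              /* placeholder for the real maths */
} TOY_RSA_METHOD;

typedef struct toy_rsa_st {
    const TOY_RSA_METHOD *meth;            /* always valid, used directly */
    TOY_ENGINE *engine;                    /* NULL means "no ENGINE involved" */
} TOY_RSA;

int toy_software_pub_op(int input)
{
    return input + 1;                      /* stands in for a software RSA op */
}

static const TOY_RSA_METHOD toy_software_method = {
    "toy software RSA", toy_software_pub_op
};

/* No ENGINE registered: point at the software method, leave engine NULL. */
void toy_rsa_init(TOY_RSA *rsa)
{
    assert(rsa != NULL);
    rsa->meth = &toy_software_method;
    rsa->engine = NULL;
}

/* Operations go straight through the stored method pointer. */
int toy_rsa_pub_op(const TOY_RSA *rsa, int input)
{
    assert(rsa != NULL && rsa->meth != NULL);
    return rsa->meth->pub_op(input);
}
```

Note that nothing in this sketch needs a definition of the engine type at all,
which is the linkage benefit being argued for: the key code depends only on
its own method table.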

A look at the EVP_CIPHER hooking is most explanatory; the RSA, DSA (etc) cases
turn out to be degenerate forms of the same thing. The EVP storage of ciphers,
and the existing EVP API functions that return "software" implementations and
descriptions remain untouched. However, the storage takes on more meaning in
terms of "cipher description" and less meaning in terms of "implementation".
When an EVP_CIPHER_CTX is actually initialised with an EVP_CIPHER method and is
about to begin en/decryption, the hooking to ENGINE comes into play. What
happens is that cipher-specific ENGINE code is asked for an ENGINE pointer (a
functional reference) for any ENGINE that is registered to perform the algo/mode
that the provided EVP_CIPHER structure represents. Under normal circumstances,
that ENGINE code will return NULL because no ENGINEs will have had any cipher
implementations *registered*. As such, a NULL ENGINE pointer is stored in the
EVP_CIPHER_CTX context, and the EVP_CIPHER structure is left hooked into the
context and so is used as the implementation. Pretty much how things work now
except we'd have a redundant ENGINE pointer set to NULL and doing nothing.

Conversely, if an ENGINE *has* been registered to perform the algorithm/mode
combination represented by the provided EVP_CIPHER, then a functional reference
to that ENGINE will be returned to the EVP_CIPHER_CTX during initialisation.
That functional reference will be stored in the context (and released on
cleanup) - and having that reference provides a *safe* way to use an EVP_CIPHER
definition that is private to the ENGINE. I.e. the EVP_CIPHER provided by the
application will actually be replaced by an EVP_CIPHER from the registered
ENGINE - it will support the same algorithm/mode as the original but will be a
completely different implementation. Because this EVP_CIPHER isn't stored in the
EVP storage, nor is it returned to applications from traditional API functions,
there is no associated problem with it not having reference counts. And of
course, when one of these "private" cipher implementations is hooked into
EVP_CIPHER_CTX, it is done whilst the EVP_CIPHER_CTX holds a functional
reference to the ENGINE that owns it, thus the use of the ENGINE's EVP_CIPHER is
safe.
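
The two cases above (no ENGINE registered versus an ENGINE registered for the
nid) can be sketched roughly as follows. This is a simplified model under
stated assumptions, not the EVP code: all `mini_`/`MINI_` names are
hypothetical, there is at most one registered ENGINE instead of a table, and a
"functional reference" is reduced to a plain counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
typedef struct {
    int nid;                    /* identifies the algorithm/mode */
    const char *impl;           /* which implementation this is */
} MINI_CIPHER;

typedef struct {
    int funct_ref;              /* functional reference count */
    MINI_CIPHER cipher;         /* the ENGINE's private implementation */
} MINI_ENGINE;

typedef struct {
    const MINI_CIPHER *cipher;  /* implementation actually used */
    MINI_ENGINE *engine;        /* NULL, or a held functional reference */
} MINI_CIPHER_CTX;

/* Simplification: a single registration slot rather than a real table. */
static MINI_ENGINE *mini_registered = NULL;

void mini_register_engine(MINI_ENGINE *e)
{
    mini_registered = e;
}

/* Returns a functional reference to an ENGINE implementing nid, or NULL. */
MINI_ENGINE *mini_get_cipher_engine(int nid)
{
    if (mini_registered == NULL || mini_registered->cipher.nid != nid)
        return NULL;
    mini_registered->funct_ref++;
    return mini_registered;
}

void mini_ctx_init(MINI_CIPHER_CTX *ctx, const MINI_CIPHER *cipher)
{
    assert(ctx != NULL && cipher != NULL);
    ctx->engine = mini_get_cipher_engine(cipher->nid);
    /* Swap in the ENGINE's private cipher only while holding its reference. */
    ctx->cipher = (ctx->engine != NULL) ? &ctx->engine->cipher : cipher;
}

void mini_ctx_cleanup(MINI_CIPHER_CTX *ctx)
{
    assert(ctx != NULL);
    if (ctx->engine != NULL)
        ctx->engine->funct_ref--;   /* release the functional reference */
    ctx->engine = NULL;
    ctx->cipher = NULL;
}
```

The caller's cipher pointer is never handed the private structure; only the
context sees it, and only for as long as the context holds the reference.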

The "cipher-specific ENGINE code" I mentioned is implemented in tb_cipher.c but
in essence it is simply an instantiation of "ENGINE_TABLE" code for use by
EVP_CIPHER code. tb_digest.c is virtually identical but, of course, it is for
use by EVP_MD code. Ditto for tb_rsa.c, tb_dsa.c, etc. These instantiations of
ENGINE_TABLE essentially provide linker-separation of the classes so that even
if ENGINEs implement *all* possible algorithms, an application using only
EVP_CIPHER code will link at most code relating to EVP_CIPHER, tb_cipher.c, core
ENGINE code that is independent of class, and of course the ENGINE
implementation that the application loaded. It will *not* however link any
class-specific ENGINE code for digests, RSA, etc., nor will it bleed over into
other APIs, such as the RSA/DSA/etc library code.

ENGINE_TABLE is a little more complicated than may seem necessary but this is
mostly to avoid a lot of "init()"-thrashing on ENGINEs (that may have to load
DSOs, and other expensive setup that shouldn't be thrashed unnecessarily) *and*
to duplicate "default" behaviour. Basically an ENGINE_TABLE instantiation, for
example tb_cipher.c, implements a hash-table keyed by integer "nid" values.
These nids provide the uniqueness of an algorithm/mode - and each nid will hash
to a potentially NULL "ENGINE_PILE". An ENGINE_PILE is essentially a list of
pointers to ENGINEs that implement that particular 'nid'. Each "pile" uses some
caching tricks such that requests on that 'nid' will be cached and all future
requests will return immediately (well, at least with minimal operation) unless
a change is made to the pile, e.g. perhaps an ENGINE was unloaded. The reason is
that an application could have support for 10 ENGINEs statically linked
in, and the machine in question may not have any of the hardware those 10
ENGINEs support. If each of those ENGINEs has a "des_cbc" implementation, we
want to prevent every EVP_CIPHER_CTX setup from trying (and failing) to
initialise each of those 10 ENGINEs. Instead, the first such request will try to
do that and will either return (and cache) a NULL ENGINE pointer or will return
a functional reference to the first that successfully initialised. In the latter
case it will also cache an extra functional reference to the ENGINE as a
"default" for that 'nid'. The caching is acknowledged by an 'uptodate' variable
that is unset only if un/registration takes place on that pile, i.e. if
implementations of "des_cbc" are added or removed. This behaviour can be
tweaked; the ENGINE_TABLE_FLAG_NOINIT value can be passed to
ENGINE_set_table_flags(), in which case the only ENGINEs that tb_cipher.c will
try to initialise from the "pile" will be those that are already initialised
(i.e. it's simply an increment of the functional reference count, and no real
"initialisation" will take place).
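
The caching behaviour of a pile can be modelled in a few lines. What follows
is a sketch under heavy assumptions, not what eng_table.c does: the `demo_`
and `DEMO_` names are invented, a fixed array indexed by nid stands in for the
hash-table, unregistration and the NOINIT flag are left out, and an existing
default is simply kept when the pile changes.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_NID   8   /* toy limit; a real table would hash on the nid */
#define DEMO_PILE_MAX  4

typedef struct {
    int available;     /* would init() succeed, i.e. is the hardware there? */
    int init_calls;    /* how often init() was attempted */
    int funct_ref;     /* functional reference count */
} DEMO_ENGINE;

typedef struct {
    DEMO_ENGINE *engines[DEMO_PILE_MAX];
    int count;
    DEMO_ENGINE *funct;  /* cached default, holds its own functional ref */
    int uptodate;        /* cleared whenever the pile changes */
} DEMO_PILE;

static DEMO_PILE demo_table[DEMO_MAX_NID];  /* zeroed: all piles start empty */

/* Stand-in for an expensive init() that may have to load a DSO. */
int demo_engine_init(DEMO_ENGINE *e)
{
    e->init_calls++;
    if (!e->available)
        return 0;
    e->funct_ref++;
    return 1;
}

void demo_table_register(int nid, DEMO_ENGINE *e)
{
    DEMO_PILE *pile;

    assert(nid >= 0 && nid < DEMO_MAX_NID && e != NULL);
    pile = &demo_table[nid];
    assert(pile->count < DEMO_PILE_MAX);
    pile->engines[pile->count++] = e;
    pile->uptodate = 0;                /* invalidate the cached answer */
}

/* Returns a functional reference for nid, or NULL if nothing works. */
DEMO_ENGINE *demo_table_select(int nid)
{
    DEMO_PILE *pile;
    int i;

    assert(nid >= 0 && nid < DEMO_MAX_NID);
    pile = &demo_table[nid];
    if (!pile->uptodate) {
        /* Probe in order until one initialises; skipped if a default exists. */
        for (i = 0; i < pile->count && pile->funct == NULL; i++) {
            if (demo_engine_init(pile->engines[i]))
                pile->funct = pile->engines[i];  /* the cache's own ref */
        }
        pile->uptodate = 1;            /* remember the result, even NULL */
    }
    if (pile->funct != NULL)
        pile->funct->funct_ref++;      /* separate reference for the caller */
    return pile->funct;
}
```

With one unavailable and one available engine registered on the same nid,
repeated calls to `demo_table_select()` attempt each init() only once; later
calls just bump the reference count of the cached default.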

RSA, DSA, DH, and RAND all have their own ENGINE_TABLE code as well, and the
difference is that they all use an implicit 'nid' of 1. Whereas EVP_CIPHERs are
actually qualitatively different depending on 'nid' (the "des_cbc" EVP_CIPHER is
not an interoperable implementation of "aes_256_cbc"), RSA_METHODs are
necessarily interoperable and don't have different flavours, only different
implementations. In other words, the ENGINE_TABLE for RSA will either be empty,
or will have a single ENGINE_PILE hashed to by the 'nid' 1 and that pile
represents ENGINEs that implement the single "type" of RSA there is.

Cleanup - the registration and unregistration may pose questions about how
cleanup works with the ENGINE_PILE doing all this caching nonsense (i.e. when
the application or EVP_CIPHER code releases its last reference to an ENGINE, the
ENGINE_PILE code may still have references and thus those ENGINEs will stay
hooked in forever). The way this is handled is via "unregistration". With these
new ENGINE changes, an abstract ENGINE can be loaded and initialised, but that
is an algorithm-agnostic process. Even if initialised, it will not have
registered any of its implementations (to do so would link all class "table"
code despite the fact the application may use only ciphers, for example). This
is deliberately a distinct step. Moreover, registration and unregistration have
nothing to do with whether an ENGINE is *functional* or not (i.e. you can even
register an ENGINE and its implementations without it being operational, you may
not even have the drivers to make it operate). What actually happens with
respect to cleanup is managed inside eng_lib.c with the `engine_cleanup_***`
functions. These functions are internal-only and each part of ENGINE code that
could require cleanup will, upon performing its first allocation, register a
callback with the "engine_cleanup" code. The other part of this that makes it
tick is that the ENGINE_TABLE instantiations (tb_***.c) use NULL as their
initialised state. So if RSA code asks for an ENGINE and no ENGINE has
registered an implementation, the code will simply return NULL and the tb_rsa.c
state will be unchanged. Thus, no cleanup is required unless registration takes
place. ENGINE_cleanup() will simply iterate across a list of registered cleanup
callbacks calling each in turn, and will then internally delete its own storage
(a STACK). When a cleanup callback is next registered (e.g. if the cleanup() is
part of a graceful restart and the application wants to clean up all state then
start again), the internal STACK storage will be freshly allocated. This is much
the same as the situation in the ENGINE_TABLE instantiations ... NULL is the
initialised state, so only modification operations (not queries) will cause that
code to have to register a cleanup.
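
The "NULL is the initialised state" idea, together with lazily registered
cleanup callbacks, might look something like the following. Treat it as an
illustrative sketch only: the `sketch_`/`SKETCH_` names are made up, a linked
list stands in for the STACK (so callbacks run most-recent-first here), and
the "table" is a single heap-allocated int.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*sketch_cleanup_cb)(void);

typedef struct sketch_item_st {
    sketch_cleanup_cb cb;
    struct sketch_item_st *next;
} SKETCH_ITEM;

/* NULL is the initialised state: nothing exists until first registration. */
static SKETCH_ITEM *sketch_cleanup_list = NULL;

/* Called by a module the first time it allocates state of its own. */
int sketch_cleanup_add(sketch_cleanup_cb cb)
{
    SKETCH_ITEM *item;

    assert(cb != NULL);
    item = malloc(sizeof(*item));
    if (item == NULL)
        return 0;
    item->cb = cb;
    item->next = sketch_cleanup_list;
    sketch_cleanup_list = item;
    return 1;
}

/* Runs every callback once, then frees the list, returning to NULL. */
void sketch_cleanup(void)
{
    while (sketch_cleanup_list != NULL) {
        SKETCH_ITEM *item = sketch_cleanup_list;

        sketch_cleanup_list = item->next;
        item->cb();
        free(item);
    }
}

/* Example module: its table stays NULL until something is registered. */
static int *sketch_rsa_table = NULL;

static void sketch_rsa_table_cleanup(void)
{
    free(sketch_rsa_table);
    sketch_rsa_table = NULL;
}

/* A query never allocates, so it never needs to register a cleanup. */
int sketch_rsa_table_query(void)
{
    return sketch_rsa_table != NULL ? *sketch_rsa_table : 0;
}

/* A modification allocates on first use and registers its cleanup then. */
int sketch_rsa_table_register(int value)
{
    if (sketch_rsa_table == NULL) {
        sketch_rsa_table = malloc(sizeof(*sketch_rsa_table));
        if (sketch_rsa_table == NULL)
            return 0;
        if (!sketch_cleanup_add(sketch_rsa_table_cleanup)) {
            free(sketch_rsa_table);
            sketch_rsa_table = NULL;
            return 0;
        }
    }
    *sketch_rsa_table = value;
    return 1;
}
```

After `sketch_cleanup()` everything is back to NULL, so a later registration
starts from scratch, much like the graceful-restart case described above.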

What else? The bignum callbacks and associated ENGINE functions have been
removed for two obvious reasons; (i) there was no way to generalise them to the
mechanism now used by RSA/DSA/..., because there's no such thing as a BIGNUM
method, and (ii) because of (i), there was no meaningful way for library or
application code to automatically hook and use ENGINE supplied bignum functions
anyway. Also, ENGINE_cpy() has been removed (although an internal-only version
exists) - the idea of providing an ENGINE_cpy() function probably wasn't a good
one and now certainly doesn't make sense in any generalised way. Some of the
RSA, DSA, DH, and RAND functions that were fiddled with during the original
ENGINE changes have now, as a consequence, been reverted. This is because the
hooking of ENGINE is now automatic (and passive, it can internally use a NULL
ENGINE pointer to simply ignore ENGINE from then on).

Hell, that should be enough for now ... comments welcome.