GitHub Repository: freebsd/freebsd-src
Path: blob/main/sys/contrib/openzfs/module/os/linux/zfs/zio_crypt.c
// SPDX-License-Identifier: CDDL-1.0
/*
 * CDDL HEADER START
 *
 * This file and its contents are supplied under the terms of the
 * Common Development and Distribution License ("CDDL"), version 1.0.
 * You may only use this file in accordance with the terms of version
 * 1.0 of the CDDL.
 *
 * A full copy of the text of the CDDL should have accompanied this
 * source. A copy of the CDDL is also available via the Internet at
 * http://www.illumos.org/license/CDDL.
 *
 * CDDL HEADER END
 */

/*
 * Copyright (c) 2017, Datto, Inc. All rights reserved.
 */

#include <sys/zio_crypt.h>
#include <sys/dmu.h>
#include <sys/dmu_objset.h>
#include <sys/dnode.h>
#include <sys/fs/zfs.h>
#include <sys/zio.h>
#include <sys/zil.h>
#include <sys/sha2.h>
#include <sys/hkdf.h>
#include <sys/qat.h>

/*
 * This file is responsible for handling all of the details of generating
 * encryption parameters and performing encryption and authentication.
 *
 * BLOCK ENCRYPTION PARAMETERS:
 * Encryption / Authentication Algorithm Suite (crypt):
 * The encryption algorithm, mode, and key length we are going to use. We
 * currently support AES in either GCM or CCM modes with 128, 192, and 256 bit
 * keys. All authentication is currently done with SHA512-HMAC.
 *
 * Plaintext:
 * The unencrypted data that we want to encrypt.
 *
 * Initialization Vector (IV):
 * An initialization vector for the encryption algorithms. This is used to
 * "tweak" the encryption algorithms so that two blocks of the same data are
 * encrypted into different ciphertext outputs, thus obfuscating block patterns.
 * The supported encryption modes (AES-GCM and AES-CCM) require that an IV is
 * never reused with the same encryption key. This value is stored unencrypted
 * and must simply be provided to the decryption function. We use a 96 bit IV
 * (as recommended by NIST) for all block encryption. For non-dedup blocks we
 * derive the IV randomly. The first 64 bits of the IV are stored in the second
 * word of DVA[2] and the remaining 32 bits are stored in the upper 32 bits of
 * blk_fill. This is safe because encrypted blocks can't use the upper 32 bits
 * of blk_fill. We only encrypt level 0 blocks, which normally have a fill count
 * of 1. The only exception is for DMU_OT_DNODE objects, where the fill count of
 * level 0 blocks is the number of allocated dnodes in that block. The on-disk
 * format supports at most 2^15 slots per L0 dnode block, because the maximum
 * block size is 16MB (2^24). In either case, for level 0 blocks this number
 * will still be smaller than UINT32_MAX so it is safe to store the IV in the
 * top 32 bits of blk_fill, while leaving the bottom 32 bits of the fill count
 * for the dnode code.
 *
 * Master key:
 * This is the most important secret data of an encrypted dataset. It is used
 * along with the salt to generate the actual encryption keys via HKDF. We
 * do not use the master key to directly encrypt any data because there are
 * theoretical limits on how much data can actually be safely encrypted with
 * any encryption mode. The master key is stored encrypted on disk with the
 * user's wrapping key. Its length is determined by the encryption algorithm.
 * For details on how this is stored see the block comment in dsl_crypt.c.
 *
 * Salt:
 * Used as an input to the HKDF function, along with the master key. We use a
 * 64 bit salt, stored unencrypted in the first word of DVA[2]. Any given salt
 * can be used for encrypting many blocks, so we cache the current salt and the
 * associated derived key in zio_crypt_key_t so we do not need to derive it
 * again needlessly.
 *
 * Encryption Key:
 * A secret binary key, generated from an HKDF function used to encrypt and
 * decrypt data.
 *
 * Message Authentication Code (MAC):
 * The MAC is an output of authenticated encryption modes such as AES-GCM and
 * AES-CCM. Its purpose is to ensure that an attacker cannot modify encrypted
 * data on disk and return garbage to the application. Effectively, it is a
 * checksum that cannot be reproduced by an attacker. We store the MAC in the
 * second 128 bits of blk_cksum, leaving the first 128 bits for a truncated
 * regular checksum of the ciphertext which can be used for scrubbing.
 *
 * OBJECT AUTHENTICATION:
 * Some object types, such as DMU_OT_MASTER_NODE, cannot be encrypted because
 * they contain some info that always needs to be readable. To prevent this
 * data from being altered, we authenticate this data using SHA512-HMAC. This
 * will produce a MAC (similar to the one produced via encryption) which can
 * be used to verify the object was not modified. HMACs do not require key
 * rotation or IVs, so we can keep up to the full 3 copies of authenticated
 * data.
 *
 * ZIL ENCRYPTION:
 * ZIL blocks have their bp written to disk ahead of the associated data, so we
 * cannot store the MAC there as we normally do. For these blocks the MAC is
 * stored in the embedded checksum within the zil_chain_t header. The salt and
 * IV are generated for the block on bp allocation instead of at encryption
 * time. In addition, ZIL blocks have some pieces that must be left in plaintext
 * for claiming even though all of the sensitive user data still needs to be
 * encrypted. The function zio_crypt_init_uios_zil() handles parsing which
 * pieces of the block need to be encrypted. All data that is not encrypted is
 * authenticated using the AAD mechanisms that the supported encryption modes
 * provide for. In order to preserve the semantics of the ZIL for encrypted
 * datasets, the ZIL is not protected at the objset level as described below.
 *
 * DNODE ENCRYPTION:
 * Similarly to ZIL blocks, the core part of each dnode_phys_t needs to be left
 * in plaintext for scrubbing and claiming, but the bonus buffers might contain
 * sensitive user data. The function zio_crypt_init_uios_dnode() handles parsing
 * which pieces of the block need to be encrypted. For more details about
 * dnode authentication and encryption, see zio_crypt_init_uios_dnode().
 *
 * OBJECT SET AUTHENTICATION:
 * Up to this point, everything we have encrypted and authenticated has been
 * at level 0 (or -2 for the ZIL). If we did not do any further work the
 * on-disk format would be susceptible to attacks that deleted or rearranged
 * the order of level 0 blocks. Ideally, the cleanest solution would be to
 * maintain a tree of authentication MACs going up the bp tree. However, this
 * presents a problem for raw sends. Send files do not send information about
 * indirect blocks so there would be no convenient way to transfer the MACs and
 * they cannot be recalculated on the receive side without the master key which
 * would defeat one of the purposes of raw sends in the first place. Instead,
 * for the indirect levels of the bp tree, we use a regular SHA512 of the MACs
 * from the level below. We also include some portable fields from blk_prop such
 * as the lsize and compression algorithm to prevent the data from being
 * misinterpreted.
 *
 * At the objset level, we maintain 2 separate 256 bit MACs in the
 * objset_phys_t. The first one is "portable" and is the logical root of the
 * MAC tree maintained in the metadnode's bps. The second is "local" and is
 * used as the root MAC for the user accounting objects, which are also not
 * transferred via "zfs send". The portable MAC is sent in the DRR_BEGIN payload
 * of the send file. The useraccounting code ensures that the useraccounting
 * info is not present upon a receive, so the local MAC can simply be cleared
 * out at that time. For more info about objset_phys_t authentication, see
 * zio_crypt_do_objset_hmacs().
 *
 * CONSIDERATIONS FOR DEDUP:
 * In order for dedup to work, blocks that we want to dedup with one another
 * need to use the same IV and encryption key, so that they will have the same
 * ciphertext. Normally, one should never reuse an IV with the same encryption
 * key or else AES-GCM and AES-CCM can both actually leak the plaintext of both
 * blocks. In this case, however, since we are using the same plaintext as
 * well, all that we end up with is a duplicate of the original ciphertext we
 * already had. As a result, an attacker with read access to the raw disk will
 * be able to tell which blocks are the same but this information is given away
 * by dedup anyway. In order to get the same IVs and encryption keys for
 * equivalent blocks of data we use an HMAC of the plaintext. We use an HMAC
 * here so that a reproducible checksum of the plaintext is never available to
 * the attacker. The HMAC key is kept alongside the master key, encrypted on
 * disk. The first 64 bits of the HMAC are used in place of the random salt, and
 * the next 96 bits are used as the IV. As a result of this mechanism, dedup
 * will only work within a clone family since encrypted dedup requires use of
 * the same master and HMAC keys.
 */
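
/*
 * Illustrative sketch only, not part of the ZFS sources: one way to picture
 * the IV layout described above, where the first 64 bits of the IV occupy
 * one word and the remaining 32 bits share a 64 bit word with the fill
 * count. The sketch_* names and the two-field struct are hypothetical
 * stand-ins for the real blkptr_t fields and macros.
 */
```c
#include <stdint.h>
#include <string.h>

#define	SKETCH_IV_LEN	12	/* 96 bit IV */

/* Hypothetical stand-in for the two block pointer words carrying the IV. */
typedef struct sketch_bp {
	uint64_t sb_dva2_word1;	/* first 64 bits of the IV */
	uint64_t sb_fill;	/* upper 32 bits: IV tail, lower 32: fill */
} sketch_bp_t;

static void
sketch_store_iv(sketch_bp_t *bp, const uint8_t iv[SKETCH_IV_LEN])
{
	uint32_t iv2;

	memcpy(&bp->sb_dva2_word1, iv, sizeof (uint64_t));
	memcpy(&iv2, iv + sizeof (uint64_t), sizeof (uint32_t));

	/* Replace only the upper 32 bits so the fill count is preserved. */
	bp->sb_fill = (bp->sb_fill & 0xFFFFFFFFULL) | ((uint64_t)iv2 << 32);
}

static void
sketch_load_iv(const sketch_bp_t *bp, uint8_t iv[SKETCH_IV_LEN])
{
	uint32_t iv2 = (uint32_t)(bp->sb_fill >> 32);

	memcpy(iv, &bp->sb_dva2_word1, sizeof (uint64_t));
	memcpy(iv + sizeof (uint64_t), &iv2, sizeof (uint32_t));
}

static uint32_t
sketch_fill_count(const sketch_bp_t *bp)
{
	return ((uint32_t)(bp->sb_fill & 0xFFFFFFFFULL));
}
```
/*
 * Storing and then loading an IV through these helpers returns the same 12
 * bytes, and a fill count already present in the lower half is unchanged.
 */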

/*
 * After encrypting many blocks with the same key we may start to run up
 * against the theoretical limits of how much data can securely be encrypted
 * with a single key using the supported encryption modes. The most obvious
 * limitation is that our risk of generating 2 equivalent 96 bit IVs increases
 * the more IVs we generate (which both GCM and CCM modes strictly forbid).
 * This risk actually grows surprisingly quickly over time according to the
 * Birthday Problem. With a total IV space of 2^(96 bits), and assuming we have
 * generated n IVs with a cryptographically secure RNG, the approximate
 * probability p(n) of a collision is given as:
 *
 * p(n) ~= 1 - e^(-n*(n-1)/(2*(2^96)))
 *
 * [http://www.math.cornell.edu/~mec/2008-2009/TianyiZheng/Birthday.html]
 *
 * Assuming that we want to ensure that p(n) never goes over 1 / 1 trillion
 * we must not write more than 398,065,730 blocks with the same encryption key.
 * Therefore, we rotate our keys after 400,000,000 blocks have been written by
 * generating a new random 64 bit salt for our HKDF encryption key generation
 * function.
 */
#define	ZFS_KEY_MAX_SALT_USES_DEFAULT	400000000
#define	ZFS_CURRENT_MAX_SALT_USES	\
	(MIN(zfs_key_max_salt_uses, ZFS_KEY_MAX_SALT_USES_DEFAULT))
static unsigned long zfs_key_max_salt_uses = ZFS_KEY_MAX_SALT_USES_DEFAULT;
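
/*
 * Illustrative sketch, not part of the ZFS sources: a quick check of the
 * birthday-bound arithmetic in the comment above. For very small
 * probabilities 1 - e^(-x) is approximately x, so the collision risk for n
 * random 96 bit IVs is estimated here as n*(n-1) / (2 * 2^96). The helper
 * names are hypothetical and the code is meant for userspace experiments.
 */
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: small-probability estimate of an IV collision. */
static double
iv_collision_probability(uint64_t n)
{
	/* The linear approximation is only meaningful while x stays tiny. */
	assert(n >= 1 && n <= (1ULL << 40));

	/* 0x1p97 is the hexadecimal floating-point literal for 2 * 2^96. */
	return (((double)n * (double)(n - 1)) / 0x1p97);
}

/* Returns nonzero if n random 96 bit IVs stay within a 1e-12 budget. */
static int
iv_count_within_budget(uint64_t n)
{
	return (iv_collision_probability(n) <= 1e-12);
}
```
/*
 * Under these assumptions iv_count_within_budget(398065730) holds and
 * iv_count_within_budget(398065731) does not, which matches the block
 * count quoted in the comment above.
 */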

typedef struct blkptr_auth_buf {
	uint64_t bab_prop;			/* blk_prop - portable mask */
	uint8_t bab_mac[ZIO_DATA_MAC_LEN];	/* MAC from blk_cksum */
	uint64_t bab_pad;			/* reserved for future use */
} blkptr_auth_buf_t;

const zio_crypt_info_t zio_crypt_table[ZIO_CRYPT_FUNCTIONS] = {
	{"",			ZC_TYPE_NONE,	0,	"inherit"},
	{"",			ZC_TYPE_NONE,	0,	"on"},
	{"",			ZC_TYPE_NONE,	0,	"off"},
	{SUN_CKM_AES_CCM,	ZC_TYPE_CCM,	16,	"aes-128-ccm"},
	{SUN_CKM_AES_CCM,	ZC_TYPE_CCM,	24,	"aes-192-ccm"},
	{SUN_CKM_AES_CCM,	ZC_TYPE_CCM,	32,	"aes-256-ccm"},
	{SUN_CKM_AES_GCM,	ZC_TYPE_GCM,	16,	"aes-128-gcm"},
	{SUN_CKM_AES_GCM,	ZC_TYPE_GCM,	24,	"aes-192-gcm"},
	{SUN_CKM_AES_GCM,	ZC_TYPE_GCM,	32,	"aes-256-gcm"}
};

void
zio_crypt_key_destroy(zio_crypt_key_t *key)
{
	rw_destroy(&key->zk_salt_lock);

	/* free crypto templates */
	crypto_destroy_ctx_template(key->zk_current_tmpl);
	crypto_destroy_ctx_template(key->zk_hmac_tmpl);

	/* zero out sensitive data */
	memset(key, 0, sizeof (zio_crypt_key_t));
}

int
zio_crypt_key_init(uint64_t crypt, zio_crypt_key_t *key)
{
	int ret;
	crypto_mechanism_t mech = {0};
	uint_t keydata_len;

	ASSERT(key != NULL);
	ASSERT3U(crypt, <, ZIO_CRYPT_FUNCTIONS);

/*
 * Workaround for GCC 12+ with UBSan enabled deficiencies.
 *
 * GCC 12+ invoked with -fsanitize=undefined incorrectly reports the code
 * below as violating -Warray-bounds
 */
#if defined(__GNUC__) && !defined(__clang__) && \
	((!defined(_KERNEL) && defined(ZFS_UBSAN_ENABLED)) || \
	defined(CONFIG_UBSAN))
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Warray-bounds"
#endif
	keydata_len = zio_crypt_table[crypt].ci_keylen;
#if defined(__GNUC__) && !defined(__clang__) && \
	((!defined(_KERNEL) && defined(ZFS_UBSAN_ENABLED)) || \
	defined(CONFIG_UBSAN))
#pragma GCC diagnostic pop
#endif
	memset(key, 0, sizeof (zio_crypt_key_t));
	rw_init(&key->zk_salt_lock, NULL, RW_DEFAULT, NULL);

	/* fill keydata buffers and salt with random data */
	ret = random_get_bytes((uint8_t *)&key->zk_guid, sizeof (uint64_t));
	if (ret != 0)
		goto error;

	ret = random_get_bytes(key->zk_master_keydata, keydata_len);
	if (ret != 0)
		goto error;

	ret = random_get_bytes(key->zk_hmac_keydata, SHA512_HMAC_KEYLEN);
	if (ret != 0)
		goto error;

	ret = random_get_bytes(key->zk_salt, ZIO_DATA_SALT_LEN);
	if (ret != 0)
		goto error;

	/* derive the current key from the master key */
	ret = hkdf_sha512(key->zk_master_keydata, keydata_len, NULL, 0,
	    key->zk_salt, ZIO_DATA_SALT_LEN, key->zk_current_keydata,
	    keydata_len);
	if (ret != 0)
		goto error;

	/* initialize keys for the ICP */
	key->zk_current_key.ck_data = key->zk_current_keydata;
	key->zk_current_key.ck_length = CRYPTO_BYTES2BITS(keydata_len);

	/* point at the HMAC key bytes, as zio_crypt_key_unwrap() does */
	key->zk_hmac_key.ck_data = key->zk_hmac_keydata;
	key->zk_hmac_key.ck_length = CRYPTO_BYTES2BITS(SHA512_HMAC_KEYLEN);

	/*
	 * Initialize the crypto templates. It's ok if this fails because
	 * this is just an optimization.
	 */
	mech.cm_type = crypto_mech2id(zio_crypt_table[crypt].ci_mechname);
	ret = crypto_create_ctx_template(&mech, &key->zk_current_key,
	    &key->zk_current_tmpl);
	if (ret != CRYPTO_SUCCESS)
		key->zk_current_tmpl = NULL;

	mech.cm_type = crypto_mech2id(SUN_CKM_SHA512_HMAC);
	ret = crypto_create_ctx_template(&mech, &key->zk_hmac_key,
	    &key->zk_hmac_tmpl);
	if (ret != CRYPTO_SUCCESS)
		key->zk_hmac_tmpl = NULL;

	key->zk_crypt = crypt;
	key->zk_version = ZIO_CRYPT_KEY_CURRENT_VERSION;
	key->zk_salt_count = 0;

	return (0);

error:
	zio_crypt_key_destroy(key);
	return (ret);
}

static int
zio_crypt_key_change_salt(zio_crypt_key_t *key)
{
	int ret = 0;
	uint8_t salt[ZIO_DATA_SALT_LEN];
	crypto_mechanism_t mech = {0};
	uint_t keydata_len = zio_crypt_table[key->zk_crypt].ci_keylen;

	/* generate a new salt */
	ret = random_get_bytes(salt, ZIO_DATA_SALT_LEN);
	if (ret != 0)
		goto error;

	rw_enter(&key->zk_salt_lock, RW_WRITER);

	/* someone beat us to the salt rotation, just unlock and return */
	if (key->zk_salt_count < ZFS_CURRENT_MAX_SALT_USES)
		goto out_unlock;

	/* derive the current key from the master key and the new salt */
	ret = hkdf_sha512(key->zk_master_keydata, keydata_len, NULL, 0,
	    salt, ZIO_DATA_SALT_LEN, key->zk_current_keydata, keydata_len);
	if (ret != 0)
		goto out_unlock;

	/* assign the salt and reset the usage count */
	memcpy(key->zk_salt, salt, ZIO_DATA_SALT_LEN);
	key->zk_salt_count = 0;

	/*
	 * Destroy the old context template and create the new one. The
	 * mechanism type is set first, as in zio_crypt_key_init(), so that
	 * mech is not passed along uninitialized.
	 */
	crypto_destroy_ctx_template(key->zk_current_tmpl);
	mech.cm_type =
	    crypto_mech2id(zio_crypt_table[key->zk_crypt].ci_mechname);
	ret = crypto_create_ctx_template(&mech, &key->zk_current_key,
	    &key->zk_current_tmpl);
	if (ret != CRYPTO_SUCCESS)
		key->zk_current_tmpl = NULL;

	rw_exit(&key->zk_salt_lock);

	return (0);

out_unlock:
	rw_exit(&key->zk_salt_lock);
error:
	return (ret);
}

/* See comment above zfs_key_max_salt_uses definition for details */
int
zio_crypt_key_get_salt(zio_crypt_key_t *key, uint8_t *salt)
{
	int ret;
	boolean_t salt_change;

	rw_enter(&key->zk_salt_lock, RW_READER);

	memcpy(salt, key->zk_salt, ZIO_DATA_SALT_LEN);
	salt_change = (atomic_inc_64_nv(&key->zk_salt_count) >=
	    ZFS_CURRENT_MAX_SALT_USES);

	rw_exit(&key->zk_salt_lock);

	if (salt_change) {
		ret = zio_crypt_key_change_salt(key);
		if (ret != 0)
			goto error;
	}

	return (0);

error:
	return (ret);
}
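
/*
 * Illustrative sketch, not part of the ZFS sources: a single-threaded model
 * of the use counter that drives salt rotation in the two functions above.
 * It assumes no concurrency, so the lock and the atomic increment are left
 * out, and a simple counter stands in for the random salt and the HKDF key
 * derivation. All sketch_* names are hypothetical.
 */
```c
#include <stdint.h>

/* Hypothetical, single-threaded stand-in for the salt state of a key. */
typedef struct sketch_key {
	uint64_t sk_salt_id;	/* stands in for the 64 bit salt */
	uint64_t sk_salt_count;	/* blocks handed the current salt */
	uint64_t sk_max_uses;	/* rotation threshold */
} sketch_key_t;

/* Mirrors the re-check before rotating: do nothing if already rotated. */
static void
sketch_change_salt(sketch_key_t *key)
{
	if (key->sk_salt_count < key->sk_max_uses)
		return;

	key->sk_salt_id++;	/* the real code draws a fresh random salt */
	key->sk_salt_count = 0;
}

/* Hand out the current salt, then rotate once the threshold is reached. */
static uint64_t
sketch_get_salt(sketch_key_t *key)
{
	uint64_t salt = key->sk_salt_id;

	if (++key->sk_salt_count >= key->sk_max_uses)
		sketch_change_salt(key);

	return (salt);
}
```
/*
 * With a threshold of 3, the first three calls to sketch_get_salt() return
 * the same salt and the fourth call returns the next one.
 */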

/*
 * This function handles all encryption and decryption in zfs. When
 * encrypting it expects puio to reference the plaintext and cuio to
 * reference the ciphertext. cuio must have enough space for the
 * ciphertext + room for a MAC. datalen should be the length of the
 * plaintext / ciphertext alone.
 */
static int
zio_do_crypt_uio(boolean_t encrypt, uint64_t crypt, crypto_key_t *key,
    crypto_ctx_template_t tmpl, uint8_t *ivbuf, uint_t datalen,
    zfs_uio_t *puio, zfs_uio_t *cuio, uint8_t *authbuf, uint_t auth_len)
{
	int ret;
	crypto_data_t plaindata, cipherdata;
	CK_AES_CCM_PARAMS ccmp;
	CK_AES_GCM_PARAMS gcmp;
	crypto_mechanism_t mech;
	zio_crypt_info_t crypt_info;
	uint_t plain_full_len, maclen;

	ASSERT3U(crypt, <, ZIO_CRYPT_FUNCTIONS);

	/* lookup the encryption info */
	crypt_info = zio_crypt_table[crypt];

	/* the mac will always be the last iovec_t in the cipher uio */
	maclen = cuio->uio_iov[cuio->uio_iovcnt - 1].iov_len;

	ASSERT(maclen <= ZIO_DATA_MAC_LEN);

	/* setup encryption mechanism (same as crypt) */
	mech.cm_type = crypto_mech2id(crypt_info.ci_mechname);

	/*
	 * Strangely, the ICP requires that plain_full_len must include
	 * the MAC length when decrypting, even though the UIO does not
	 * need to have the extra space allocated.
	 */
	if (encrypt) {
		plain_full_len = datalen;
	} else {
		plain_full_len = datalen + maclen;
	}

	/*
	 * setup encryption params (currently only AES CCM and AES GCM
	 * are supported)
	 */
	if (crypt_info.ci_crypt_type == ZC_TYPE_CCM) {
		ccmp.ulNonceSize = ZIO_DATA_IV_LEN;
		ccmp.ulAuthDataSize = auth_len;
		ccmp.authData = authbuf;
		ccmp.ulMACSize = maclen;
		ccmp.nonce = ivbuf;
		ccmp.ulDataSize = plain_full_len;

		mech.cm_param = (char *)(&ccmp);
		mech.cm_param_len = sizeof (CK_AES_CCM_PARAMS);
	} else {
		gcmp.ulIvLen = ZIO_DATA_IV_LEN;
		gcmp.ulIvBits = CRYPTO_BYTES2BITS(ZIO_DATA_IV_LEN);
		gcmp.ulAADLen = auth_len;
		gcmp.pAAD = authbuf;
		gcmp.ulTagBits = CRYPTO_BYTES2BITS(maclen);
		gcmp.pIv = ivbuf;

		mech.cm_param = (char *)(&gcmp);
		mech.cm_param_len = sizeof (CK_AES_GCM_PARAMS);
	}

	/* populate the cipher and plain data structs. */
	plaindata.cd_format = CRYPTO_DATA_UIO;
	plaindata.cd_offset = 0;
	plaindata.cd_uio = puio;
	plaindata.cd_length = plain_full_len;

	cipherdata.cd_format = CRYPTO_DATA_UIO;
	cipherdata.cd_offset = 0;
	cipherdata.cd_uio = cuio;
	cipherdata.cd_length = datalen + maclen;

	/* perform the actual encryption */
	if (encrypt) {
		ret = crypto_encrypt(&mech, &plaindata, key, tmpl, &cipherdata);
		if (ret != CRYPTO_SUCCESS) {
			ret = SET_ERROR(EIO);
			goto error;
		}
	} else {
		ret = crypto_decrypt(&mech, &cipherdata, key, tmpl, &plaindata);
		if (ret != CRYPTO_SUCCESS) {
			ASSERT3U(ret, ==, CRYPTO_INVALID_MAC);
			ret = SET_ERROR(ECKSUM);
			goto error;
		}
	}

	return (0);

error:
	return (ret);
}

int
zio_crypt_key_wrap(crypto_key_t *cwkey, zio_crypt_key_t *key, uint8_t *iv,
    uint8_t *mac, uint8_t *keydata_out, uint8_t *hmac_keydata_out)
{
	int ret;
	zfs_uio_t puio, cuio;
	uint64_t aad[3];
	iovec_t plain_iovecs[2], cipher_iovecs[3];
	uint64_t crypt = key->zk_crypt;
	uint_t enc_len, keydata_len, aad_len;

	ASSERT3U(crypt, <, ZIO_CRYPT_FUNCTIONS);

	keydata_len = zio_crypt_table[crypt].ci_keylen;

	/* generate iv for wrapping the master and hmac key */
	ret = random_get_pseudo_bytes(iv, WRAPPING_IV_LEN);
	if (ret != 0)
		goto error;

	/* initialize zfs_uio_ts */
	plain_iovecs[0].iov_base = key->zk_master_keydata;
	plain_iovecs[0].iov_len = keydata_len;
	plain_iovecs[1].iov_base = key->zk_hmac_keydata;
	plain_iovecs[1].iov_len = SHA512_HMAC_KEYLEN;

	cipher_iovecs[0].iov_base = keydata_out;
	cipher_iovecs[0].iov_len = keydata_len;
	cipher_iovecs[1].iov_base = hmac_keydata_out;
	cipher_iovecs[1].iov_len = SHA512_HMAC_KEYLEN;
	cipher_iovecs[2].iov_base = mac;
	cipher_iovecs[2].iov_len = WRAPPING_MAC_LEN;

	/*
	 * Although we don't support writing to the old format, we do
	 * support rewrapping the key so that the user can move and
	 * quarantine datasets on the old format.
	 */
	if (key->zk_version == 0) {
		aad_len = sizeof (uint64_t);
		aad[0] = LE_64(key->zk_guid);
	} else {
		ASSERT3U(key->zk_version, ==, ZIO_CRYPT_KEY_CURRENT_VERSION);
		aad_len = sizeof (uint64_t) * 3;
		aad[0] = LE_64(key->zk_guid);
		aad[1] = LE_64(crypt);
		aad[2] = LE_64(key->zk_version);
	}

	enc_len = zio_crypt_table[crypt].ci_keylen + SHA512_HMAC_KEYLEN;
	puio.uio_iov = plain_iovecs;
	puio.uio_iovcnt = 2;
	puio.uio_segflg = UIO_SYSSPACE;
	cuio.uio_iov = cipher_iovecs;
	cuio.uio_iovcnt = 3;
	cuio.uio_segflg = UIO_SYSSPACE;

	/* encrypt the keys and store the resulting ciphertext and mac */
	ret = zio_do_crypt_uio(B_TRUE, crypt, cwkey, NULL, iv, enc_len,
	    &puio, &cuio, (uint8_t *)aad, aad_len);
	if (ret != 0)
		goto error;

	return (0);

error:
	return (ret);
}

int
zio_crypt_key_unwrap(crypto_key_t *cwkey, uint64_t crypt, uint64_t version,
    uint64_t guid, uint8_t *keydata, uint8_t *hmac_keydata, uint8_t *iv,
    uint8_t *mac, zio_crypt_key_t *key)
{
	crypto_mechanism_t mech;
	zfs_uio_t puio, cuio;
	uint64_t aad[3];
	iovec_t plain_iovecs[2], cipher_iovecs[3];
	uint_t enc_len, keydata_len, aad_len;
	int ret;

	ASSERT3U(crypt, <, ZIO_CRYPT_FUNCTIONS);

	rw_init(&key->zk_salt_lock, NULL, RW_DEFAULT, NULL);

	keydata_len = zio_crypt_table[crypt].ci_keylen;

	/* initialize zfs_uio_ts */
	plain_iovecs[0].iov_base = key->zk_master_keydata;
	plain_iovecs[0].iov_len = keydata_len;
	plain_iovecs[1].iov_base = key->zk_hmac_keydata;
	plain_iovecs[1].iov_len = SHA512_HMAC_KEYLEN;

	cipher_iovecs[0].iov_base = keydata;
	cipher_iovecs[0].iov_len = keydata_len;
	cipher_iovecs[1].iov_base = hmac_keydata;
	cipher_iovecs[1].iov_len = SHA512_HMAC_KEYLEN;
	cipher_iovecs[2].iov_base = mac;
	cipher_iovecs[2].iov_len = WRAPPING_MAC_LEN;

	if (version == 0) {
		aad_len = sizeof (uint64_t);
		aad[0] = LE_64(guid);
	} else {
		ASSERT3U(version, ==, ZIO_CRYPT_KEY_CURRENT_VERSION);
		aad_len = sizeof (uint64_t) * 3;
		aad[0] = LE_64(guid);
		aad[1] = LE_64(crypt);
		aad[2] = LE_64(version);
	}

	enc_len = keydata_len + SHA512_HMAC_KEYLEN;
	puio.uio_iov = plain_iovecs;
	puio.uio_segflg = UIO_SYSSPACE;
	puio.uio_iovcnt = 2;
	cuio.uio_iov = cipher_iovecs;
	cuio.uio_iovcnt = 3;
	cuio.uio_segflg = UIO_SYSSPACE;

	/* decrypt the keys and store the result in the output buffers */
	ret = zio_do_crypt_uio(B_FALSE, crypt, cwkey, NULL, iv, enc_len,
	    &puio, &cuio, (uint8_t *)aad, aad_len);
	if (ret != 0)
		goto error;

	/* generate a fresh salt */
	ret = random_get_bytes(key->zk_salt, ZIO_DATA_SALT_LEN);
	if (ret != 0)
		goto error;

	/* derive the current key from the master key */
	ret = hkdf_sha512(key->zk_master_keydata, keydata_len, NULL, 0,
	    key->zk_salt, ZIO_DATA_SALT_LEN, key->zk_current_keydata,
	    keydata_len);
	if (ret != 0)
		goto error;

	/* initialize keys for ICP */
	key->zk_current_key.ck_data = key->zk_current_keydata;
	key->zk_current_key.ck_length = CRYPTO_BYTES2BITS(keydata_len);

	key->zk_hmac_key.ck_data = key->zk_hmac_keydata;
	key->zk_hmac_key.ck_length = CRYPTO_BYTES2BITS(SHA512_HMAC_KEYLEN);

	/*
	 * Initialize the crypto templates. It's ok if this fails because
	 * this is just an optimization.
	 */
	mech.cm_type = crypto_mech2id(zio_crypt_table[crypt].ci_mechname);
	ret = crypto_create_ctx_template(&mech, &key->zk_current_key,
	    &key->zk_current_tmpl);
	if (ret != CRYPTO_SUCCESS)
		key->zk_current_tmpl = NULL;

	mech.cm_type = crypto_mech2id(SUN_CKM_SHA512_HMAC);
	ret = crypto_create_ctx_template(&mech, &key->zk_hmac_key,
	    &key->zk_hmac_tmpl);
	if (ret != CRYPTO_SUCCESS)
		key->zk_hmac_tmpl = NULL;

	key->zk_crypt = crypt;
	key->zk_version = version;
	key->zk_guid = guid;
	key->zk_salt_count = 0;

	return (0);

error:
	zio_crypt_key_destroy(key);
	return (ret);
}

int
zio_crypt_generate_iv(uint8_t *ivbuf)
{
	int ret;

	/* randomly generate the IV */
	ret = random_get_pseudo_bytes(ivbuf, ZIO_DATA_IV_LEN);
	if (ret != 0)
		goto error;

	return (0);

error:
	memset(ivbuf, 0, ZIO_DATA_IV_LEN);
	return (ret);
}

int
zio_crypt_do_hmac(zio_crypt_key_t *key, uint8_t *data, uint_t datalen,
    uint8_t *digestbuf, uint_t digestlen)
{
	int ret;
	crypto_mechanism_t mech;
	crypto_data_t in_data, digest_data;
	uint8_t raw_digestbuf[SHA512_DIGEST_LENGTH];

	ASSERT3U(digestlen, <=, SHA512_DIGEST_LENGTH);

	/* initialize sha512-hmac mechanism and crypto data */
	mech.cm_type = crypto_mech2id(SUN_CKM_SHA512_HMAC);
	mech.cm_param = NULL;
	mech.cm_param_len = 0;

	/* initialize the crypto data */
	in_data.cd_format = CRYPTO_DATA_RAW;
	in_data.cd_offset = 0;
	in_data.cd_length = datalen;
	in_data.cd_raw.iov_base = (char *)data;
	in_data.cd_raw.iov_len = in_data.cd_length;

	digest_data.cd_format = CRYPTO_DATA_RAW;
	digest_data.cd_offset = 0;
	digest_data.cd_length = SHA512_DIGEST_LENGTH;
	digest_data.cd_raw.iov_base = (char *)raw_digestbuf;
	digest_data.cd_raw.iov_len = digest_data.cd_length;

	/* generate the hmac */
	ret = crypto_mac(&mech, &in_data, &key->zk_hmac_key, key->zk_hmac_tmpl,
	    &digest_data);
	if (ret != CRYPTO_SUCCESS) {
		ret = SET_ERROR(EIO);
		goto error;
	}

	memcpy(digestbuf, raw_digestbuf, digestlen);

	return (0);

error:
	memset(digestbuf, 0, digestlen);
	return (ret);
}

int
zio_crypt_generate_iv_salt_dedup(zio_crypt_key_t *key, uint8_t *data,
    uint_t datalen, uint8_t *ivbuf, uint8_t *salt)
{
	int ret;
	uint8_t digestbuf[SHA512_DIGEST_LENGTH];

	ret = zio_crypt_do_hmac(key, data, datalen,
	    digestbuf, SHA512_DIGEST_LENGTH);
	if (ret != 0)
		return (ret);

	memcpy(salt, digestbuf, ZIO_DATA_SALT_LEN);
	memcpy(ivbuf, digestbuf + ZIO_DATA_SALT_LEN, ZIO_DATA_IV_LEN);

	return (0);
}

/*
 * The following functions are used to encode and decode encryption parameters
 * into blkptr_t and zil_header_t. The ICP wants to use these parameters as
 * byte strings, which normally means that these strings would not need to deal
 * with byteswapping at all. However, both blkptr_t and zil_header_t may be
 * byteswapped by lower layers and so we must "undo" that byteswap here upon
 * decoding and encoding in a non-native byteorder. These functions require
 * that the byteorder bit is correct before being called.
 */
void
zio_crypt_encode_params_bp(blkptr_t *bp, uint8_t *salt, uint8_t *iv)
{
	uint64_t val64;
	uint32_t val32;

	ASSERT(BP_IS_ENCRYPTED(bp));

	if (!BP_SHOULD_BYTESWAP(bp)) {
		memcpy(&bp->blk_dva[2].dva_word[0], salt, sizeof (uint64_t));
		memcpy(&bp->blk_dva[2].dva_word[1], iv, sizeof (uint64_t));
		memcpy(&val32, iv + sizeof (uint64_t), sizeof (uint32_t));
		BP_SET_IV2(bp, val32);
	} else {
		memcpy(&val64, salt, sizeof (uint64_t));
		bp->blk_dva[2].dva_word[0] = BSWAP_64(val64);

		memcpy(&val64, iv, sizeof (uint64_t));
		bp->blk_dva[2].dva_word[1] = BSWAP_64(val64);

		memcpy(&val32, iv + sizeof (uint64_t), sizeof (uint32_t));
		BP_SET_IV2(bp, BSWAP_32(val32));
	}
}

void
zio_crypt_decode_params_bp(const blkptr_t *bp, uint8_t *salt, uint8_t *iv)
{
	uint64_t val64;
	uint32_t val32;

	ASSERT(BP_IS_PROTECTED(bp));

	/* for convenience, so callers don't need to check */
	if (BP_IS_AUTHENTICATED(bp)) {
		memset(salt, 0, ZIO_DATA_SALT_LEN);
		memset(iv, 0, ZIO_DATA_IV_LEN);
		return;
	}

	if (!BP_SHOULD_BYTESWAP(bp)) {
		memcpy(salt, &bp->blk_dva[2].dva_word[0], sizeof (uint64_t));
		memcpy(iv, &bp->blk_dva[2].dva_word[1], sizeof (uint64_t));

		val32 = (uint32_t)BP_GET_IV2(bp);
		memcpy(iv + sizeof (uint64_t), &val32, sizeof (uint32_t));
	} else {
		val64 = BSWAP_64(bp->blk_dva[2].dva_word[0]);
		memcpy(salt, &val64, sizeof (uint64_t));

		val64 = BSWAP_64(bp->blk_dva[2].dva_word[1]);
		memcpy(iv, &val64, sizeof (uint64_t));

		val32 = BSWAP_32((uint32_t)BP_GET_IV2(bp));
		memcpy(iv + sizeof (uint64_t), &val32, sizeof (uint32_t));
	}
}
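
/*
 * Illustrative sketch, not part of the ZFS sources: a userspace model of
 * the "undo the byteswap" idea used by the encode and decode functions
 * above, reduced to a single 64 bit word. The sketch_* names and the
 * should_byteswap flag are hypothetical stand-ins for the real macros.
 */
```c
#include <stdint.h>
#include <string.h>

/* Portable 64 bit byte reversal. */
static uint64_t
sketch_bswap64(uint64_t v)
{
	v = ((v & 0x00FF00FF00FF00FFULL) << 8) |
	    ((v >> 8) & 0x00FF00FF00FF00FFULL);
	v = ((v & 0x0000FFFF0000FFFFULL) << 16) |
	    ((v >> 16) & 0x0000FFFF0000FFFFULL);
	return ((v << 32) | (v >> 32));
}

/* Store an 8 byte string in a word that a lower layer may byteswap. */
static void
sketch_encode_word(uint64_t *word, const uint8_t bytes[8],
    int should_byteswap)
{
	uint64_t v;

	memcpy(&v, bytes, sizeof (v));
	*word = should_byteswap ? sketch_bswap64(v) : v;
}

/* Recover the original 8 byte string from such a word. */
static void
sketch_decode_word(const uint64_t *word, uint8_t bytes[8],
    int should_byteswap)
{
	uint64_t v = should_byteswap ? sketch_bswap64(*word) : *word;

	memcpy(bytes, &v, sizeof (v));
}
```
/*
 * Encoding followed by decoding with the same flag returns the original
 * bytes. A word encoded with the flag set holds the original byte string
 * again after one more byteswap, which is the case the comment above is
 * concerned with.
 */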

void
zio_crypt_encode_mac_bp(blkptr_t *bp, uint8_t *mac)
{
	uint64_t val64;

	ASSERT(BP_USES_CRYPT(bp));
	ASSERT3U(BP_GET_TYPE(bp), !=, DMU_OT_OBJSET);

	if (!BP_SHOULD_BYTESWAP(bp)) {
		memcpy(&bp->blk_cksum.zc_word[2], mac, sizeof (uint64_t));
		memcpy(&bp->blk_cksum.zc_word[3], mac + sizeof (uint64_t),
		    sizeof (uint64_t));
	} else {
		memcpy(&val64, mac, sizeof (uint64_t));
		bp->blk_cksum.zc_word[2] = BSWAP_64(val64);

		memcpy(&val64, mac + sizeof (uint64_t), sizeof (uint64_t));
		bp->blk_cksum.zc_word[3] = BSWAP_64(val64);
	}
}

void
zio_crypt_decode_mac_bp(const blkptr_t *bp, uint8_t *mac)
{
	uint64_t val64;

	ASSERT(BP_USES_CRYPT(bp) || BP_IS_HOLE(bp));

	/* for convenience, so callers don't need to check */
	if (BP_GET_TYPE(bp) == DMU_OT_OBJSET) {
		memset(mac, 0, ZIO_DATA_MAC_LEN);
		return;
	}

	if (!BP_SHOULD_BYTESWAP(bp)) {
		memcpy(mac, &bp->blk_cksum.zc_word[2], sizeof (uint64_t));
		memcpy(mac + sizeof (uint64_t), &bp->blk_cksum.zc_word[3],
		    sizeof (uint64_t));
	} else {
		val64 = BSWAP_64(bp->blk_cksum.zc_word[2]);
		memcpy(mac, &val64, sizeof (uint64_t));

		val64 = BSWAP_64(bp->blk_cksum.zc_word[3]);
		memcpy(mac + sizeof (uint64_t), &val64, sizeof (uint64_t));
	}
}

void
zio_crypt_encode_mac_zil(void *data, uint8_t *mac)
{
	zil_chain_t *zilc = data;

	memcpy(&zilc->zc_eck.zec_cksum.zc_word[2], mac, sizeof (uint64_t));
	memcpy(&zilc->zc_eck.zec_cksum.zc_word[3], mac + sizeof (uint64_t),
	    sizeof (uint64_t));
}

void
zio_crypt_decode_mac_zil(const void *data, uint8_t *mac)
{
	/*
	 * The ZIL MAC is embedded in the block it protects, which will
	 * not have been byteswapped by the time this function has been called.
	 * As a result, we don't need to worry about byteswapping the MAC.
	 */
	const zil_chain_t *zilc = data;

	memcpy(mac, &zilc->zc_eck.zec_cksum.zc_word[2], sizeof (uint64_t));
	memcpy(mac + sizeof (uint64_t), &zilc->zc_eck.zec_cksum.zc_word[3],
	    sizeof (uint64_t));
}

/*
 * This routine takes a block of dnodes (src_abd) and copies only the bonus
 * buffers to the same offsets in the dst buffer. datalen should be the size
 * of both the src_abd and the dst buffer (not just the length of the bonus
 * buffers).
 */
void
zio_crypt_copy_dnode_bonus(abd_t *src_abd, uint8_t *dst, uint_t datalen)
{
	uint_t i, max_dnp = datalen >> DNODE_SHIFT;
	uint8_t *src;
	dnode_phys_t *dnp, *sdnp, *ddnp;

	src = abd_borrow_buf_copy(src_abd, datalen);

	sdnp = (dnode_phys_t *)src;
	ddnp = (dnode_phys_t *)dst;

	for (i = 0; i < max_dnp; i += sdnp[i].dn_extra_slots + 1) {
		dnp = &sdnp[i];
		if (dnp->dn_type != DMU_OT_NONE &&
		    DMU_OT_IS_ENCRYPTED(dnp->dn_bonustype) &&
		    dnp->dn_bonuslen != 0) {
			memcpy(DN_BONUS(&ddnp[i]), DN_BONUS(dnp),
			    DN_MAX_BONUS_LEN(dnp));
		}
	}

	abd_return_buf(src_abd, src, datalen);
}

/*
 * This function decides which fields from blk_prop are included in
 * the various on-disk MAC algorithms.
 */
static void
zio_crypt_bp_zero_nonportable_blkprop(blkptr_t *bp, uint64_t version)
{
	/*
	 * Version 0 did not properly zero out all non-portable fields
	 * as it should have done. We maintain this code so that we can
	 * do read-only imports of pools on this version.
	 */
	if (version == 0) {
		BP_SET_DEDUP(bp, 0);
		BP_SET_CHECKSUM(bp, 0);
		BP_SET_PSIZE(bp, SPA_MINBLOCKSIZE);
		return;
	}

	ASSERT3U(version, ==, ZIO_CRYPT_KEY_CURRENT_VERSION);

	/*
	 * The hole_birth feature might set these fields even if this bp
	 * is a hole. We zero them out here to guarantee that raw sends
	 * will function with or without the feature.
	 */
	if (BP_IS_HOLE(bp)) {
		bp->blk_prop = 0ULL;
		return;
	}

	/*
	 * At L0 we want to verify these fields to ensure that data blocks
	 * cannot be reinterpreted. For instance, we do not want an attacker
	 * to trick us into returning raw lz4 compressed data to the user
	 * by modifying the compression bits. At higher levels, we cannot
	 * enforce this policy since raw sends do not convey any information
	 * about indirect blocks, so these values might be different on the
	 * receive side. Fortunately, this does not open any new attack
	 * vectors, since any alterations that can be made to a higher level
	 * bp must still verify the correct order of the layer below it.
	 */
	if (BP_GET_LEVEL(bp) != 0) {
		BP_SET_BYTEORDER(bp, 0);
		BP_SET_COMPRESS(bp, 0);

		/*
		 * psize cannot be set to zero or it will trigger
		 * asserts, but the value doesn't really matter as
		 * long as it is constant.
		 */
		BP_SET_PSIZE(bp, SPA_MINBLOCKSIZE);
	}

	BP_SET_DEDUP(bp, 0);
	BP_SET_CHECKSUM(bp, 0);
}

static void
zio_crypt_bp_auth_init(uint64_t version, boolean_t should_bswap, blkptr_t *bp,
    blkptr_auth_buf_t *bab, uint_t *bab_len)
{
	blkptr_t tmpbp = *bp;

	if (should_bswap)
		byteswap_uint64_array(&tmpbp, sizeof (blkptr_t));

	ASSERT(BP_USES_CRYPT(&tmpbp) || BP_IS_HOLE(&tmpbp));
	ASSERT0(BP_IS_EMBEDDED(&tmpbp));

	zio_crypt_decode_mac_bp(&tmpbp, bab->bab_mac);

	/*
	 * We always MAC blk_prop in LE to ensure portability. This
	 * must be done after decoding the mac, since the endianness
	 * will get zero'd out here.
	 */
	zio_crypt_bp_zero_nonportable_blkprop(&tmpbp, version);
	bab->bab_prop = LE_64(tmpbp.blk_prop);
	bab->bab_pad = 0ULL;

	/* version 0 did not include the padding */
	*bab_len = sizeof (blkptr_auth_buf_t);
	if (version == 0)
		*bab_len -= sizeof (uint64_t);
}
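
/*
 * Illustrative sketch, not part of this file: the version check at the end of
 * zio_crypt_bp_auth_init() shortens the authenticated length by one 64-bit
 * word for version 0. The code below shows that arithmetic under the
 * assumption that the auth buffer holds a 64-bit prop word, a 16-byte MAC,
 * and a 64-bit pad, in that order. The real blkptr_auth_buf_t is defined in
 * a header that is not shown here, so the sketch_* layout is hypothetical.
 */
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	SKETCH_MAC_LEN	16	/* assumed value of ZIO_DATA_MAC_LEN */

/* Assumed layout; the real blkptr_auth_buf_t is defined elsewhere. */
typedef struct sketch_auth_buf {
	uint64_t sab_prop;
	uint8_t sab_mac[SKETCH_MAC_LEN];
	uint64_t sab_pad;
} sketch_auth_buf_t;

/* Number of leading bytes of the buffer that get authenticated. */
static size_t
sketch_auth_len(uint64_t version)
{
	size_t len = sizeof (sketch_auth_buf_t);

	/* version 0 did not include the trailing padding word */
	if (version == 0)
		len -= sizeof (uint64_t);

	assert(len >= offsetof(sketch_auth_buf_t, sab_pad));
	return (len);
}
```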

static int
zio_crypt_bp_do_hmac_updates(crypto_context_t ctx, uint64_t version,
    boolean_t should_bswap, blkptr_t *bp)
{
	int ret;
	uint_t bab_len;
	blkptr_auth_buf_t bab;
	crypto_data_t cd;

	zio_crypt_bp_auth_init(version, should_bswap, bp, &bab, &bab_len);
	cd.cd_format = CRYPTO_DATA_RAW;
	cd.cd_offset = 0;
	cd.cd_length = bab_len;
	cd.cd_raw.iov_base = (char *)&bab;
	cd.cd_raw.iov_len = cd.cd_length;

	ret = crypto_mac_update(ctx, &cd);
	if (ret != CRYPTO_SUCCESS) {
		ret = SET_ERROR(EIO);
		goto error;
	}

	return (0);

error:
	return (ret);
}

static void
zio_crypt_bp_do_indrect_checksum_updates(SHA2_CTX *ctx, uint64_t version,
    boolean_t should_bswap, blkptr_t *bp)
{
	uint_t bab_len;
	blkptr_auth_buf_t bab;

	zio_crypt_bp_auth_init(version, should_bswap, bp, &bab, &bab_len);
	SHA2Update(ctx, &bab, bab_len);
}

static void
zio_crypt_bp_do_aad_updates(uint8_t **aadp, uint_t *aad_len, uint64_t version,
    boolean_t should_bswap, blkptr_t *bp)
{
	uint_t bab_len;
	blkptr_auth_buf_t bab;

	zio_crypt_bp_auth_init(version, should_bswap, bp, &bab, &bab_len);
	memcpy(*aadp, &bab, bab_len);
	*aadp += bab_len;
	*aad_len += bab_len;
}

static int
zio_crypt_do_dnode_hmac_updates(crypto_context_t ctx, uint64_t version,
    boolean_t should_bswap, dnode_phys_t *dnp)
{
	int ret, i;
	dnode_phys_t *adnp, tmp_dncore;
	size_t dn_core_size = offsetof(dnode_phys_t, dn_blkptr);
	boolean_t le_bswap = (should_bswap == ZFS_HOST_BYTEORDER);
	crypto_data_t cd;

	cd.cd_format = CRYPTO_DATA_RAW;
	cd.cd_offset = 0;

	/*
	 * Authenticate the core dnode (masking out non-portable bits).
	 * We only copy the first 64 bytes we operate on to avoid the overhead
	 * of copying 512-64 unneeded bytes. The compiler seems to be fine
	 * with that.
	 */
	memcpy(&tmp_dncore, dnp, dn_core_size);
	adnp = &tmp_dncore;

	if (le_bswap) {
		adnp->dn_datablkszsec = BSWAP_16(adnp->dn_datablkszsec);
		adnp->dn_bonuslen = BSWAP_16(adnp->dn_bonuslen);
		adnp->dn_maxblkid = BSWAP_64(adnp->dn_maxblkid);
		adnp->dn_used = BSWAP_64(adnp->dn_used);
	}
	adnp->dn_flags &= DNODE_CRYPT_PORTABLE_FLAGS_MASK;
	adnp->dn_used = 0;

	cd.cd_length = dn_core_size;
	cd.cd_raw.iov_base = (char *)adnp;
	cd.cd_raw.iov_len = cd.cd_length;

	ret = crypto_mac_update(ctx, &cd);
	if (ret != CRYPTO_SUCCESS) {
		ret = SET_ERROR(EIO);
		goto error;
	}

	for (i = 0; i < dnp->dn_nblkptr; i++) {
		ret = zio_crypt_bp_do_hmac_updates(ctx, version,
		    should_bswap, &dnp->dn_blkptr[i]);
		if (ret != 0)
			goto error;
	}

	if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) {
		ret = zio_crypt_bp_do_hmac_updates(ctx, version,
		    should_bswap, DN_SPILL_BLKPTR(dnp));
		if (ret != 0)
			goto error;
	}

	return (0);

error:
	return (ret);
}

/*
 * objset_phys_t blocks introduce a number of exceptions to the normal
 * authentication process. An objset_phys_t contains two separate HMACs for
 * protecting the integrity of its data. The portable_mac protects the
 * metadnode. This MAC can be sent with a raw send and protects against
 * reordering of data within the metadnode. The local_mac protects the user
 * accounting objects which are not sent from one system to another.
 *
 * In addition, objset blocks are the only blocks that can be modified and
 * written to disk without the key loaded under certain circumstances. During
 * zil_claim() we need to be able to update the zil_header_t to complete
 * claiming log blocks and during raw receives we need to write out the
 * portable_mac from the send file. Both of these actions are possible
 * because these fields are not protected by either MAC so neither one will
 * need to modify the MACs without the key. However, when the modified blocks
 * are written out they will be byteswapped into the host machine's native
 * endianness which will modify fields protected by the MAC. As a result, MAC
 * calculation for objset blocks works slightly differently from other block
 * types. Where other block types MAC the data in whatever endianness is
 * written to disk, objset blocks always MAC the little-endian version of
 * their values. In the code, should_bswap is the value from
 * BP_SHOULD_BYTESWAP() and le_bswap indicates whether a byteswap is needed
 * to get this block into little-endian format.
 */
int
zio_crypt_do_objset_hmacs(zio_crypt_key_t *key, void *data, uint_t datalen,
    boolean_t should_bswap, uint8_t *portable_mac, uint8_t *local_mac)
{
	int ret;
	crypto_mechanism_t mech;
	crypto_context_t ctx;
	crypto_data_t cd;
	objset_phys_t *osp = data;
	uint64_t intval;
	boolean_t le_bswap = (should_bswap == ZFS_HOST_BYTEORDER);
	uint8_t raw_portable_mac[SHA512_DIGEST_LENGTH];
	uint8_t raw_local_mac[SHA512_DIGEST_LENGTH];

	/* initialize HMAC mechanism */
	mech.cm_type = crypto_mech2id(SUN_CKM_SHA512_HMAC);
	mech.cm_param = NULL;
	mech.cm_param_len = 0;

	cd.cd_format = CRYPTO_DATA_RAW;
	cd.cd_offset = 0;

	/* calculate the portable MAC from the portable fields and metadnode */
	ret = crypto_mac_init(&mech, &key->zk_hmac_key, NULL, &ctx);
	if (ret != CRYPTO_SUCCESS) {
		ret = SET_ERROR(EIO);
		goto error;
	}

	/* add in the os_type */
	intval = (le_bswap) ? osp->os_type : BSWAP_64(osp->os_type);
	cd.cd_length = sizeof (uint64_t);
	cd.cd_raw.iov_base = (char *)&intval;
	cd.cd_raw.iov_len = cd.cd_length;

	ret = crypto_mac_update(ctx, &cd);
	if (ret != CRYPTO_SUCCESS) {
		ret = SET_ERROR(EIO);
		goto error;
	}

	/* add in the portable os_flags */
	intval = osp->os_flags;
	if (should_bswap)
		intval = BSWAP_64(intval);
	intval &= OBJSET_CRYPT_PORTABLE_FLAGS_MASK;
	if (!ZFS_HOST_BYTEORDER)
		intval = BSWAP_64(intval);

	cd.cd_length = sizeof (uint64_t);
	cd.cd_raw.iov_base = (char *)&intval;
	cd.cd_raw.iov_len = cd.cd_length;

	ret = crypto_mac_update(ctx, &cd);
	if (ret != CRYPTO_SUCCESS) {
		ret = SET_ERROR(EIO);
		goto error;
	}

	/* add in fields from the metadnode */
	ret = zio_crypt_do_dnode_hmac_updates(ctx, key->zk_version,
	    should_bswap, &osp->os_meta_dnode);
	if (ret)
		goto error;

	/* store the final digest in a temporary buffer and copy what we need */
	cd.cd_length = SHA512_DIGEST_LENGTH;
	cd.cd_raw.iov_base = (char *)raw_portable_mac;
	cd.cd_raw.iov_len = cd.cd_length;

	ret = crypto_mac_final(ctx, &cd);
	if (ret != CRYPTO_SUCCESS) {
		ret = SET_ERROR(EIO);
		goto error;
	}

	memcpy(portable_mac, raw_portable_mac, ZIO_OBJSET_MAC_LEN);

	/*
	 * This is necessary here as we check next whether
	 * OBJSET_FLAG_USERACCOUNTING_COMPLETE is set in order to
	 * decide if the local_mac should be zeroed out. That flag will always
	 * be set by dmu_objset_id_quota_upgrade_cb() and
	 * dmu_objset_userspace_upgrade_cb() if useraccounting has been
	 * completed.
	 */
	intval = osp->os_flags;
	if (should_bswap)
		intval = BSWAP_64(intval);
	boolean_t uacct_incomplete =
	    !(intval & OBJSET_FLAG_USERACCOUNTING_COMPLETE);

	/*
	 * The local MAC protects the user, group and project accounting.
	 * If these objects are not present, the local MAC is zeroed out.
	 */
	if (uacct_incomplete ||
	    (datalen >= OBJSET_PHYS_SIZE_V3 &&
	    osp->os_userused_dnode.dn_type == DMU_OT_NONE &&
	    osp->os_groupused_dnode.dn_type == DMU_OT_NONE &&
	    osp->os_projectused_dnode.dn_type == DMU_OT_NONE) ||
	    (datalen >= OBJSET_PHYS_SIZE_V2 &&
	    osp->os_userused_dnode.dn_type == DMU_OT_NONE &&
	    osp->os_groupused_dnode.dn_type == DMU_OT_NONE) ||
	    (datalen <= OBJSET_PHYS_SIZE_V1)) {
		memset(local_mac, 0, ZIO_OBJSET_MAC_LEN);
		return (0);
	}

	/*
	 * calculate the local MAC from the userused, groupused, and
	 * projectused dnodes
	 */
	ret = crypto_mac_init(&mech, &key->zk_hmac_key, NULL, &ctx);
	if (ret != CRYPTO_SUCCESS) {
		ret = SET_ERROR(EIO);
		goto error;
	}

	/* add in the non-portable os_flags */
	intval = osp->os_flags;
	if (should_bswap)
		intval = BSWAP_64(intval);
	intval &= ~OBJSET_CRYPT_PORTABLE_FLAGS_MASK;
	if (!ZFS_HOST_BYTEORDER)
		intval = BSWAP_64(intval);

	cd.cd_length = sizeof (uint64_t);
	cd.cd_raw.iov_base = (char *)&intval;
	cd.cd_raw.iov_len = cd.cd_length;

	ret = crypto_mac_update(ctx, &cd);
	if (ret != CRYPTO_SUCCESS) {
		ret = SET_ERROR(EIO);
		goto error;
	}

	/* add in fields from the user accounting dnodes */
	if (osp->os_userused_dnode.dn_type != DMU_OT_NONE) {
		ret = zio_crypt_do_dnode_hmac_updates(ctx, key->zk_version,
		    should_bswap, &osp->os_userused_dnode);
		if (ret)
			goto error;
	}

	if (osp->os_groupused_dnode.dn_type != DMU_OT_NONE) {
		ret = zio_crypt_do_dnode_hmac_updates(ctx, key->zk_version,
		    should_bswap, &osp->os_groupused_dnode);
		if (ret)
			goto error;
	}

	if (osp->os_projectused_dnode.dn_type != DMU_OT_NONE &&
	    datalen >= OBJSET_PHYS_SIZE_V3) {
		ret = zio_crypt_do_dnode_hmac_updates(ctx, key->zk_version,
		    should_bswap, &osp->os_projectused_dnode);
		if (ret)
			goto error;
	}

	/* store the final digest in a temporary buffer and copy what we need */
	cd.cd_length = SHA512_DIGEST_LENGTH;
	cd.cd_raw.iov_base = (char *)raw_local_mac;
	cd.cd_raw.iov_len = cd.cd_length;

	ret = crypto_mac_final(ctx, &cd);
	if (ret != CRYPTO_SUCCESS) {
		ret = SET_ERROR(EIO);
		goto error;
	}

	memcpy(local_mac, raw_local_mac, ZIO_OBJSET_MAC_LEN);

	return (0);

error:
	memset(portable_mac, 0, ZIO_OBJSET_MAC_LEN);
	memset(local_mac, 0, ZIO_OBJSET_MAC_LEN);
	return (ret);
}

static void
zio_crypt_destroy_uio(zfs_uio_t *uio)
{
	if (uio->uio_iov)
		kmem_free(uio->uio_iov, uio->uio_iovcnt * sizeof (iovec_t));
}

/*
 * This function parses an uncompressed indirect block and returns a checksum
 * of all the portable fields from all of the contained bps. The portable
 * fields are the MAC and all of the fields from blk_prop except for the dedup,
 * checksum, and psize bits. For an explanation of the purpose of this, see
 * the comment block on object set authentication.
 */
static int
zio_crypt_do_indirect_mac_checksum_impl(boolean_t generate, void *buf,
    uint_t datalen, uint64_t version, boolean_t byteswap, uint8_t *cksum)
{
	blkptr_t *bp;
	int i, epb = datalen >> SPA_BLKPTRSHIFT;
	SHA2_CTX ctx;
	uint8_t digestbuf[SHA512_DIGEST_LENGTH];

	/* checksum all of the MACs from the layer below */
	SHA2Init(SHA512, &ctx);
	for (i = 0, bp = buf; i < epb; i++, bp++) {
		zio_crypt_bp_do_indrect_checksum_updates(&ctx, version,
		    byteswap, bp);
	}
	SHA2Final(digestbuf, &ctx);

	if (generate) {
		memcpy(cksum, digestbuf, ZIO_DATA_MAC_LEN);
		return (0);
	}

	if (memcmp(digestbuf, cksum, ZIO_DATA_MAC_LEN) != 0)
		return (SET_ERROR(ECKSUM));

	return (0);
}

int
zio_crypt_do_indirect_mac_checksum(boolean_t generate, void *buf,
    uint_t datalen, boolean_t byteswap, uint8_t *cksum)
{
	int ret;

	/*
	 * Unfortunately, callers of this function will not always have
	 * easy access to the on-disk format version. This info is
	 * normally found in the DSL Crypto Key, but the checksum-of-MACs
	 * is expected to be verifiable even when the key isn't loaded.
	 * Here, instead of doing a ZAP lookup for the version for each
	 * zio, we simply try both existing formats.
	 */
	ret = zio_crypt_do_indirect_mac_checksum_impl(generate, buf,
	    datalen, ZIO_CRYPT_KEY_CURRENT_VERSION, byteswap, cksum);
	if (ret == ECKSUM) {
		ASSERT(!generate);
		ret = zio_crypt_do_indirect_mac_checksum_impl(generate,
		    buf, datalen, 0, byteswap, cksum);
	}

	return (ret);
}
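
/*
 * Illustrative sketch, not part of this file: the function above verifies
 * against the current format first and falls back to version 0 only on a
 * checksum mismatch. The standalone code below shows the same control flow
 * with a toy checksum. The sketch_* names, the SKETCH_ECKSUM value, and the
 * toy hash are all hypothetical stand-ins, not the SHA-512 based logic used
 * here.
 */
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	SKETCH_CURRENT_VERSION	1
#define	SKETCH_ECKSUM		52	/* hypothetical error code */

/* Toy checksum whose result depends on the format version. */
static uint32_t
sketch_cksum(const uint8_t *buf, size_t len, unsigned version)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum = sum * 31 + buf[i];

	/* stand-in for a real format difference between versions */
	return (version == 0 ? sum : sum ^ 0xa5a5a5a5u);
}

/* Generate or verify a checksum for exactly one format version. */
static int
sketch_check_impl(int generate, const uint8_t *buf, size_t len,
    unsigned version, uint32_t *cksum)
{
	uint32_t actual = sketch_cksum(buf, len, version);

	if (generate) {
		*cksum = actual;
		return (0);
	}
	return (actual == *cksum ? 0 : SKETCH_ECKSUM);
}

/* Try the current version first, then fall back to version 0. */
static int
sketch_check(int generate, const uint8_t *buf, size_t len, uint32_t *cksum)
{
	int ret = sketch_check_impl(generate, buf, len,
	    SKETCH_CURRENT_VERSION, cksum);

	if (ret == SKETCH_ECKSUM) {
		/* generation never fails, so only verification gets here */
		assert(!generate);
		ret = sketch_check_impl(generate, buf, len, 0, cksum);
	}
	return (ret);
}
```
/*
 * The fallback costs one extra hash only on the mismatch path, which is the
 * trade-off the comment above describes in place of a per-zio version lookup.
 */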

int
zio_crypt_do_indirect_mac_checksum_abd(boolean_t generate, abd_t *abd,
    uint_t datalen, boolean_t byteswap, uint8_t *cksum)
{
	int ret;
	void *buf;

	buf = abd_borrow_buf_copy(abd, datalen);
	ret = zio_crypt_do_indirect_mac_checksum(generate, buf, datalen,
	    byteswap, cksum);
	abd_return_buf(abd, buf, datalen);

	return (ret);
}

/*
 * Special case handling routine for encrypting / decrypting ZIL blocks.
 * We do not check for the older ZIL chain because the encryption feature
 * was not available before the newer ZIL chain was introduced. The goal
 * here is to encrypt everything except the blkptr_t of a lr_write_t and
 * the zil_chain_t header. Everything that is not encrypted is authenticated.
 */
static int
zio_crypt_init_uios_zil(boolean_t encrypt, uint8_t *plainbuf,
    uint8_t *cipherbuf, uint_t datalen, boolean_t byteswap, zfs_uio_t *puio,
    zfs_uio_t *cuio, uint_t *enc_len, uint8_t **authbuf, uint_t *auth_len,
    boolean_t *no_crypt)
{
	int ret;
	uint64_t txtype, lr_len, nused;
	uint_t nr_src, nr_dst, crypt_len;
	uint_t aad_len = 0, nr_iovecs = 0, total_len = 0;
	iovec_t *src_iovecs = NULL, *dst_iovecs = NULL;
	uint8_t *src, *dst, *slrp, *dlrp, *blkend, *aadp;
	zil_chain_t *zilc;
	lr_t *lr;
	uint8_t *aadbuf = zio_buf_alloc(datalen);

	/* cipherbuf always needs an extra iovec for the MAC */
	if (encrypt) {
		src = plainbuf;
		dst = cipherbuf;
		nr_src = 0;
		nr_dst = 1;
	} else {
		src = cipherbuf;
		dst = plainbuf;
		nr_src = 1;
		nr_dst = 0;
	}
	memset(dst, 0, datalen);

	/* find the start and end record of the log block */
	zilc = (zil_chain_t *)src;
	slrp = src + sizeof (zil_chain_t);
	aadp = aadbuf;
	nused = ((byteswap) ? BSWAP_64(zilc->zc_nused) : zilc->zc_nused);
	ASSERT3U(nused, >=, sizeof (zil_chain_t));
	ASSERT3U(nused, <=, datalen);
	blkend = src + nused;

	/* calculate the number of encrypted iovecs we will need */
	for (; slrp < blkend; slrp += lr_len) {
		lr = (lr_t *)slrp;

		if (!byteswap) {
			txtype = lr->lrc_txtype;
			lr_len = lr->lrc_reclen;
		} else {
			txtype = BSWAP_64(lr->lrc_txtype);
			lr_len = BSWAP_64(lr->lrc_reclen);
		}
		ASSERT3U(lr_len, >=, sizeof (lr_t));
		ASSERT3U(lr_len, <=, blkend - slrp);

		nr_iovecs++;
		if (txtype == TX_WRITE && lr_len != sizeof (lr_write_t))
			nr_iovecs++;
	}

	nr_src += nr_iovecs;
	nr_dst += nr_iovecs;

	/* allocate the iovec arrays */
	if (nr_src != 0) {
		src_iovecs = kmem_alloc(nr_src * sizeof (iovec_t), KM_SLEEP);
		if (src_iovecs == NULL) {
			ret = SET_ERROR(ENOMEM);
			goto error;
		}
	}

	if (nr_dst != 0) {
		dst_iovecs = kmem_alloc(nr_dst * sizeof (iovec_t), KM_SLEEP);
		if (dst_iovecs == NULL) {
			ret = SET_ERROR(ENOMEM);
			goto error;
		}
	}

	/*
	 * Copy the plain zil header over and authenticate everything except
	 * the checksum that will store our MAC. If we are writing the data
	 * the embedded checksum will not have been calculated yet, so we don't
	 * authenticate that.
	 */
	memcpy(dst, src, sizeof (zil_chain_t));
	memcpy(aadp, src, sizeof (zil_chain_t) - sizeof (zio_eck_t));
	aadp += sizeof (zil_chain_t) - sizeof (zio_eck_t);
	aad_len += sizeof (zil_chain_t) - sizeof (zio_eck_t);

	/* loop over records again, filling in iovecs */
	nr_iovecs = 0;
	slrp = src + sizeof (zil_chain_t);
	dlrp = dst + sizeof (zil_chain_t);

	for (; slrp < blkend; slrp += lr_len, dlrp += lr_len) {
		lr = (lr_t *)slrp;

		if (!byteswap) {
			txtype = lr->lrc_txtype;
			lr_len = lr->lrc_reclen;
		} else {
			txtype = BSWAP_64(lr->lrc_txtype);
			lr_len = BSWAP_64(lr->lrc_reclen);
		}

		/* copy the common lr_t */
		memcpy(dlrp, slrp, sizeof (lr_t));
		memcpy(aadp, slrp, sizeof (lr_t));
		aadp += sizeof (lr_t);
		aad_len += sizeof (lr_t);

		ASSERT3P(src_iovecs, !=, NULL);
		ASSERT3P(dst_iovecs, !=, NULL);

		/*
		 * If this is a TX_WRITE record we want to encrypt everything
		 * except the bp, if it exists. If the bp does exist we want
		 * to authenticate it.
		 */
		if (txtype == TX_WRITE) {
			const size_t o = offsetof(lr_write_t, lr_blkptr);
			crypt_len = o - sizeof (lr_t);
			src_iovecs[nr_iovecs].iov_base = slrp + sizeof (lr_t);
			src_iovecs[nr_iovecs].iov_len = crypt_len;
			dst_iovecs[nr_iovecs].iov_base = dlrp + sizeof (lr_t);
			dst_iovecs[nr_iovecs].iov_len = crypt_len;

			/* copy the bp now since it will not be encrypted */
			memcpy(dlrp + o, slrp + o, sizeof (blkptr_t));
			memcpy(aadp, slrp + o, sizeof (blkptr_t));
			aadp += sizeof (blkptr_t);
			aad_len += sizeof (blkptr_t);
			nr_iovecs++;
			total_len += crypt_len;

			if (lr_len != sizeof (lr_write_t)) {
				crypt_len = lr_len - sizeof (lr_write_t);
				src_iovecs[nr_iovecs].iov_base =
				    slrp + sizeof (lr_write_t);
				src_iovecs[nr_iovecs].iov_len = crypt_len;
				dst_iovecs[nr_iovecs].iov_base =
				    dlrp + sizeof (lr_write_t);
				dst_iovecs[nr_iovecs].iov_len = crypt_len;
				nr_iovecs++;
				total_len += crypt_len;
			}
		} else if (txtype == TX_CLONE_RANGE) {
			const size_t o = offsetof(lr_clone_range_t, lr_nbps);
			crypt_len = o - sizeof (lr_t);
			src_iovecs[nr_iovecs].iov_base = slrp + sizeof (lr_t);
			src_iovecs[nr_iovecs].iov_len = crypt_len;
			dst_iovecs[nr_iovecs].iov_base = dlrp + sizeof (lr_t);
			dst_iovecs[nr_iovecs].iov_len = crypt_len;

			/* copy the bps now since they will not be encrypted */
			memcpy(dlrp + o, slrp + o, lr_len - o);
			memcpy(aadp, slrp + o, lr_len - o);
			aadp += lr_len - o;
			aad_len += lr_len - o;
			nr_iovecs++;
			total_len += crypt_len;
		} else {
			crypt_len = lr_len - sizeof (lr_t);
			src_iovecs[nr_iovecs].iov_base = slrp + sizeof (lr_t);
			src_iovecs[nr_iovecs].iov_len = crypt_len;
			dst_iovecs[nr_iovecs].iov_base = dlrp + sizeof (lr_t);
			dst_iovecs[nr_iovecs].iov_len = crypt_len;
			nr_iovecs++;
			total_len += crypt_len;
		}
	}

	*no_crypt = (nr_iovecs == 0);
	*enc_len = total_len;
	*authbuf = aadbuf;
	*auth_len = aad_len;

	if (encrypt) {
		puio->uio_iov = src_iovecs;
		puio->uio_iovcnt = nr_src;
		cuio->uio_iov = dst_iovecs;
		cuio->uio_iovcnt = nr_dst;
	} else {
		puio->uio_iov = dst_iovecs;
		puio->uio_iovcnt = nr_dst;
		cuio->uio_iov = src_iovecs;
		cuio->uio_iovcnt = nr_src;
	}

	return (0);

error:
	zio_buf_free(aadbuf, datalen);
	if (src_iovecs != NULL)
		kmem_free(src_iovecs, nr_src * sizeof (iovec_t));
	if (dst_iovecs != NULL)
		kmem_free(dst_iovecs, nr_dst * sizeof (iovec_t));

	*enc_len = 0;
	*authbuf = NULL;
	*auth_len = 0;
	*no_crypt = B_FALSE;
	puio->uio_iov = NULL;
	puio->uio_iovcnt = 0;
	cuio->uio_iov = NULL;
	cuio->uio_iovcnt = 0;
	return (ret);
}
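
/*
 * Illustrative sketch, not part of this file: the routine above walks the
 * log records twice, first to count the iovecs it needs and then to fill
 * them in. The standalone code below shows that two-pass pattern on a
 * simplified record format with a fixed header followed by a payload. The
 * sketch_* types and the header layout are hypothetical and far simpler
 * than lr_t, and calloc() stands in for kmem_alloc().
 */
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct sketch_hdr {
	uint64_t sh_type;
	uint64_t sh_reclen;	/* header plus payload, in bytes */
} sketch_hdr_t;

typedef struct sketch_iov {
	uint8_t *si_base;
	size_t si_len;
} sketch_iov_t;

/*
 * Build one iovec per record payload. Returns the number of iovecs, or -1
 * if a record is malformed or the allocation fails.
 */
static int
sketch_build_iovecs(uint8_t *buf, size_t used, sketch_iov_t **iovp)
{
	uint8_t *p, *end = buf + used;
	sketch_hdr_t h;
	sketch_iov_t *iov;
	int n = 0, i = 0;

	assert(buf != NULL && iovp != NULL);

	/* pass 1: validate every record and count the iovecs */
	for (p = buf; p < end; p += h.sh_reclen) {
		if ((size_t)(end - p) < sizeof (h))
			return (-1);
		memcpy(&h, p, sizeof (h));
		if (h.sh_reclen < sizeof (h) ||
		    h.sh_reclen > (size_t)(end - p))
			return (-1);
		n++;
	}

	if (n == 0) {
		*iovp = NULL;
		return (0);
	}

	iov = calloc((size_t)n, sizeof (*iov));
	if (iov == NULL)
		return (-1);

	/* pass 2: point each iovec at the payload after its header */
	for (p = buf; p < end; p += h.sh_reclen) {
		memcpy(&h, p, sizeof (h));
		iov[i].si_base = p + sizeof (h);
		iov[i].si_len = h.sh_reclen - sizeof (h);
		i++;
	}

	*iovp = iov;
	return (n);
}
```
/*
 * Counting first lets the array be allocated once at its final size, at the
 * cost of parsing each record header twice.
 */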

/*
 * Special case handling routine for encrypting / decrypting dnode blocks.
 */
static int
zio_crypt_init_uios_dnode(boolean_t encrypt, uint64_t version,
    uint8_t *plainbuf, uint8_t *cipherbuf, uint_t datalen, boolean_t byteswap,
    zfs_uio_t *puio, zfs_uio_t *cuio, uint_t *enc_len, uint8_t **authbuf,
    uint_t *auth_len, boolean_t *no_crypt)
{
	int ret;
	uint_t nr_src, nr_dst, crypt_len;
	uint_t aad_len = 0, nr_iovecs = 0, total_len = 0;
	uint_t i, j, max_dnp = datalen >> DNODE_SHIFT;
	iovec_t *src_iovecs = NULL, *dst_iovecs = NULL;
	uint8_t *src, *dst, *aadp;
	dnode_phys_t *dnp, *adnp, *sdnp, *ddnp;
	uint8_t *aadbuf = zio_buf_alloc(datalen);

	if (encrypt) {
		src = plainbuf;
		dst = cipherbuf;
		nr_src = 0;
		nr_dst = 1;
	} else {
		src = cipherbuf;
		dst = plainbuf;
		nr_src = 1;
		nr_dst = 0;
	}

	sdnp = (dnode_phys_t *)src;
	ddnp = (dnode_phys_t *)dst;
	aadp = aadbuf;

	/*
	 * Count the number of iovecs we will need to do the encryption by
	 * counting the number of bonus buffers that need to be encrypted.
	 */
	for (i = 0; i < max_dnp; i += sdnp[i].dn_extra_slots + 1) {
		/*
		 * This block may still be byteswapped. However, all of the
		 * values we use are either uint8_t's (for which byteswapping
		 * is a noop) or a * != 0 check, which will work regardless
		 * of whether or not we byteswap.
		 */
		if (sdnp[i].dn_type != DMU_OT_NONE &&
		    DMU_OT_IS_ENCRYPTED(sdnp[i].dn_bonustype) &&
		    sdnp[i].dn_bonuslen != 0) {
			nr_iovecs++;
		}
	}

	nr_src += nr_iovecs;
	nr_dst += nr_iovecs;

	if (nr_src != 0) {
		src_iovecs = kmem_alloc(nr_src * sizeof (iovec_t), KM_SLEEP);
		if (src_iovecs == NULL) {
			ret = SET_ERROR(ENOMEM);
			goto error;
		}
	}

	if (nr_dst != 0) {
		dst_iovecs = kmem_alloc(nr_dst * sizeof (iovec_t), KM_SLEEP);
		if (dst_iovecs == NULL) {
			ret = SET_ERROR(ENOMEM);
			goto error;
		}
	}

	nr_iovecs = 0;

	/*
	 * Iterate through the dnodes again, this time filling in the uios
	 * we allocated earlier. We also concatenate any data we want to
	 * authenticate onto aadbuf.
	 */
	for (i = 0; i < max_dnp; i += sdnp[i].dn_extra_slots + 1) {
		dnp = &sdnp[i];

		/* copy over the core fields and blkptrs (kept as plaintext) */
		memcpy(&ddnp[i], dnp,
		    (uint8_t *)DN_BONUS(dnp) - (uint8_t *)dnp);

		if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) {
			memcpy(DN_SPILL_BLKPTR(&ddnp[i]), DN_SPILL_BLKPTR(dnp),
			    sizeof (blkptr_t));
		}

		/*
		 * Handle authenticated data. We authenticate everything in
		 * the dnode that can be brought over when we do a raw send.
		 * This includes all of the core fields as well as the MACs
		 * stored in the bp checksums and all of the portable bits
		 * from blk_prop. We include the dnode padding here in case it
		 * ever gets used in the future. Some dn_flags and dn_used are
		 * not portable so we mask those values out of the
		 * authenticated data.
		 */
		crypt_len = offsetof(dnode_phys_t, dn_blkptr);
		memcpy(aadp, dnp, crypt_len);
		adnp = (dnode_phys_t *)aadp;
		adnp->dn_flags &= DNODE_CRYPT_PORTABLE_FLAGS_MASK;
		adnp->dn_used = 0;
		aadp += crypt_len;
		aad_len += crypt_len;

		for (j = 0; j < dnp->dn_nblkptr; j++) {
			zio_crypt_bp_do_aad_updates(&aadp, &aad_len,
			    version, byteswap, &dnp->dn_blkptr[j]);
		}

		if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) {
			zio_crypt_bp_do_aad_updates(&aadp, &aad_len,
			    version, byteswap, DN_SPILL_BLKPTR(dnp));
		}

		/*
		 * If this bonus buffer needs to be encrypted, we prepare an
		 * iovec_t. The encryption / decryption functions will fill
		 * this in for us with the encrypted or decrypted data.
		 * Otherwise we add the bonus buffer to the authenticated
		 * data buffer and copy it over to the destination. The
		 * encrypted iovec extends to DN_MAX_BONUS_LEN(dnp) so that
		 * we can guarantee alignment with the AES block size
		 * (128 bits).
		 */
		crypt_len = DN_MAX_BONUS_LEN(dnp);
		if (dnp->dn_type != DMU_OT_NONE &&
		    DMU_OT_IS_ENCRYPTED(dnp->dn_bonustype) &&
		    dnp->dn_bonuslen != 0) {
			ASSERT3U(nr_iovecs, <, nr_src);
			ASSERT3U(nr_iovecs, <, nr_dst);
			ASSERT3P(src_iovecs, !=, NULL);
			ASSERT3P(dst_iovecs, !=, NULL);
			src_iovecs[nr_iovecs].iov_base = DN_BONUS(dnp);
			src_iovecs[nr_iovecs].iov_len = crypt_len;
			dst_iovecs[nr_iovecs].iov_base = DN_BONUS(&ddnp[i]);
			dst_iovecs[nr_iovecs].iov_len = crypt_len;

			nr_iovecs++;
			total_len += crypt_len;
		} else {
			memcpy(DN_BONUS(&ddnp[i]), DN_BONUS(dnp), crypt_len);
			memcpy(aadp, DN_BONUS(dnp), crypt_len);
			aadp += crypt_len;
			aad_len += crypt_len;
		}
	}

	*no_crypt = (nr_iovecs == 0);
	*enc_len = total_len;
	*authbuf = aadbuf;
	*auth_len = aad_len;

	if (encrypt) {
		puio->uio_iov = src_iovecs;
		puio->uio_iovcnt = nr_src;
		cuio->uio_iov = dst_iovecs;
		cuio->uio_iovcnt = nr_dst;
	} else {
		puio->uio_iov = dst_iovecs;
		puio->uio_iovcnt = nr_dst;
		cuio->uio_iov = src_iovecs;
		cuio->uio_iovcnt = nr_src;
	}

	return (0);

error:
	zio_buf_free(aadbuf, datalen);
	if (src_iovecs != NULL)
		kmem_free(src_iovecs, nr_src * sizeof (iovec_t));
	if (dst_iovecs != NULL)
		kmem_free(dst_iovecs, nr_dst * sizeof (iovec_t));

	*enc_len = 0;
	*authbuf = NULL;
	*auth_len = 0;
	*no_crypt = B_FALSE;
	puio->uio_iov = NULL;
	puio->uio_iovcnt = 0;
	cuio->uio_iov = NULL;
	cuio->uio_iovcnt = 0;
	return (ret);
}
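
/*
 * Illustrative sketch, not part of this file: both loops in the routine
 * above advance by dn_extra_slots + 1, so the interior slots of a multi-slot
 * dnode are never interpreted as dnode headers. The standalone code below
 * shows that stride on a simplified slot type. The sketch_* names and the
 * 512-byte slot layout are hypothetical stand-ins for dnode_phys_t.
 */
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	SKETCH_SLOT_SIZE	512

typedef struct sketch_slot {
	uint8_t ss_type;		/* 0 means unallocated */
	uint8_t ss_extra_slots;		/* slots consumed beyond this one */
	uint8_t ss_pad[SKETCH_SLOT_SIZE - 2];
} sketch_slot_t;

/* Count allocated objects, skipping the interior of multi-slot objects. */
static unsigned
sketch_count_allocated(const sketch_slot_t *slots, unsigned nslots)
{
	unsigned i, n = 0;

	assert(slots != NULL);
	for (i = 0; i < nslots; i += slots[i].ss_extra_slots + 1u) {
		if (slots[i].ss_type != 0)
			n++;
	}
	return (n);
}
```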

static int
zio_crypt_init_uios_normal(boolean_t encrypt, uint8_t *plainbuf,
    uint8_t *cipherbuf, uint_t datalen, zfs_uio_t *puio, zfs_uio_t *cuio,
    uint_t *enc_len)
{
	(void) encrypt;
	int ret;
	uint_t nr_plain = 1, nr_cipher = 2;
	iovec_t *plain_iovecs = NULL, *cipher_iovecs = NULL;

	/* allocate the iovecs for the plain and cipher data */
	plain_iovecs = kmem_alloc(nr_plain * sizeof (iovec_t),
	    KM_SLEEP);
	if (!plain_iovecs) {
		ret = SET_ERROR(ENOMEM);
		goto error;
	}

	cipher_iovecs = kmem_alloc(nr_cipher * sizeof (iovec_t),
	    KM_SLEEP);
	if (!cipher_iovecs) {
		ret = SET_ERROR(ENOMEM);
		goto error;
	}

	plain_iovecs[0].iov_base = plainbuf;
	plain_iovecs[0].iov_len = datalen;
	cipher_iovecs[0].iov_base = cipherbuf;
	cipher_iovecs[0].iov_len = datalen;

	*enc_len = datalen;
	puio->uio_iov = plain_iovecs;
	puio->uio_iovcnt = nr_plain;
	cuio->uio_iov = cipher_iovecs;
	cuio->uio_iovcnt = nr_cipher;

	return (0);

error:
	if (plain_iovecs != NULL)
		kmem_free(plain_iovecs, nr_plain * sizeof (iovec_t));
	if (cipher_iovecs != NULL)
		kmem_free(cipher_iovecs, nr_cipher * sizeof (iovec_t));

	*enc_len = 0;
	puio->uio_iov = NULL;
	puio->uio_iovcnt = 0;
	cuio->uio_iov = NULL;
	cuio->uio_iovcnt = 0;
	return (ret);
}

/*
 * This function builds up the plaintext (puio) and ciphertext (cuio) uios so
 * that they can be used for encryption and decryption by zio_do_crypt_uio().
 * Most blocks will use zio_crypt_init_uios_normal(), with ZIL and dnode blocks
 * requiring special handling to parse out pieces that are to be encrypted. The
 * authbuf is used by these special cases to store additional authenticated
 * data (AAD) for the encryption modes.
 */
static int
zio_crypt_init_uios(boolean_t encrypt, uint64_t version, dmu_object_type_t ot,
    uint8_t *plainbuf, uint8_t *cipherbuf, uint_t datalen, boolean_t byteswap,
    uint8_t *mac, zfs_uio_t *puio, zfs_uio_t *cuio, uint_t *enc_len,
    uint8_t **authbuf, uint_t *auth_len, boolean_t *no_crypt)
{
	int ret;
	iovec_t *mac_iov;

	ASSERT(DMU_OT_IS_ENCRYPTED(ot) || ot == DMU_OT_NONE);

	/* route to handler */
	switch (ot) {
	case DMU_OT_INTENT_LOG:
		ret = zio_crypt_init_uios_zil(encrypt, plainbuf, cipherbuf,
		    datalen, byteswap, puio, cuio, enc_len, authbuf, auth_len,
		    no_crypt);
		break;
	case DMU_OT_DNODE:
		ret = zio_crypt_init_uios_dnode(encrypt, version, plainbuf,
		    cipherbuf, datalen, byteswap, puio, cuio, enc_len, authbuf,
		    auth_len, no_crypt);
		break;
	default:
		ret = zio_crypt_init_uios_normal(encrypt, plainbuf, cipherbuf,
		    datalen, puio, cuio, enc_len);
		*authbuf = NULL;
		*auth_len = 0;
		*no_crypt = B_FALSE;
		break;
	}

	if (ret != 0)
		goto error;

	/* populate the uios */
	puio->uio_segflg = UIO_SYSSPACE;
	cuio->uio_segflg = UIO_SYSSPACE;

	mac_iov = ((iovec_t *)&cuio->uio_iov[cuio->uio_iovcnt - 1]);
	mac_iov->iov_base = mac;
	mac_iov->iov_len = ZIO_DATA_MAC_LEN;

	return (0);

error:
	return (ret);
}
1907
/*
 * Primary encryption / decryption entrypoint for zio data.
 */
int
zio_do_crypt_data(boolean_t encrypt, zio_crypt_key_t *key,
    dmu_object_type_t ot, boolean_t byteswap, uint8_t *salt, uint8_t *iv,
    uint8_t *mac, uint_t datalen, uint8_t *plainbuf, uint8_t *cipherbuf,
    boolean_t *no_crypt)
{
	int ret;
	boolean_t locked = B_FALSE;
	uint64_t crypt = key->zk_crypt;
	uint_t keydata_len = zio_crypt_table[crypt].ci_keylen;
	uint_t enc_len, auth_len;
	zfs_uio_t puio, cuio;
	uint8_t enc_keydata[MASTER_KEY_MAX_LEN];
	crypto_key_t tmp_ckey, *ckey = NULL;
	crypto_ctx_template_t tmpl;
	uint8_t *authbuf = NULL;

	memset(&puio, 0, sizeof (puio));
	memset(&cuio, 0, sizeof (cuio));

	/*
	 * If the needed key is the current one, just use it. Otherwise we
	 * need to generate a temporary one from the given salt + master key.
	 * If we are encrypting, we must return a copy of the current salt
	 * so that it can be stored in the blkptr_t.
	 */
	rw_enter(&key->zk_salt_lock, RW_READER);
	locked = B_TRUE;

	if (memcmp(salt, key->zk_salt, ZIO_DATA_SALT_LEN) == 0) {
		ckey = &key->zk_current_key;
		tmpl = key->zk_current_tmpl;
	} else {
		rw_exit(&key->zk_salt_lock);
		locked = B_FALSE;

		ret = hkdf_sha512(key->zk_master_keydata, keydata_len, NULL, 0,
		    salt, ZIO_DATA_SALT_LEN, enc_keydata, keydata_len);
		if (ret != 0)
			goto error;

		tmp_ckey.ck_data = enc_keydata;
		tmp_ckey.ck_length = CRYPTO_BYTES2BITS(keydata_len);

		ckey = &tmp_ckey;
		tmpl = NULL;
	}

	/*
	 * Attempt to use QAT acceleration if we can. We currently don't
	 * do this for metadnode and ZIL blocks, since they have a much
	 * more involved buffer layout and the qat_crypt() function only
	 * works in-place.
	 */
	if (qat_crypt_use_accel(datalen) &&
	    ot != DMU_OT_INTENT_LOG && ot != DMU_OT_DNODE) {
		uint8_t *srcbuf, *dstbuf;

		if (encrypt) {
			srcbuf = plainbuf;
			dstbuf = cipherbuf;
		} else {
			srcbuf = cipherbuf;
			dstbuf = plainbuf;
		}

		ret = qat_crypt((encrypt) ? QAT_ENCRYPT : QAT_DECRYPT, srcbuf,
		    dstbuf, NULL, 0, iv, mac, ckey, key->zk_crypt, datalen);
		if (ret == CPA_STATUS_SUCCESS) {
			if (locked) {
				rw_exit(&key->zk_salt_lock);
				locked = B_FALSE;
			}

			return (0);
		}
		/* If the hardware implementation fails fall back to software */
	}

	/* create uios for encryption */
	ret = zio_crypt_init_uios(encrypt, key->zk_version, ot, plainbuf,
	    cipherbuf, datalen, byteswap, mac, &puio, &cuio, &enc_len,
	    &authbuf, &auth_len, no_crypt);
	if (ret != 0)
		goto error;

	/* perform the encryption / decryption in software */
	ret = zio_do_crypt_uio(encrypt, key->zk_crypt, ckey, tmpl, iv, enc_len,
	    &puio, &cuio, authbuf, auth_len);
	if (ret != 0)
		goto error;

	if (locked) {
		rw_exit(&key->zk_salt_lock);
	}

	if (authbuf != NULL)
		zio_buf_free(authbuf, datalen);
	if (ckey == &tmp_ckey)
		memset(enc_keydata, 0, keydata_len);
	zio_crypt_destroy_uio(&puio);
	zio_crypt_destroy_uio(&cuio);

	return (0);

error:
	if (locked)
		rw_exit(&key->zk_salt_lock);
	if (authbuf != NULL)
		zio_buf_free(authbuf, datalen);
	if (ckey == &tmp_ckey)
		memset(enc_keydata, 0, keydata_len);
	zio_crypt_destroy_uio(&puio);
	zio_crypt_destroy_uio(&cuio);

	return (ret);
}

/*
 * Simple wrapper around zio_do_crypt_data() to work with abd's instead of
 * linear buffers.
 */
int
zio_do_crypt_abd(boolean_t encrypt, zio_crypt_key_t *key, dmu_object_type_t ot,
    boolean_t byteswap, uint8_t *salt, uint8_t *iv, uint8_t *mac,
    uint_t datalen, abd_t *pabd, abd_t *cabd, boolean_t *no_crypt)
{
	int ret;
	void *ptmp, *ctmp;

	if (encrypt) {
		ptmp = abd_borrow_buf_copy(pabd, datalen);
		ctmp = abd_borrow_buf(cabd, datalen);
	} else {
		ptmp = abd_borrow_buf(pabd, datalen);
		ctmp = abd_borrow_buf_copy(cabd, datalen);
	}

	ret = zio_do_crypt_data(encrypt, key, ot, byteswap, salt, iv, mac,
	    datalen, ptmp, ctmp, no_crypt);
	if (ret != 0)
		goto error;

	if (encrypt) {
		abd_return_buf(pabd, ptmp, datalen);
		abd_return_buf_copy(cabd, ctmp, datalen);
	} else {
		abd_return_buf_copy(pabd, ptmp, datalen);
		abd_return_buf(cabd, ctmp, datalen);
	}

	return (0);

error:
	if (encrypt) {
		abd_return_buf(pabd, ptmp, datalen);
		abd_return_buf_copy(cabd, ctmp, datalen);
	} else {
		abd_return_buf_copy(pabd, ptmp, datalen);
		abd_return_buf(cabd, ctmp, datalen);
	}

	return (ret);
}

#if defined(_KERNEL)
module_param(zfs_key_max_salt_uses, ulong, 0644);
MODULE_PARM_DESC(zfs_key_max_salt_uses, "Max number of times a salt value "
	"can be used for generating encryption keys before it is rotated");
#endif