Vault Cryptography
File Header Encryption
Note
The following section only applies to vaults with the cipher combo SIV_GCM
in the decoded JWT payload. For vaults with SIV_CTRMAC
, have a look at our 1.6 documentation.
The file header stores certain metadata, which is needed for file content encryption. It consists of 68 bytes.
12 bytes nonce used during header payload encryption.
40 bytes AES-GCM encrypted payload consisting of:
8 bytes filled with 1 for future use (formerly used for file size) and
32 bytes file content key.
16 bytes tag of the encrypted payload.
headerNonce := createRandomBytes(12)
contentKey := createRandomBytes(32)
cleartextPayload := 0xFFFFFFFFFFFFFFFF . contentKey
ciphertextPayload, tag := aesGcm(cleartextPayload, encryptionMasterKey, headerNonce)
File Content Encryption
This is where your actual file contents get encrypted.
The cleartext is broken down into multiple chunks, each up to 32 KiB + 28 bytes consisting of:
12 bytes nonce,
up to 32 KiB encrypted payload using AES-GCM with the file content key, and
16 bytes tag computed by GCM with the following AAD:
chunk number as 64 bit big endian integer (to prevent undetected reordering),
file header nonce (to bind this chunk to the file header),
Afterwards, the encrypted chunks are joined preserving the order of the cleartext chunks. The payload of the last chunk may be smaller than 32 KiB.
cleartextChunks[] := split(cleartext, 32KiB)
for (int i = 0; i < length(cleartextChunks); i++) {
chunkNonce := createRandomBytes(12)
aad := [bigEndian(i), headerNonce]
ciphertextPayload, tag := aesGcm(cleartextChunks[i], contentKey, chunkNonce, aad)
ciphertextChunks[i] := chunkNonce . ciphertextPayload . tag
}
ciphertextFileContent := join(ciphertextChunks[])
Directory IDs
Each directory has a unique ID that is required during filename encryption. For historical reasons, the directory ID is a string, even though any byte sequence would do the job.
The directory ID for the root directory is the empty string. For all other directories, it is a random sequence of at most 36 ASCII chars. We recommend using random UUID.
dirId := createUuid()
When traversing directories, the directory ID of a given subdirectory is processed in four steps to determine the storage path inside the vault:
Encrypting the directory ID using AES-SIV in order to encrypt directory hierarchies.
Creating a SHA1 hash of the encrypted directory ID in order to get a uniform length.
Encoding the hash with Base32 to get a string of printable chars.
Constructing the directory path out of the Base32-encoded hash.
dirIdHash := base32(sha1(aesSiv(dirId, null, encryptionMasterKey, macMasterKey)))
dirPath := vaultRoot + '/d/' + substr(dirIdHash, 0, 2) + '/' + substr(dirIdHash, 2, 30)
Regardless of the hierarchy of cleartext paths, ciphertext directories are always stored in a flattened structure. All directories will therefore effectively be siblings (or cousins, to be precise).
Filename Encryption
The cleartext name of a file gets encoded using UTF-8 in Normalization Form C to get a unique binary representation.
Cryptomator uses AES-SIV to encrypt names. The directory ID of the parent folder is passed as associated data. This prevents undetected movement of files between directories.
ciphertextName := base64url(aesSiv(cleartextName, parentDirId, encryptionMasterKey, macMasterKey)) + '.c9r'
Depending on the kind of node, the encrypted name is then either used to create a file or a directory.
Files are stored as files.
Non-files are stored as directories. The type of the node then depends on the directory content.
Directories are denoted by a file called
dir.c9r
containing aforementioned directory ID.Symlinks are denoted by a file called
symlink.c9r
containing the encrypted link target.Further types may be appended in future releases.
Thus, a cleartext directory structure like this:
.
├─ File.txt
├─ SymlinkToFile.txt
├─ Subdirectory
│ └─ ...
└─ ...
Becomes a ciphertext directory structure like this:
.
├─ d
│ ├─ BZ
│ │ └─ R4VZSS5PEF7TU3PMFIMON5GJRNBDWA
│ │ ├─ 5TyvCyF255sRtfrIv**83ucADQ==.c9r # File.txt
│ │ ├─ FHTa55bH*sUfVDbEb0gTL9hZ8nho.c9r # Subdirectory
│ │ │ └─ dir.c9r # contains dirId
│ │ └─ gLeOGMCN358*UBf2Qk9cWCQl.c9r # SymlinkToFile.txt
│ │ └─ symlink.c9r # contains link target
│ └─ FC
│ └─ ZKZRLZUODUUYTYA4457CSBPZXB5A77 # contains contents of Subdirectory
│ └─ ...
├─ masterkey.cryptomator
├─ masterkey.cryptomator.DFD9B248.bkup
└─ vault.cryptomator
Name Shortening
Note
This layer doesn’t provide any additional security. Its sole purpose is to maximize compatibility.
To maximize compatibility, we need to make sure the ciphertext names don’t exceed a length of 255 chars. As some cloud sync services might want to add a suffix to a file in case of conflicts, we decided to use at most 220 chars.
If an encrypted name (including its .c9r
extension) exceeds these 220 chars, we will instead create a directory named after its much shorter SHA-1 hash and the .c9s
extension.
Additionally we will create a reverse-mapping file named name.c9s
containing the original file inside of this directory.
if (length(ciphertextName) > 220) {
deflatedName := base64url(sha1(ciphertextName)) + '.c9s'
inflatedNameFilePath := deflatedName + '/name.c9s'
fileContentsPath := deflatedName + '/contents.c9r'
symlinkFilePath := deflatedName + '/symlink.c9r'
dirIdFilePath := deflatedName + '/dir.c9r'
}
Again, we have to distinguish the kind of a node.
Non-files (such as symlinks or directories) are stored as a directory anyway. Nothing changes for them.
Files, on the other hand, need a different place to store their contents. Therefore, we introduce the
contents.c9r
file inside the.c9s
directory.
A vault containing several nodes with very long names might result in a ciphertext structure like this:
.
├─ d
│ ├─ BZ
│ │ └─ R4VZSS5PEF7TU3PMFIMON5GJRNBDWA
│ │ ├─ 5TyvCyF255sRtfrIv**83ucADQ==.c9r
│ │ ├─ FHTa55bH*sUfVDbEb0gTL9hZ8nho.c9r
│ │ │ └─ dir.c9r
│ │ ├─ gLeOGMCN358*UBf2Qk9cWCQl.c9r
│ │ │ └─ symlink.c9r
│ │ ├─ IjTsXtReTy6bAAuxzLPV9T0k2vg=.c9s # shortened name...
│ │ │ ├─ contents.c9r # ...node is a regular file
│ │ │ └─ name.c9s # ...mapping to this full name
│ │ ├─ q2nx5XeNCenHyQvkFD4mxYNrWpQ=.c9s # shortened name...
│ │ │ ├─ dir.c9r # ...node is a directory
│ │ │ └─ name.c9s # ...mapping to this full name
│ │ └─ u*JJCJE-T4IH-EBYASUp1u3p7mA=.c9s # shortened name...
│ │ ├─ name.c9s # ...mapping to this full name
│ │ └─ symlink.c9r # ...node is a symlink
│ └─ FC
│ └─ ZKZRLZUODUUYTYA4457CSBPZXB5A77
│ └─ ...
├─ masterkey.cryptomator
├─ masterkey.cryptomator.DFD9B248.bkup
└─ vault.cryptomator
Backup Directory IDs
Note
This layer is optional and not required for a complete implementation of the Cryptomator Encryption Scheme. It doesn’t provide any additional security. Its sole purpose is to increase data recoverability in case of missing or damaged directory files.
By obfuscating the hierarchy of cleartext paths using dir.c9r
files, which contain directory IDs, the directory structure is more vulnerable to problems like incomplete synchronization or bit rotting.
When a directory file is missing or damaged, the dirPath
cannot be computed, which effectively makes the directory content inaccessible in the virtual filesystem.
In theory, the contents of the encrypted content of these files can be recovered.
But since the filename encryption is dependent on the directory ID of the parent folder, which is only stored in the directory file, names of all items (files, directories, or symlinks) are lost.
To alleviate this issue, a backup directory file will be stored during the creation of a directory.
Inside the ciphertext directory, a file named dirid.c9r
will be created, which contains the directory ID of its parent folder.
It is encrypted like a regular ciphertext file.