Cryptopals Set 2
Posted May 13, 2017; 15 min read
This is the second installment of a mini-series where I walk through the Cryptopals Challenges. This challenge focuses on block cipher cryptography. I suggest reading previous walk-through posts before reading this one.
Cryptopals Sets:
Warning: There are spoilers (solutions) below!
9. Implement PKCS#7 padding
Block ciphers work by encrypting single blocks of plaintext or decrypting single blocks of ciphertext. However, most messages we want to encrypt are irregularly sized and need to be padded to be a multiple the block size (usually 8 or 16 bytes).
This exercise asks us to implement PKCS#7 padding. In PKCS#7, we append a byte representing the number N
, N
times to the end of buffer such that the buffer length is a multiple of the block size. Notice that this PKCS#7 does not work with blocks of size greater than 255.
Unpad PKCS#7
We’re going to need an inverse function to “unpad” a potentially padded buffer later on, so let’s just write one up now.
Remember, if there are not N
bytes, each representing the number N
, at the end of the buffer, then the buffer has not been padded (at least not with PKCS#7)!
10. Implement CBC mode
This exercise involves writing AES-128 functions to encrypt and decrypt in CBC mode by using the AES-128 in ECB mode as done in exercise 7.
Cipherblock Chaining (CBC) mode
In CBC mode, each plaintext block is XOR’d with the previous ciphertext block before being encrypted, essentially chaining the cipherblocks together. The first block is XOR’d with an initialization vector (IV) before being encrypted and the IV is usually considered public knowledge.
This solves the ECB issue where identical plaintext blocks are encrypted to identical ciphertext blocks. However, a disadvantage (compared to EBC and other modes), is that each block cannot be encrypted in parallel and that the message must be padded to be a multiple of the block size.
We can test the correctness of the encrypt and decrypt methods by ensuring the encryption and then decryption of a plaintext returns said plaintext (i.e. d_k(e_k(p)) == p
where e_k
is an encryption function using a key k
and d_k
is an decryption function using a key k
).
The code below is a simplified version of the original version.
Decrypting the ciphertext
Ok, let’s use the aes_123_cbc_dec
defined above to decrypt the ciphertext given in this exercise.
11. An ECB/CBC detection oracle
In this exercise, we are asked to write an function to randomly encrypt a buffer with AES-128 in either ECB or CBC mode and with a random 128 bit key, and then to write an oracle that determines if a ciphertext was encrypted using ECB or CBC mode.
Let’s go step by step.
Generating a random key
I just used Python’s random
module to create N
random bytes where N
is the length of the key in bytes. However, this is almost certainly not a secure way to generate keys randomly…
Encrypting data under an unknown key
The next step is to write a function that encrypts a string. The exercise gives an exact specification of what the function should do, so let’s write it down:
- Prepend and append 5-10 bytes (the exact number chosen randomly) to the plaintext. I assume the number of bytes prepended is the same as the number of bytes appended, but that they are different and chosen randomly.
- Encrypt with AES-128 in ECB mode half the time and in CBC mode the rest of the time.
- If encrypting with AES-128 in CBC mode, use a random IV.
- The key to encrypt the input should be chosen randomly.
Note: Although not in the original specification, I decided to return a tuple containing the ciphertext and a bit representing the mode in which the plaintext was encrypted. This will help testing in the next step.
Detect the block cipher mode
To determine if a ciphertext was encrypted in ECB or CBC mode, I used the repeated_blocks
function we wrote in exercise 8. If a ciphertext has repeated blocks, I assumed it was encrypted in ECB mode. In order to test this heuristic, I ran the encryption oracle a thousand times and ensured I detected the correct mode every time.
It seems to work.
Note: I provided the lyrics to Rappers Delight as the plaintext to be encrypted in the file 11.txt
.
12. Byte-at-a-time ECB decryption (Simple)
Obviously these challenges aren’t meant to be easy, but I think the walkthrough for this question is slightly underspecified. Then again, this is the first attack that will break real crypto…
In this exercise, we break AES-128 encryption in ECB mode! We can do this with a specially crafted attack that works if we can control the first N
bytes of the plaintext.
Encryption oracle 2.0
Unlike the first from the previous exercise, the new encryption_oracle
function that should not add 5-10 random bytes to the start and end of the plaintext. I was momentarily confused by this.
The code below is a simplified version of the original version.
Detect the block size
This part of the exercise stumped for few moments. At first, I thought about reusing the same code to determine the block size from exercise 6. However, this felt like cheating since we could only do this previously by assuming that the ciphertext was encrypted in ECB mode and in this exercise, we can’t assume that (yet).
My solution was actually much easier than the previous technique. We can just increase the size of the plaintext byte-by-byte until the ciphertext increases. The jump (difference) between the ciphertext sizes should be of block size. This works because we are padding each plaintext to be a multiple of the block size.
Check if ECB mode
This attack (and exercise) is to decrypt a ciphertext encrypted in ECB mode. Needless to say, in order to carry out this attack, we must first know that the ciphertext was encrypted in ECB mode.
We can use the is_ecb_mode
function from the previous exercise but must make sure that the plaintext prefix (the part of the plaintext that we control) contains two identical blocks as the rest of the plaintext might not. I decided to use "YELLOW SUBMARINEYELLOW SUBMARINE"
as the plaintext prefix as it’s 32 bytes long (2 blocks) and both blocks are identical.
Byte-by-byte decryption
Now we can start looking at cracking the unknown string. As explained in the challenge description, we can do this by crafting special data to feed into the oracle and discovering the unknown string byte-by-byte.
If we feed data that is of length block_size - 1
, the last byte in the block of the plaintext before it is encrypted in the oracle will be the first byte of the unknown string.
Here’s a beautiful ASCII art diagram depicting the plaintext before it is encrypted (for demonstration, the block size is 8 bytes, the unknown string is "THE UKNOWN STRING..."
and the specially crafted input data is "AAAAAAA"
).
+-------------------------------
|AAAAAAAT|HE UNKNO|WN STRIN|G...
+-------------------------------
We can compare the first block of the ciphertext (of the above plaintext encrypted with our oracle) with the ciphertext returned by the oracle when passed "AAAAAAAA"
as the input data to determine if the first byte of the unknown string is "A"
. If it is not "A"
, we could try again with "AAAAAAAB"
as the input data and so on (trying all possible bytes).
Voila! We now know the first byte of the unknown string. We can discover the second by passing "AAAAAA"
(or block_size - 2
) as the specially crafted input data. We would then compare this to the ciphertext produced by the oracle when passing in "AAAAAATA"
, "AAAAAATB"
, "AAAAAATc"
and so on (assuming "T"
is the first byte of the unknown string).
I know what you’re thinking. What happens when we “run out” of bytes to pass in as the input to the oracle? Surely this will only discover the first block of the unknown string. You’re right! To solve this, we shouldn’t pass in 1 block as the input to the oracle, but instead N
bytes where N
is the length of the unknown string rounded up to the nearest block size multiple!
Note: The remaining bytes in the input don’t have to be "A"
. They just need to be the identical in both ciphertexts in order to compare the ciphertexts correctly.
Detect the unknown string size
We can detect the length of the unknown string in almost the exact same way as we detected the block size.
Cracking the unknown string
The code below is a simplified version of the original version.
13. ECB cut-and-paste
The goal of this exercise is to change the content of a ciphertext (produced using AES-128 in ECB mode) such that when it is decrypted, you have replaced some of the plaintext (that you were not in control of) with your own content.
In particular, given a function profile_for
that takes an email (say foo@bar.com
), creates a profile which it then encodes in the form email=foo@bar.com&uid=10&role=user
and then returns it encrypted, change the encrypted encoded profile to be an admin (i.e. replace role=user
with role=admin
).
The first part of the exercise isn’t very interesting. Here’s the setup functions I wrote.
Cutting-and-pasting
The actual interesting part of this exercise is replacing role=user
with role=admin
. I did this in two steps.
Firstly, I created an arbitrary email, say MY_EMAIL
, such that email=MY_EMAIL&uid=10&role=
was block aligned (i.e. the length of that string was a multiple of the block size). I then passed it to the oracle (the profile_for
function) to get a valid ciphertext and clipped it so that only the encrypted blocks of email=MY_EMAIL&uid=10&role=
were kept. I called this clipped ciphertext the profile_prefix
.
Secondly, I created another arbitrary email, say MY_SECOND_EMAIL
, where MY_SECOND_EMAIL = EMAIL_PREFIX || "admin" || EMAIL_POSTFIX
(||
denotes string concatenation). The constraints were as follows:
- The
email=EMAIL_PREFIX
string must be block aligned. - The
"admin" || EMAIL_POSTFIX
string must also be block aligned andEMAIL_POSTFIX
must be a valid PKCS#7 padding.
With these constraints, I construct the profile_postfix
by taking only the ciphertext blocks corresponding to "admin" || EMAIL_POSTFIX
. The new admin profile is then profile_prefix || profile_postfix
and we can decrypt it to check!
The code below is a simplified version of the original version.
Note: I computed everything in terms of a computed block_size
, so this function should work if we were to, say, increase the block size to 32
. Also, although I set the email to AAAAAAAAAAAAA
, the content is arbitrary as long as it is that length (13 bytes).
14. Byte-at-a-time ECB decryption (Harder)
This exercise is a rehash of exercise 12 but instead of the encryption oracle encrypting user_input || unknown_string
, the oracle encrypts random_prefix || user_input || unknown_string
. The aim of this exercise is still to decrypt the unknown_string
however the presence of the random_prefix
makes it harder. I assume the random_prefix
is constant (as the unknown_string
is constant).
This isn’t too much harder than the first. Once you determine the size of the unknown random_prefix
, you can pad it to be a multiple of the block size, and then just offset the part of the ciphertext you care about by the length of the padded prefix.
Discovering the size of the unknown random_prefix
using only the encryption oracle is a bit more difficult. I came up with a trick that works in the following way:
- Create a buffer of block size length
- Concatenate your buffer with itself
N
times (whereN
is a large positive integer) and use this as your input to the encryption oracle. For example, ifN = 2
and your buffer is"YELLOW SUBMARINE"
, your input would be"YELLOW SUBMARINEYELLOW SUBMARINE"
. - Pass your input to the encryption oracle to obtain a ciphertext. Search the ciphertext for
N
consecutive identical blocks. The index of the first byte of theN
blocks is the beginning of your input and also the length of therandom_prefix
! - If you can’t find
N
consecutive identical blocks in the ciphertext, it’s probably because therandom_prefix
is not block aligned. Prepend 1 byte to your input and go back to step 3.
Note: To get the true size of therandom_prefix
, you must subtract the number of prepended padding bytes. - If you have prepended
block_size - 1
bytes to the input and you still cannot findN
consecutive identical blocks, the ciphertext wasn’t encrypted in ECB mode and we’re out of luck.
The code below is a simplified version of the original version.
15. PKCS#7 padding validation
In this exercise, we are instructed to determine if a plaintext has a valid PKCS#7 padding. If it does, we should strip it off and if not, throw an exception. We can do this with minimal changes to the unpad_pkcs7
function we wrote in exercise 9.
16. CBC bitflipping attacks
This exercise involves cracking encryption in CBC mode! Essentially, we want to change some ciphertext, produced by an encryption oracle, such that a decryption oracle sees the string ";admin=true;"
is present in it’s plaintext.
Oracles
The oracles are fairly simple to write so there’s not much to say. The key thing is that the bytes ";"
and "="
are escaped so we can’t just pass ";admin=true;"
as the user input to the encryption oracle.
Flipping off the bits
Since we XOR each decrypted block with the previous ciphertext block in CBC mode, we can change a byte in the plaintext by changing the byte at the same index in the previous block (though this completely corrupts the previous plaintext block!). We can use this to produce the unescaped characters ";"
and "="
in the plaintext!
My solution involved passing a buffer with a throwaway block (I don’t mind if it gets corrupted) concatenated with "AadminAtrueA"
as the user input. I managed to change the "A"
bytes in "AadminAtrueA"
to either ";"
or "="
by changing the bytes at the same offset in the throwaway block using simple XOR magic.
The code below is a simplified version of the original version.
Note: I assumed that I know the exact length of the prefix (32 bytes or 2 blocks) as encryption algorithms are usually public knowledge. However, if I didn’t know the length, I could calculate it using the get_prefix_size
function that I wrote in exercise 14.
Fin
Okay, wow. This set of exercises took me considerably longer than the first set and the post ended up way too long. I’m going to think about splitting up the sets in half. Anyway, hope you enjoyed and found this post helpful!
Still have time to kill? Read through the third set!