Welcome to the third installment of my Cryptopals Challenges mini-series / walkthrough! Some challenges focus on block cipher cryptography (as before) and others focus on stream ciphers. As always, I highly recommend reading previous challenge sets before this one. I often refer back to previous solutions!
The first part of this exercise asks us to write a function to pad and encrypt a random string in a set of given plaintexts and to write a function that checks the padding of the plaintext given the ciphertext, iv and key.
The existence of a padding oracle lets us decrypt the ciphertext! We can corrupt specific bytes within a ciphertext block to discover the associated byte within the plaintext block! A good description (which I used) of the padding oracle attack can be found on Wikipedia.
For example, lets say we have N ciphertext blocks. We can decrypt the second block by corrupting the first block like so:
Let the last byte in the ciphertext block C_1 be c_1_15. Corrupt C_1 so that c_1_15 = c_1_15 ⊕ gp_2_15 ⊕ 0x01 where gp_2_15 is our guess for the last byte in the second plaintext block called p_2_15 (⊕ is the XOR sign).
If our guess for p_2_15 is correct, when we decrypt the C_1|C_2, the byte p_2_15 will likely 0x01 and will have valid padding. There is a possibility that p_2_15 is 0x02 and p_2_14 is also 0x02, which would produce a valid padding, but it’s less likely.
If our guess in incorrect, try the other 254 possibilities for p_2_15 until the padding succeeds!
Now corrupt c_1_14 = c_1_14 ⊕ gp_2_14 ⊕ 0x02 and c_1_15 = c_1_15 ⊕ gp_2_15 ⊕ 0x02. In the unlikely even that we incorrectly guessed p_2_15, we should notice the mistake in this step and we can go back to step 2.
Keep on going until we have correctly guessed the whole block!
Now do the same for the third block by corrupting the second block and decrypting C_1|C_2|C_3! We actually only need to decrypt C_2|C_3!
Note: We can decrypt the first ciphertext block by corrupting the IV block.
This problem was quite tricky and I discovered one or two bugs in previous
padding code! Here’s my solution:
Update (28 Nov 2017): The crack method above should instead iterate over every block in the original ciphertext (i.e. range(len(ciphertext) / AES.block_size - 1)). The code above incorrectly iterates over the iv as well. Credit to Doug Friedman for pointing this bug out and providing the solution.
18. Implement CTR, the stream cipher mode
This challenge involves implementing the CTR encryption mode (otherwise known as COUNTER mode for good reason). I’m not going to explain how it works as the challenge does a good job of doing that. However, remember to decode the ciphertext provided!
My implementation uses AES in ECB mode to create the keystream. This wasn’t clear in the challenge description but it produced the correct answer.
19. Break fixed-nonce CTR mode using substitutions
This problem teaches us how to break encryption in CTR mode if the nonce and key used is the same for multiple ciphertexts.
My “solution” involved guessing keystream bytes and decrypting (XOR’ing) respective ciphertext bytes. I compared byte N across each ciphertext and determined the most likely guess based on English letter frequencies. Unfortunately, as each ciphertext had different lengths, the last few characters of my keystream guess were incorrect!
Note: On a more serious note (sorry Vanilla), the text provided is the Easter, 1916 poem by W. B. Yeats.
I’m pretty sure this could be avoided using common English 2/3-grams. Unfortunately, I’m lazy so let’s move on…
20. Break fixed-nonce CTR statistically
Ok, so mistakes were made! I actually solved this (in a sense), in exercise 19. I should have done it manually in exercise 19 and statistically now. Let’s do it slightly differently using our solution from exercise 6!
Unfortunately, I still get the same error where my key is slightly wrong but I think this is due to my crappy ‘score’ algorithm which scores a piece of text based on how close it is to English.
Note: My plaintext is printed as one long string if I use code from exercise 6 so I truncated it slightly. Fixing this is left as a exercise for the reader.
21. Implement the MT19937 Mersenne Twister RNG
This exercise asks the victim (me) to implement the Mersenne Twister RNG, the most widely used general-purpose pseudorandom number generator. We’re instructed to use the pseudo-code from the Wikipedia article but since it already provides a Python implementation, I’m using that. Next!
(I feel like I’ve been cheating quite a bit in this set of challenges eee).
22. Crack an MT19937 seed
In this challenge, we crack a MT19937 seed! However, our cracking program assumes that the random numbers generated are seeded with a UNIX timestamp. Realistically, I believe a lot of people out there will seed their PRNGs with the current system time so this seems like a realistic attack. I decided to brute force this!
23. Clone an MT19937 RNG from its output
In this exercise, we copy a MT19937 PRNG! Essentially, we can copy the state of an existing MT19937 by observing 624 consecutive generated random numbers and “untemper-ing” them! Attackers can learn what your PRNG will produce in the future using this method!
I admit, struggled quite a bit with the untempering aspect of this question as it involves inverting functions in the form: f(x) = x >> 18 (not so hard) and the function g(x) = x >> 15 & 4022730752. I adapted an existing algorithm shared by James Roper for the second form of functions.
Note: The method MT19937.create_from_state initializes a new MT19937 object with the provided internal state!
24. Create the MT19937 stream cipher and break it
Ok, we’re at the last exercise in this set of challenges! In this exercise we’re tasked with writing a (broken) stream cipher that’s based on the MT19937 PRNG and then breaking it. First things first, let’s create the stream cipher!
Now that that’s done, we’re tasked with encrypting a known plaintext prefixed with a random number of random characters and from it’s ciphertext, we need to recover the 16-bit seed (the “key”).
Since the seed is only 16-bits long, I decided to brute force it. If I missed the point of this sub-exercise, I apologise.
The last part is quite similar but I’m not sure if I quite understood the task at hand (it seems too simple and doesn’t seem to reveal any glaring attacks). Please let me know if you think I’ve missed the point with my solution.
Fin
Another set of challenges done! I’m not too sure if I understood the point behind many of the exercises in this set so I would really appreciate any feedback! I think the CBC padding oracle exercise was my favourite and probably the most difficult. I also think this write-up was more casual and like a stream of thoughts than the previous two. I didn’t think there was that much to explain this time…
Anyway, I hope you enjoyed and stay tuned for the next set!