100 post of code #2: One-time pad encryption

I like programming and learning about stuff I find interesting. Particularly interested in functional programming and typing systems.
I have a background in Cognitive Science/Psychology and Mathematics. Sadly halted due to physical and mental health issues arising from getting Covid. Which unfortunately hit me quite hard back on 2020. I'm still dealing with some sequels so I would prefer working from home.
I find it fun trying to find insights that cut across disciplines. I tend to favor the "theoretical" side of things. But I also try to get as much hands-on experience as possible.
I am writing a blog. Trying to share what I learn, and that others might find useful. I try to focus on the unique things I can bring to the table. Hoping to add value to the life of other developers. I always have a thousand ideas racing through my mind, so I don't have any trouble coming up with ideas on what to write. If any, I have a hard time cutting down on the number of things I want to write about :D
On a personal note. I have always been quite solitary and introverted. But I don't think I'm shy. I love videogames and started programming because I wanted to make my own.
One-time pad encryption
What is a one-time pad?
A one-time pad is a way to encrypt data by combining it with some random dummy data. When you encrypt data this way you end up with two keys, the product, and the dummy.
The resulting encryption is unbreakable. On their own, each key is completely useless. You need both if you want to get the original back.
Before jumping into the implementation in Python, let's consider the underlying mathematics. Feel free to skip that part if you only want to see the Python implementation!
The $XOR$ operation
Being a bitwise operation, $XOR$ takes two elements of the set \( \{1,0\}\) and returns one element of the same set. For any $XOR(a,b)$, we can fully define $XOR$ with the following truth table:
| $a$ | $b$ | $XOR$ |
| 1 | 1 | 0 |
| 1 | 0 | 1 |
| 0 | 1 | 1 |
| 0 | 0 | 0 |
Return true if \(a\neq b\), and false otherwise. That is, it asks the question: "is $a$ different from $b$?"
Let's consider a concrete example. And see how we can implement a one-time pad using this operation.
Consider the original message, represented as a bitstring: \(o = 111000\) and some dummy data \(d=101011\). Then the product \(p=XOR(o,d)\) evaluates as follows:
$$ \begin{matrix} 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 1\\ \hline 0 & 1 & 0 & 0 & 1 & 1 \end{matrix} $$
But note that \(XOR(p,d) = o\). That is, applying XOR to the product and the dummy yields the original message:
$$ \begin{matrix} 0 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 1 \\ \hline 1 & 1 & 1 & 0 & 0 & 0 \\ \end{matrix} $$
We have that for any three bitstrings of the same length $o,d,p$ the following equations are true:
$$ XOR(o,d) = p $$ $$ XOR(d,p) = o $$ $$ XOR(o,p) = d $$
Which is the mathematical basis for our one-time pad encryption.
By the way, note that it is not possible to "get out" of the set \( \{o,p,d\} \) by using the $XOR$ operation alone. This is a property called closure. We say that the set \(\{o,p,d\}\) is closed under $XOR$. With this, we can guarantee that we can both encrypt and unencrypt the original message.
I mentioned at the beginning of this post that this encryption is unbreakable. This follows from the closure. We can guarantee it! If the set \( \{ o_i,d_i,p_i \} \) is closed under $XOR$, then there is no way to reach \(o_i\), if we don't have both \(d_i\) and \(p_i\).
Let's now see how to write it in Python.
Generating pseudo-random data
The dummy data must meet three criteria. First, it must be of the same length as the original data. Second, it must be random. And third, it must be kept secret.
There is no such thing as true randomness in computer programs. But the next best thing we can do is to use a pseudo-random generator. The secrets module from Python contains a function for generating pseudo-random data, token_bytes().
We will generate a sequence of pseudo-random bytes of a given length. Then we will cast it as a Python int. We don't have to worry about the size of the sequence as Python will adjust the size of int as needed.
from secrets import token_bytes
def random_key(length: int) -> int:
tb: bytes = token_bytes(length) # length random bytes
return int.from_bytes(tb, "big")
We are using an int as a generic bit string. The int.form_bytes() part will take a $n$-bytes string and turn it into an $8n$-bit number (because there are 8 bits in one byte). For example, if we have 10 bytes we end up with an 80-bit integer. The int type on Python can be of any arbitrary size, as needed. If you want to see another example, I used the same trick in my previous post
Encrypting the original data and generating keys
We can think of an str as a sequence of bytes. A UTF-8 encoded character takes a byte each. We can convert from str to bytes using the encode() of str. And back with the decode() method of the bytes type.
To encrypt we must first convert an input string into bytes. Then we can use the random_key() function we wrote to generate a dummy byte sequence of the same length.
from typing import Tuple
def encrypt (original: str) -> Tuple[int, int] :
original_bytes: bytes = original.encode()
dummy: int = random_key(len(original_bytes))
original_key: int = int.from_bytes(original_bytes, "big")
encrypted: int = original_key ^ dummy # XOR on bit strings
return dummy, encrypted
Note the function signature. encrypt() returns a tuple of two int. Those are the two keys that result from the encryption. We need both to decrypt our original message.
The actual encryption takes place on the line with the original_key ^ dummy operation. Note that both of those are of type int and are being used as bit strings.
The ^ symbol is the operator for the bitwise $XOR$ operation. It is pretty straightforward.
Decrypting and recovering the original data
The decryption function is pretty straightforward. We first apply the $ XOR $ operation. Then convert the bit-carrying int into bytes that we can interpret as a string.
def decrypt (dummy, encrypted) -> str:
decrypted: int = dummy ^ encrypted
temp: bytes = decrypted.to_bytes((decrypted.bit_length() + 7) // 8, "big")
return temp.decode()
Finally, let's write a simple test to see if our one-time pad works as expected:
def test_encryption() :
original: str = "Subscribe to my blog!"
key1, key2 = encrypt(original)
result: str = decrypt(key1, key2)
if (result == original) :
print ("test successful")
print (f"original: {original}")
print (f"result:{result}")
else :
print ("test failed")
print (f"original: {original}")
print (f"result:{result}")
And let's have it run the test when running the file:
if __name__ == "__main__":
test_encryption()
And the result:

Recap
We learned what a one-type pad is and why it is a kind of unbreakable encryption (although it is not very safe)
We saw how the XOR bitwise operation works. And how it forms the underlying mathematical basis for this type of encryption. I also hinted at an important property: closure. I'm sure I will come back to it in the future!
We briefly saw how to generate (pseudo) random data. And how to use a Python int as an arbitrary-sized bit string.
Finally, we used the tools we learned to write a one-time pad in Python.
Help me grow by leaving me some feedback. See you in my next post!





