# RA: RSA and Modular arithmetic

PDF of Eric’s handwritten notes are here.

This lecture will explain the mathematics behind the RSA cryptosystem.  Let’s begin with a brief review of the definition of modular arithmetic.

Modular Arithmetic:
Here is a simple example to illustrate modular arithmetic:
For $x \in \mathbb{Z}$,  $x$ mod 2 =  least significant bit of $x$  =

• 1 if $x$ is odd
• 0 if $x$ is even

The general definition is the following:
Definition: For integer $N \geq 1$,$x$ mod $N$ = remainder when divide $x$ by $N$= $r$, where

$x = qN + r$ and $0 \leq r < N$ for integers $q, r$.

Example: mod 3 has 3 equivalence classes:

• $\dots, -9, -6, -3, 0, 3, 6, 9, \dots$
• $\dots, -8, -5, -2, 1, 4, 7, 10, \dots$
• $\dots, -7, -4, -1, 2, 5, 8, 11, \dots$

Properties: Given $x \equiv x'$ mod $N$ and $y \equiv y'$ mod $N$, then:

• $x + y \equiv x' + y' \mod N$
• $xy \equiv x'y' \mod N$

Example: Find $2^{345}$ mod 31, given 345 = 5 x 69.

$2^{345} \equiv (2^5)^{69} \equiv (32)^{69} \equiv 1 \mod 31$

Example: input: $x, y, N$, where $x$ and $y$ are $n$-digit numbers, and $x, y \leq N$.
Complexity for the following computations:

• $x + y \mod N$:
• Addition: $O(n)$
• Module: if $x+y \geq N$: output $(x+y-N)$; else: output $x + y$. $O(n)$
• Total time: $O(n)$
• $xy \mod N$:
• Multiplication: $O(n^2)$
• Module: compute $xy$ divided by $N$ and output the remainder.
• Total time: $O(n^2)$
• $x^y \mod N$:
• Easy Idea: Compute $x^1 \mod N, x^2 \mod N, \dots, x^y \mod N$. $y$ rounds in total and $y = 2^n$, so the running time is $O(n^22^n)$, which is exponential time.
• Improvement: Compute all $x^k \mod N$, where $k$ is power of 2 and $k \leq y$. And then combine the results to get the final answer. $\log_2{y} = n$ rounds of computation in total, and $O(n^2)$ for each round,  and the same for the combining rounds. Thus the total time is $O(n^2 \log y) = O(n^3)$.
• Example: $y = 25 = (11001)_2$, then $x^y \equiv x^{25} \equiv x^{16}x^8x^2 \mod N$. And we need to compute $y^k \equiv (y^{\frac{k}{2}})^2 \mod N$, where $k = 1, 2, 4, 8, 16$. We need to do at most $n$ multiplications to get these values and another $n$ multiplications to get the final result. Thus the total time is $O(n^3)$.

Multiplicative Inverse:
Definition: $Z$ is a multiplicative inverse of $A \mod N$ if $aZ \equiv 1 \mod N$, we can also write it as $Z \equiv A^{-1} \mod N$.
Theorem: $Z$ exists iff $gcd(A, N) = 1$, (or in other words, there exists $\alpha, \beta \in \mathbb{Z}$ such that $\alpha Z + \beta A = 1$).
Example: $N = 14$, find the inverse of $A = 1, 2, 3, \dots, 13$

$A$ 1 2 3 4 5 6 7 8 9 10 11 12 13
$A^{-1}$ 1 X 5 X 3 X X X 11 X 9 X 13

To find the inverse of $A \mod N$, we can first compute $gcd(A, N)$ to verify that $gcd(A, N) = 1$ .  In addition we can find integers $\alpha, \beta$ such that $\alpha A + \beta N = gcd(A,N)$. The number $\alpha$ here is the inverse of A when $gcd(A,N)=1$.We can use the Extended Euclid’s Algorithm to find $d=gcd(A,B)$ and also these numbers $\alpha, \beta\in Z$ that $\alpha A + \beta N = d$. Let’s first review the basic Euclid’s algorithm for finding the $gcd(A,B)$.
Euclid’s Rule and Euclidean Algorithm:
Euclid’s Rule: For $x, y \in \mathbb{Z}$ such that $x \geq y > 0$, $gcd(x, y) = gcd(x \mod y, y)$.
This follows from the fact that $gcd(x,y) = gcd(x-y,y)$.  Euclid’s rule yields a simple divide and conquer algorithm for computing the gcd.
Pseudocode:
$Euclid(x, y): x \geq y \geq 0$
Output: $(d = gcd(x,y))$

• if $y = 0$:
• Return $x$
• else:
• Return $Euclid(y, x \mod y)$

Running Time:
Claim: If $y \leq x$, then $x \mod y < \frac{x}{2}$.Proof:

• $y \leq \frac{x}{2}$: $x \mod y < y \leq \frac{x}{2}$
• $y > \frac{x}{2}$: $x \mod y = x - y < \frac{x}{2}$

Hence every other round the first coordinate decreases by at least half.  Therefore, the running time satisfies: $T(n) = T(n-1) + O(n^2)$, here $n$ is the number of digit of $x$. Thus Euclid’s Algorithm takes $O(n^3)$ time to compute the $gcd(x, y)$.
We can now look at the extended version of the algorithm which also computes the numbers $\alpha,\beta$ mentioned earlier.
Pseudocode:
$Extended-Euclid(x, y): x \geq y \geq 0$
Output: $(d = gcd(x,y), \alpha, \beta)$

• if $y = 0$:
• Return ($x, 1, 0$)
• else:
• $(d, \alpha', \beta') = Extended-Euclid(y, x \mod y)$
• Return $(d, \beta', \alpha' - \lfloor \frac{x}{y} \rfloor \beta')$

Here if $d = gcd(x, y) = 1$, then $\alpha$ is the inverse of $x \mod y$ because $\alpha x + \beta y = 1$ and mod each side by $y$ gives us $\alpha \equiv x^{-1} \mod y$
Fermat’s Little Theorem and Euler’s Theorem:
Fermat’s Little Theorem:
For prime $p$, for all $a$ relatively prime to $p$ (so all $a$ where $a \neq 0 \mod p$), then $a^{p-1} \equiv 1 \mod p$
Proof:
Let $p$ be a prime and $a \neq 0 \mod p$.
Let $S = \{1, 2, \dots, p-1\}$.
Define $S' = aS = \{a \mod p, 2a \mod p, \dots, a(p-1) \mod p\}$.
E.g., $p = 7, a = 3, S = \{1, 2, 3, 4, 5, 6\}, S' = aS = \{3, 6, 2, 5, 1, 4\}$.
It turns out that in general the sets $S$ and $S'$ are identical sets, just possibly have the elements in different order.
Claim: $S = S'$
Proof of Claim: We’ll show that $S'$ has $p-1$ distinct and non-zero elements, and therefore it contains all elements of $S$ exactly once, so we’ll have that $S=S'$.
Let’s first show that the elements of $S'$ are distinct.  Assume $i,j \in S$ such that $i \neq j$ and $ai \equiv aj \mod p$.
Then since $a \neq 0 \mod p$, $a^{-1}$ mod p exists.
$a^{-1}ai \equiv a^{-1}aj \mod p \quad \Rightarrow \quad i \equiv j \mod p$
And this implies $i = j$, which is a contradiction.
Similarly, if $ai \equiv 0 \mod p$, then $i \equiv 0 \mod p$.
Thus $S'$ has the same $p-1$ elements as $S$, which gives us $S = S'$. $\square$
Using this claim, we can multiply together the elements of $S$ and we’ll get the same result as when we multiply the elements of $S'$, hence:
$(1)(2)(3)\dots(p-1) \equiv (a1)(a2)(a3)\dots(a(p-1))$
Since $p$ is prime, so $gcd(i, p) = 1$ for $i = 1, 2, \dots, p-1$.
Thus $1 \equiv a^{p-1} \mod p$ so we’re left with $1\equiv a^{p-1} \bmod p$, and this completes the proof of Fermat’s Little Theorem. $\square$
We’ll use the following generalization of Fermat’s little theorem:
Euler’s Theorem (generalization of Fermat’s Little Theorem):
For any $N$, given any $a$ relatively prime to $N$ (So $gcd(a,N)=1$)
$a^{\phi(N)} = 1 \mod N$ where $\phi(N)$  = # integers $b \in {1,...,N}$ that are relatively prime to $N$  =  $|\{b:b \in {1,...,N}\mid gcd(b, N)=1\}|$
For prime $p$, $\phi(p) = p-1$ and hence we obtain Fermat’s little theorem in this case.
For primes $p,q$, let $N = pq$, then $\phi(N) = (p-1)(q-1)$.  Why?  Well let’s look at the $N$ numbers between 1 and N and count how many are not relatively prime to $N$.  In particular, which of these $N$ numbers have a common factor with $N=pq$?  There are $q$ multiples of $p$, namely $p, 2p, \dots, qp$, and for each of these $q$ numbers $p$ is a common factor of this number and also of $N$.  Similarly there are $p$ multiples of $q$, namely $q, 2q, \dots, pq$.  And we counted $pq=qp$ twice.  Thus there are $q+p-1$ numbers between 1 and N which are not relatively prime to N and the others are relatively prime.  Therefore, $\phi(N) = pq - p - q + 1 = (p-1)(q-1)$.
So for $a$ where $gcd(a, N) = 1$, then $a^{(p-1)(q-1)} = 1$.  This is the key result for the RSA protocol.  Let us explore how we use it.
Back to Fermat’s Little Theorem
Take $b, c$ where $bc \equiv 1 \mod p-1$.
So $b \equiv c^{-1} \mod (p-1)$, which means $bc = 1+k(p-1)$ for some integer $k$.   Therefore, $a^{bc} \equiv (a)(a^{(p-1)})^k \equiv a \mod p$
Looking at Euler’s Theorem:
For primes $p, q$, let $N = pq$.
Take $d,e$ where $de \equiv 1 \mod (p-1)(q-1)$.  Thus, $de = 1+k(p-1)(q-1)$ for some integer $k$.  Therefore, for $a$ relatively prime to $N$, $a^{de} \equiv (a) (a^{(p-1)(q-1)})^k \equiv a \mod N$.
This important property is used in RSA. Someone can find large primes $p, q$ and compute $d,e$. They keep $d$ but give out the public $N, e$. Anyone who wants to send a secret message can encrypt the original message $m$ by $m^e \mod N$ to get the encrypted message $y$. When receiving that $y$, they  can recover the original message $m$ by decrypting using $y^e \equiv m^{de} \equiv m \mod N$. We’ll talk about the details of the protocol in a moment.
Cryptography
Alice has a message $m$ that she wants to send to Bob, but Eve can see the message sent.
$m \rightarrow encoder \rightarrow (e(m)) \rightarrow decoder \rightarrow m=d(e(m))$
Eve sees $e(m)$
Alice & Bob meet beforehand and decide on a $n$-bit string $r$.
Then Alice sends:
$e(m) = m \oplus r$. Here $\oplus$ is Exclusive OR.
Bob decrypts:
$d(e(m)) = e(m) \oplus r = m$
How should they pick $r$?
Choose a random string, then $e(m)$ is also a random string.
Problem:
1) need to meet or have a secure, private computation beforehand
2) can only use it once.
E.g., suppose Alice sends $m_1$ and $m_2$ using same $r$, then Eve can see
$(m_1 \oplus r) \oplus (m_2 \oplus r) = m_1 \oplus m_2$
which might have some useful info, e.g. if one of the has a string of 0’s then you see the other one.
Public-key ‘crypotgraphy’:
Bob publishes a public key: anyone can send a message to Bob uses his public key to encrypt, while Bob uses his private, secret key to decrypt the received message.
Protocol:
1) Bob picks 2 $n$-bit random primes $p, q$
-How?
Choose random $n$-bit numbers and check if they are prime (we’ll see how to do that next class). It has prob $\approx \frac{1}{n}$ of being prime.
2) Bob chooses an $e$ relatively prime to $(p-1)(q-1)$
He does this by trying $e=3,5,7,11$. And he also computes $N = pq$.
3) Bob publishes his public key as $(N,e)$ where $N = pq$. (he doesn’t reveal $p$ or $q$, just $pq$)
4) Bob computes his private key
$d \equiv e^{-1} \mod (p-1)(q-1)$
He does this using the extended Euclid algorithm.
Alice:
to send the message m to Bob
1) She looks up his public key $(N, e)$
2) She computes $y = m^e \mod N$ using fast modular exponentiation algorithm that we saw.
3) She sends $y$ to Bob
Bob:
1) He receives $y$ and decrypts using $m = y^d \mod N$ using fast modular exponentiation.
Key assumption:
Given $N, e, y$, it is computationally difficult to determine $m$.
Natural way is to factor $N$ to get $p, q$ then one can get $(p-1)(q-1)$.
We saw before that:
Since $de \equiv 1 \mod (p-1)(q-1)$ we have that $de = 1 + k(p-1)(q-1)$ for some integer $k$.   Moreover, if $m$ is relatively prime to $N$ then by Euler’s theorem $m^{(p-1)(q-1)} \equiv 1 \mod N$.  Therefore we have the following:
$y^d \equiv (m^e)^d \equiv m^{ed} \equiv m^{1+k(p-1)(q-1)} \equiv m \times (m^{(p-1)(q-1)})^k \equiv m \mod N$
The remaining case is when $m$ has $gcd(m, N) > 1$.  This case requires the which we have not covered so we skip this case but the identity $m^{ed} \equiv m \mod N$ still holds true when $gcd (m,N)>1$.  (The proof using the Chinese remainder theorem is not difficult, here’s a good explanation that I found.)
This gives us $y^d \equiv m \mod N$, so RSA works.
RSA = Rivest-Shamir-Adelman published in 1977.
Clifford Cocks – working for UK intelligence agency described a similar system in ’73.
– Note if $e=3$ (or is small), one needs to make sure $m^e > N$ otherwise mod $N$ isn’t doing anything and to decrypt one can just take $y^{\frac{1}{3}}$
So often “pad” $m$.
– If send same $m$ to $e$ or more people all using the same $e$ then can decrypt (see HW problems).