duyuefeng0708
GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/foundations/06-elliptic-curves/sage/06e-scalar-multiplication.ipynb
Kernel: SageMath 10.5

Notebook 06e: Scalar Multiplication

Module 06. Elliptic Curves


Motivating Question. To use elliptic curves for cryptography, we need to compute $kP$ (adding $P$ to itself $k$ times) for very large $k$, say $k \approx 2^{256}$. Adding $P$ one step at a time would take $2^{256}$ operations, which is impossibly slow. Is there a shortcut? Yes: the double-and-add algorithm computes $kP$ in only $O(\log k)$ steps, just as square-and-multiply computes $g^k$ efficiently in $(\mathbb{Z}/p\mathbb{Z})^*$.


Prerequisites. You should be comfortable with:

  • Point addition and doubling on elliptic curves (Notebook 06b)

  • Group structure and point orders (Notebook 06d)

  • Square-and-multiply for modular exponentiation (Module 04/05)

Learning objectives. By the end of this notebook you will be able to:

  1. Implement naive repeated addition and understand why it is too slow.

  2. Implement the double-and-add algorithm and prove it is $O(\log k)$.

  3. Compare the performance of naive vs double-and-add.

  4. Understand the analogy between double-and-add and square-and-multiply.

  5. Appreciate that scalar multiplication is the one-way function underlying EC crypto.

1. Naive Scalar Multiplication

The simplest approach: add $P$ to itself $k$ times.

$$kP = \underbrace{P + P + \cdots + P}_{k \text{ times}}$$

This requires $k - 1$ point additions. For $k \approx 2^{256}$, this is $\approx 10^{77}$ operations, far beyond feasibility.

```python
def scalar_mul_naive(P, k):
    """Compute kP by repeated addition. O(k), only for small k!"""
    E = P.curve()
    result = E(0)  # O (identity)
    for _ in range(k):
        result = result + P
    return result

# Test on a small curve
E = EllipticCurve(GF(101), [1, 1])
P = E(0, 1)

# Small k: works fine
k = 17
result = scalar_mul_naive(P, k)
print(f"P = {P}")
print(f"{k}·P (naive) = {result}")
print(f"{k}·P (SageMath) = {k * P}")
print(f"Match? {result == k * P}")
```
```python
# Measure how naive scales
print("k        Time (ms)    # additions")
for k in [100, 1000, 10000, 50000]:
    start = walltime()
    result = scalar_mul_naive(P, k)
    elapsed = (walltime() - start) * 1000
    assert result == k * P, "Mismatch!"
    print(f"{k:<8} {elapsed:>9.1f}    {k - 1}")

print("\nAt this rate, k = 2^256 would take ≈ 10^70 years. Not practical!")
```

Checkpoint 1. Why is naive scalar multiplication $O(k)$ while we need $O(\log k)$? The key insight is the same as for exponentiation: we can double the point instead of adding it one at a time. Doubling $P$ to get $2P$, then doubling again to get $4P$, and so on, reaches $2^m P$ in only $m$ steps.
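To see the power of doubling concretely, here is a plain-Python sketch (no Sage needed) that uses the integers under addition as a stand-in additive group, where "doubling a point" is just adding it to itself:

```python
def repeated_doubling(P, m):
    """Reach (2^m)·P in m doublings; the 'group' here is the integers
    under addition, so doubling is just P + P."""
    Q = P
    for _ in range(m):
        Q = Q + Q  # one point doubling per step
    return Q

# 256 doublings reach 2^256 · P, a multiple naive addition could never touch
P = 1
assert repeated_doubling(P, 256) == 2**256
print(f"256 doublings reach a {(2**256).bit_length()}-bit multiple of P")
```

The same loop works verbatim for Sage curve points, since only `+` is used.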

2. The Double-and-Add Algorithm

Idea: Write $k$ in binary: $k = \sum_{i=0}^{m} k_i \cdot 2^i$ where $k_i \in \{0, 1\}$. Then:

$$kP = \sum_{i\,:\,k_i = 1} 2^i P$$

We precompute $P, 2P, 4P, 8P, \ldots$ by repeated doubling, and add together only the terms where $k_i = 1$.
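The decomposition can be checked in plain Python (no Sage needed) for a sample scalar:

```python
k = 73
# bits of k, least-significant first
bits = [(k >> i) & 1 for i in range(k.bit_length())]
terms = [2**i for i, b in enumerate(bits) if b == 1]
print(f"{k} = {' + '.join(str(t) for t in terms)}")  # 73 = 1 + 8 + 64
assert sum(terms) == k
# so 73·P = 1·P + 8·P + 64·P: three additions of precomputed doublings
```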

| Analogy | Multiplicative group | Additive (EC) group |
|---|---|---|
| Operation | Multiplication | Addition |
| Repeated operation | $g^k$ (exponentiation) | $kP$ (scalar multiplication) |
| Efficient method | Square-and-multiply | Double-and-add |
| Complexity | $O(\log k)$ squarings + multiplications | $O(\log k)$ doublings + additions |
```python
def double_and_add(P, k):
    """
    Compute kP using double-and-add. O(log k) point operations.

    Algorithm (right-to-left binary method):
      - R starts as O (identity)
      - Q starts as P
      - For each bit of k (LSB to MSB):
          - If bit is 1: R = R + Q
          - Q = 2Q (double)
    """
    E = P.curve()
    R = E(0)  # accumulator, starts at identity
    Q = P     # current power of 2 times P
    additions = 0
    doublings = 0
    while k > 0:
        if k % 2 == 1:  # current bit is 1
            R = R + Q
            additions += 1
        Q = Q + Q       # double Q
        doublings += 1
        k = k // 2
    return R, additions, doublings

# Test
E = EllipticCurve(GF(101), [1, 1])
P = E(0, 1)
k = 73
result, adds, dbls = double_and_add(P, k)
print(f"k = {k} = {bin(k)} (binary)")
print(f"Bits of k: {k.bit_length()}")
print(f"Additions: {adds}, Doublings: {dbls}")
print(f"Total operations: {adds + dbls} (vs {k-1} for naive!)")
print(f"Result: {result}")
print(f"SageMath: {k * P}")
print(f"Match? {result == k * P}")
```

Misconception alert. "Double-and-add always does exactly $\log_2 k$ additions." Not quite: it does $\log_2 k$ doublings, but the number of additions equals the Hamming weight of $k$ (the number of 1-bits in its binary representation). For a random 256-bit $k$, this is about 128 additions + 256 doublings $\approx 384$ total operations.
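The operation counts are easy to check in plain Python; the formulas below mirror the right-to-left method used in this notebook (one doubling per bit, one addition per 1-bit):

```python
import random

def dbl_add_counts(k):
    """Return (#doublings, #additions) double-and-add performs for scalar k."""
    dbls = k.bit_length()        # one doubling per bit processed
    adds = bin(k).count("1")     # one addition per 1-bit (Hamming weight)
    return dbls, adds

dbls, adds = dbl_add_counts(0b101011)  # k = 43, six bits, Hamming weight 4
assert (dbls, adds) == (6, 4)

# average Hamming weight of random 256-bit scalars is close to 128
random.seed(0)
avg = sum(bin(random.getrandbits(256)).count("1") for _ in range(1000)) / 1000
print(f"average additions for 256-bit k: {avg:.1f} (expected ≈ 128)")
```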

```python
# Trace the algorithm step by step for k = 43
E = EllipticCurve(GF(101), [1, 1])
P = E(0, 1)
k = 43  # binary: 101011

print(f"Computing {k}·P where P = {P}")
print(f"k = {k} = {bin(k)} in binary")
print(f"Binary digits (LSB first): {[int(b) for b in reversed(bin(k)[2:])]}")
print()

R = E(0)
Q = P
temp_k = k
step = 0
print("Step  Bit  Action            R                Q (= 2^i·P)")
while temp_k > 0:
    bit = temp_k % 2
    if bit == 1:
        R = R + Q
        action = f"R += Q (2^{step}·P)"
    else:
        action = "(skip)"
    r_str = f"({int(R[0])}, {int(R[1])})" if R != E(0) else "O"
    q_str = f"({int(Q[0])}, {int(Q[1])})" if Q != E(0) else "O"
    print(f"{step:<5} {bit}    {action:<17} {r_str:<16} {q_str}")
    Q = Q + Q  # double
    temp_k = temp_k // 2
    step += 1

print(f"\nResult: {k}·P = {R}")
print(f"Verify: {k * P}")
print(f"Match? {R == k * P}")
```

3. Performance Comparison

Let us compare naive vs double-and-add on increasingly large scalars.

```python
# Use a larger curve for meaningful timing
p = next_prime(2^32)
E = EllipticCurve(GF(p), [1, 1])
P = E.random_point()

# Double-and-add for various k sizes
bits_list = [16, 32, 64, 128, 256, 512]
daa_times = []
print("Bits of k   k (hex, first 8 chars)   D&A time (ms)   # ops")
for bits in bits_list:
    k = randint(2^(bits-1), 2^bits - 1)
    start = walltime()
    result, adds, dbls = double_and_add(P, k)
    elapsed = (walltime() - start) * 1000
    assert result == k * P
    daa_times.append(elapsed)
    k_hex = hex(k)[:10] + "..."
    print(f"{bits:<11} {k_hex:<24} {elapsed:>13.2f}   {adds + dbls}")

# Plot
G = list_plot(list(zip(bits_list, daa_times)), plotjoined=True,
              color='blue', marker='o', size=60, legend_label='Double-and-add')
G.show(figsize=[9, 5], axes_labels=['Bits of $k$', 'Time (ms)'],
       title='Double-and-add: time scales linearly with bit-length of $k$',
       gridlines=True)
print("\nTime is O(log k) = O(bits of k). Doubling the bit-length roughly doubles the time.")
```

Checkpoint 2. For a 256-bit scalar, double-and-add needs $\approx 256$ doublings and $\approx 128$ additions, for a total of $\approx 384$ point operations. Naive would need $\approx 2^{256}$ additions. The speedup factor is $\frac{2^{256}}{384} \approx 10^{74}$. This is the difference between "impossible" and "milliseconds."
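The arithmetic behind this speedup estimate, in plain Python:

```python
naive_ops = 2**256 - 1           # one point addition per increment
daa_ops = 256 + 128              # ≈ doublings + average additions
speedup = float(naive_ops) / daa_ops
print(f"speedup ≈ {speedup:.2e}")  # on the order of 10^74
```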

4. The One-Way Function

Scalar multiplication is the one-way function of elliptic curve cryptography:

| Direction | Operation | Complexity |
|---|---|---|
| Forward (easy) | Given $P, k$, compute $Q = kP$ | $O(\log k)$ via double-and-add |
| Backward (hard) | Given $P, Q$, find $k$ such that $Q = kP$ | Best known: $O(\sqrt{n})$ (Pollard's rho), where $n$ is the order of $P$ |

This asymmetry is the Elliptic Curve Discrete Logarithm Problem (ECDLP).
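A rough back-of-envelope comparison of the two directions (the operation counts are order-of-magnitude estimates, not measurements):

```python
import math

n_bits = 256                          # group order around 2^256
forward_ops = 1.5 * n_bits            # double-and-add: ~256 doublings + ~128 additions
backward_ops = math.sqrt(2**n_bits)   # Pollard's rho: O(sqrt(n)) group operations
print(f"forward:  ~{forward_ops:.0f} point operations")
print(f"backward: ~2^{n_bits // 2} ≈ {backward_ops:.1e} point operations")
```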


Bridge from Module 05. In Module 05, the one-way function was $g^k \bmod p$ (fast via square-and-multiply, hard to invert). Here the one-way function is $kP$ (fast via double-and-add, hard to invert). The mathematical structure is different, but the cryptographic paradigm is identical.
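To make the analogy concrete, here is square-and-multiply written so that each line has a double-and-add counterpart (plain Python; the built-in `pow` is used only to check the result):

```python
def square_and_multiply(g, k, p):
    """Compute g^k mod p — the multiplicative twin of double-and-add."""
    result = 1    # multiplicative identity (additive analogue: R = O)
    base = g % p  # analogue of Q = P
    while k > 0:
        if k & 1:                     # bit is 1: multiply (analogue: R = R + Q)
            result = (result * base) % p
        base = (base * base) % p      # square (analogue: Q = 2Q)
        k >>= 1
    return result

p, g, k = 1000003, 5, 73
assert square_and_multiply(g, k, p) == pow(g, k, p)
print(f"{g}^{k} mod {p} = {square_and_multiply(g, k, p)}")
```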

```python
# Demonstrate the asymmetry: fast forward, slower backward
p = next_prime(2^40)
E = EllipticCurve(GF(p), [1, 1])
G = E.gens()[0]
n = G.order()
k_secret = randint(2, n - 2)

# Forward: fast
start = walltime()
Q = k_secret * G
forward_time = (walltime() - start) * 1000

# Backward: Sage's generic discrete log (Pohlig-Hellman + baby-step
# giant-step), O(sqrt(n)) — already far slower than the forward direction
start = walltime()
k_found = discrete_log(Q, G, n, operation='+')
backward_time = (walltime() - start) * 1000

print(f"Curve over F_p where p ≈ 2^40; ord(G) has {n.ndigits()} digits")
print(f"\nForward (compute kG):      {forward_time:.2f} ms")
print(f"Backward (find k from Q):  {backward_time:.1f} ms")
print(f"Ratio: backward is {backward_time/forward_time:.0f}x slower")
print(f"Correct? {k_found == k_secret}")
```

5. Windowed and Montgomery Methods (Overview)

Double-and-add is the basic algorithm, but real implementations use optimizations:

| Method | Idea | Speed-up |
|---|---|---|
| Double-and-add (basic) | Process 1 bit at a time | Baseline |
| $w$-ary method | Process $w$ bits at a time, precompute $P, 2P, \ldots, (2^w - 1)P$ | Fewer additions |
| NAF (Non-Adjacent Form) | Use digits $\{-1, 0, 1\}$ to reduce additions | ~33% fewer additions |
| Montgomery ladder | Always do the same operations regardless of bit value | Side-channel resistant |
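As an illustration of the NAF row, here is a minimal plain-Python sketch of the signed-digit recoding (on a curve, digits of $-1$ are cheap because negating a point only flips the $y$-coordinate):

```python
def naf(k):
    """Non-adjacent form of k: digits in {-1, 0, 1}, least-significant first,
    with no two adjacent nonzero digits."""
    digits = []
    while k > 0:
        if k % 2 == 1:
            d = 2 - (k % 4)  # 1 if k ≡ 1 (mod 4), else -1
            k -= d
        else:
            d = 0
        digits.append(d)
        k //= 2
    return digits

k = 255  # 2^8 - 1: eight 1-bits in binary, but only two nonzero NAF digits
d = naf(k)
assert sum(di * 2**i for i, di in enumerate(d)) == k
print(f"binary weight of {k}: {bin(k).count('1')}, NAF weight: {sum(1 for x in d if x)}")
```

Fewer nonzero digits means fewer point additions, since each nonzero digit costs one addition (or subtraction) of a precomputed point.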

The Montgomery ladder is particularly important for security: it prevents attackers from learning secret bits by measuring timing or power consumption.

```python
def montgomery_ladder(P, k):
    """
    Compute kP using the Montgomery ladder.
    Constant-time: always performs both an addition and a doubling per bit.
    """
    E = P.curve()
    R0 = E(0)  # will hold floor(k/2^i) * P at each step
    R1 = P     # will hold (floor(k/2^i) + 1) * P

    # Process bits from MSB to LSB
    for i in range(k.bit_length() - 1, -1, -1):
        bit = (k >> i) & 1
        if bit == 0:
            R1 = R0 + R1
            R0 = R0 + R0  # double
        else:
            R0 = R0 + R1
            R1 = R1 + R1  # double
    return R0

# Test Montgomery ladder
E = EllipticCurve(GF(101), [1, 1])
P = E(0, 1)
for k in [1, 2, 7, 43, 73, 97]:
    result = montgomery_ladder(P, k)
    expected = k * P
    status = "✓" if result == expected else "✗"
    print(f"{k}·P: Montgomery = ({int(result[0])}, {int(result[1])}), "
          f"SageMath = ({int(expected[0])}, {int(expected[1])}) {status}")
```

Checkpoint 3. Why is constant-time execution important? If double-and-add takes different amounts of time for bits 0 vs 1, an attacker measuring execution time can deduce the secret scalar $k$ bit by bit. This is a timing side-channel attack. The Montgomery ladder defeats this by performing the same operations regardless of the bit value.
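The ladder's fixed operation pattern can be checked in plain Python, again using the integers under addition as a stand-in group (note this sketch still branches on the bit; hardened implementations replace the branch with a constant-time conditional swap):

```python
def ladder(P, k):
    """Montgomery ladder in a stand-in additive group (the integers):
    every bit costs exactly one addition and one doubling."""
    R0, R1 = 0, P  # identity and P; invariant: R1 = R0 + P
    for i in range(k.bit_length() - 1, -1, -1):
        if (k >> i) & 1:
            R0, R1 = R0 + R1, R1 + R1
        else:
            R1, R0 = R0 + R1, R0 + R0
    return R0

assert ladder(5, 7) == 35          # 7·P with P = 5
assert all(ladder(3, k) == 3 * k for k in range(1, 50))
print("ladder matches k·P for all tested scalars")
```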

Crypto foreshadowing. In the next notebook, we will use scalar multiplication as the core operation in ECDH (key exchange) and ECDSA (digital signatures). The security of both protocols rests on the ECDLP: computing $kP$ is easy, but recovering $k$ from $P$ and $kP$ is believed infeasible for properly chosen curves.

6. Exercises

Exercise 1 (Worked): Tracing Double-and-Add

Problem. On $E: y^2 = x^3 + x + 1$ over $\mathbb{F}_{23}$, compute $11P$ where $P = (0, 1)$ using double-and-add. Show each step.

Solution. $11 = 1011_2$, so we process bits right-to-left:

| Step | Bit | Action | $R$ | $Q$ |
|---|---|---|---|---|
| 0 | 1 | $R = R + Q$ | $P$ | $P$ |
| 1 | 1 | $R = R + Q$ | $P + 2P = 3P$ | $2P$ |
| 2 | 0 | (skip) | $3P$ | $4P$ |
| 3 | 1 | $R = R + Q$ | $3P + 8P = 11P$ | $8P$ |

Result: $R = 11P$. Let us verify:

```python
# Exercise 1: verification
E = EllipticCurve(GF(23), [1, 1])
P = E(0, 1)

# Step by step
print(f"P = {P}")
print(f"2P = {2*P}")
print(f"3P = P + 2P = {3*P}")
print(f"4P = {4*P}")
print(f"8P = {8*P}")
print(f"11P = 3P + 8P = {11*P}")
print()

# Using our double-and-add
result, adds, dbls = double_and_add(P, 11)
print(f"double_and_add(P, 11) = {result}")
print(f"Additions: {adds}, Doublings: {dbls}")
print(f"Match? {result == 11 * P}")
```

Exercise 2 (Guided): Operation Count

Problem. For a 256-bit scalar $k$:

  1. How many doublings does double-and-add perform?

  2. On average, how many additions? (Hint: each bit is 1 with probability 1/2.)

  3. Compare the total operation count with naive scalar multiplication.

Fill in the TODO cells below.

```python
# Exercise 2: fill in the TODOs
bits = 256

# TODO 1: doublings = ?
# doublings = ???

# TODO 2: expected additions = ?
# avg_additions = ???

# TODO 3: Compare with naive
# naive_ops = 2^bits - 1
# daa_ops = doublings + avg_additions
# speedup = naive_ops / daa_ops
# print(f"Naive:   {naive_ops:.2e} operations")
# print(f"D&A:     {daa_ops} operations")
# print(f"Speedup: {speedup:.2e}x")
```

Exercise 3 (Independent): ECDLP Brute Force

Problem.

  1. On $E: y^2 = x^3 + x + 1$ over $\mathbb{F}_{101}$, pick a generator $G$ and compute $Q = 73G$ using SageMath.

  2. Write a brute-force function ecdlp_brute(G, Q) that finds $k$ by computing $G, 2G, 3G, \ldots$ and comparing with $Q$.

  3. Time your brute-force ECDLP solver for curves over $\mathbb{F}_p$ with $p$ having 10, 15, and 20 bits. Plot the time vs $p$ (or vs $\sqrt{|E|}$).

  4. Based on the trend, estimate how long brute force would take for a 256-bit curve.

# Exercise 3: write your solution here

Summary

| Concept | Key Fact |
|---|---|
| Naive scalar multiplication | $kP$ by repeated addition: $O(k)$, infeasible for large $k$ |
| Double-and-add | Process bits of $k$: $O(\log k)$ doublings + additions |
| Operation count | $\approx \log_2 k$ doublings + $\frac{1}{2}\log_2 k$ additions on average |
| Analogy | Double-and-add is to EC what square-and-multiply is to $(\mathbb{Z}/p\mathbb{Z})^*$ |
| Montgomery ladder | Constant-time variant; resists timing side-channel attacks |
| One-way function | Computing $kP$ is fast; recovering $k$ from $P, kP$ is the ECDLP |

Scalar multiplication is the workhorse operation of elliptic curve cryptography. Every key generation, key exchange, and signature computation reduces to one or a few scalar multiplications. In the final notebook, we put it all together with ECDH and ECDSA.


Next: 06f: ECDH and ECDSA