duyuefeng0708
GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/foundations/06-elliptic-curves/sage/06e-scalar-multiplication.ipynb
Kernel: SageMath 10.5

Notebook 06e: Scalar Multiplication

Module 06. Elliptic Curves


Motivating Question. To use elliptic curves for cryptography, we need to compute $kP$ (adding $P$ to itself $k$ times) for very large $k$, say $k \approx 2^{256}$. Adding $P$ one step at a time would take $2^{256}$ operations, which is impossibly slow. Is there a shortcut? Yes: the double-and-add algorithm computes $kP$ in only $O(\log k)$ steps, just as square-and-multiply computes $g^k$ efficiently in $(\mathbb{Z}/p\mathbb{Z})^*$.


Prerequisites. You should be comfortable with:

  • Point addition and doubling on elliptic curves (Notebook 06b)

  • Group structure and point orders (Notebook 06d)

  • Square-and-multiply for modular exponentiation (Module 04/05)

Learning objectives. By the end of this notebook you will be able to:

  1. Implement naive repeated addition and understand why it is too slow.

  2. Implement the double-and-add algorithm and prove it is $O(\log k)$.

  3. Compare the performance of naive vs double-and-add.

  4. Understand the analogy between double-and-add and square-and-multiply.

  5. Appreciate that scalar multiplication is the one-way function underlying EC crypto.

1. Naive Scalar Multiplication

The simplest approach: add $P$ to itself $k$ times.

$$kP = \underbrace{P + P + \cdots + P}_{k \text{ times}}$$

This requires $k - 1$ point additions. For $k \approx 2^{256}$, this is $\approx 10^{77}$ operations, far beyond feasibility.

```python
def scalar_mul_naive(P, k):
    """Compute kP by repeated addition. O(k), only for small k!"""
    E = P.curve()
    result = E(0)  # O (identity)
    for _ in range(k):
        result = result + P
    return result

# Test on a small curve
E = EllipticCurve(GF(101), [1, 1])
P = E(0, 1)

# Small k: works fine
k = 17
result = scalar_mul_naive(P, k)
print(f"P = {P}")
print(f"{k}·P (naive) = {result}")
print(f"{k}·P (SageMath) = {k * P}")
print(f"Match? {result == k * P}")
```
```python
# Measure how naive scales
print("k        Time (ms)    # additions")
for k in [100, 1000, 10000, 50000]:
    start = walltime()
    result = scalar_mul_naive(P, k)
    elapsed = (walltime() - start) * 1000
    assert result == k * P, "Mismatch!"
    print(f"{k:<8} {elapsed:>9.1f}    {k - 1}")

print("\nAt this rate, k = 2^256 would take ≈ 10^70 years. Not practical!")
```

Checkpoint 1. Why is naive scalar multiplication $O(k)$ while we need $O(\log k)$? The key insight is the same as for exponentiation: we can double the point instead of adding it one at a time. Doubling $P$ to get $2P$, then doubling again to get $4P$, and so on, reaches $2^m P$ in only $m$ steps.
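To see the power of doubling concretely, here is a plain-Python sketch (no Sage needed) that uses the integers under addition as a stand-in additive group, where "doubling a point" is just adding it to itself:

```python
def repeated_doubling(P, m):
    """Reach (2^m)·P in m doublings; the 'group' here is the integers
    under addition, so doubling is just P + P."""
    Q = P
    for _ in range(m):
        Q = Q + Q  # one point doubling per step
    return Q

# 256 doublings reach 2^256 · P, a multiple naive addition could never touch
P = 1
assert repeated_doubling(P, 256) == 2**256
print(f"256 doublings reach a {(2**256).bit_length()}-bit multiple of P")
```

The same loop works verbatim for Sage curve points, since only `+` is used.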

2. The Double-and-Add Algorithm

Idea: Write $k$ in binary: $k = \sum_{i=0}^{m} k_i \cdot 2^i$ where $k_i \in \{0, 1\}$. Then:

$$kP = \sum_{i\,:\,k_i = 1} 2^i P$$

We precompute $P, 2P, 4P, 8P, \ldots$ by repeated doubling, and add together only the terms where $k_i = 1$.
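The decomposition can be checked in plain Python (no Sage needed) for a sample scalar:

```python
k = 73
# bits of k, least-significant first
bits = [(k >> i) & 1 for i in range(k.bit_length())]
terms = [2**i for i, b in enumerate(bits) if b == 1]
print(f"{k} = {' + '.join(str(t) for t in terms)}")  # 73 = 1 + 8 + 64
assert sum(terms) == k
# so 73·P = 1·P + 8·P + 64·P: three additions of precomputed doublings
```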

| Analogy | Multiplicative group | Additive (EC) group |
|---|---|---|
| Operation | Multiplication | Addition |
| Repeated operation | $g^k$ (exponentiation) | $kP$ (scalar multiplication) |
| Efficient method | Square-and-multiply | Double-and-add |
| Complexity | $O(\log k)$ squarings + multiplications | $O(\log k)$ doublings + additions |
```python
def double_and_add(P, k):
    """
    Compute kP using double-and-add. O(log k) point operations.

    Algorithm (right-to-left binary method):
      - R starts as O (identity)
      - Q starts as P
      - For each bit of k (LSB to MSB):
          - If bit is 1: R = R + Q
          - Q = 2Q (double)
    """
    E = P.curve()
    R = E(0)  # accumulator, starts at identity
    Q = P     # current power of 2 times P
    additions = 0
    doublings = 0
    while k > 0:
        if k % 2 == 1:  # current bit is 1
            R = R + Q
            additions += 1
        Q = Q + Q       # double Q
        doublings += 1
        k = k // 2
    return R, additions, doublings

# Test
E = EllipticCurve(GF(101), [1, 1])
P = E(0, 1)
k = 73
result, adds, dbls = double_and_add(P, k)
print(f"k = {k} = {bin(k)} (binary)")
print(f"Bits of k: {k.bit_length()}")
print(f"Additions: {adds}, Doublings: {dbls}")
print(f"Total operations: {adds + dbls} (vs {k-1} for naive!)")
print(f"Result: {result}")
print(f"SageMath: {k * P}")
print(f"Match? {result == k * P}")
```

Misconception alert. "Double-and-add always does exactly $\log_2 k$ additions." Not quite: it does $\log_2 k$ doublings, but the number of additions equals the Hamming weight of $k$ (the number of 1-bits in its binary representation). For a random 256-bit $k$, this is about 128 additions + 256 doublings $\approx 384$ total operations.
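The operation counts are easy to check in plain Python; the formulas below mirror the right-to-left method used in this notebook (one doubling per bit, one addition per 1-bit):

```python
import random

def dbl_add_counts(k):
    """Return (#doublings, #additions) double-and-add performs for scalar k."""
    dbls = k.bit_length()        # one doubling per bit processed
    adds = bin(k).count("1")     # one addition per 1-bit (Hamming weight)
    return dbls, adds

dbls, adds = dbl_add_counts(0b101011)  # k = 43, six bits, Hamming weight 4
assert (dbls, adds) == (6, 4)

# average Hamming weight of random 256-bit scalars is close to 128
random.seed(0)
avg = sum(bin(random.getrandbits(256)).count("1") for _ in range(1000)) / 1000
print(f"average additions for 256-bit k: {avg:.1f} (expected ≈ 128)")
```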

```python
# Trace the algorithm step by step for k = 43
E = EllipticCurve(GF(101), [1, 1])
P = E(0, 1)
k = 43  # binary: 101011

print(f"Computing {k}·P where P = {P}")
print(f"k = {k} = {bin(k)} in binary")
print(f"Binary digits (LSB first): {[int(b) for b in reversed(bin(k)[2:])]}")
print()

R = E(0)
Q = P
temp_k = k
step = 0
print("Step  Bit  Action            R                Q (= 2^i·P)")
while temp_k > 0:
    bit = temp_k % 2
    if bit == 1:
        R = R + Q
        action = f"R += Q (2^{step}·P)"
    else:
        action = "(skip)"
    r_str = f"({int(R[0])}, {int(R[1])})" if R != E(0) else "O"
    q_str = f"({int(Q[0])}, {int(Q[1])})" if Q != E(0) else "O"
    print(f"{step:<5} {bit}    {action:<17} {r_str:<16} {q_str}")
    Q = Q + Q  # double
    temp_k = temp_k // 2
    step += 1

print(f"\nResult: {k}·P = {R}")
print(f"Verify: {k * P}")
print(f"Match? {R == k * P}")
```

3. Performance Comparison

Let us compare naive vs double-and-add on increasingly large scalars.

```python
# Use a larger curve for meaningful timing
p = next_prime(2^32)
E = EllipticCurve(GF(p), [1, 1])
P = E.random_point()

# Double-and-add for various k sizes
bits_list = [16, 32, 64, 128, 256, 512]
daa_times = []
print("Bits of k   k (hex, first 8 chars)   D&A time (ms)   # ops")
for bits in bits_list:
    k = randint(2^(bits-1), 2^bits - 1)
    start = walltime()
    result, adds, dbls = double_and_add(P, k)
    elapsed = (walltime() - start) * 1000
    assert result == k * P
    daa_times.append(elapsed)
    k_hex = hex(k)[:10] + "..."
    print(f"{bits:<11} {k_hex:<24} {elapsed:>13.2f}   {adds + dbls}")

# Plot
G = list_plot(list(zip(bits_list, daa_times)), plotjoined=True,
              color='blue', marker='o', size=60, legend_label='Double-and-add')
G.show(figsize=[9, 5], axes_labels=['Bits of $k$', 'Time (ms)'],
       title='Double-and-add: time scales linearly with bit-length of $k$',
       gridlines=True)
print("\nTime is O(log k) = O(bits of k). Doubling the bit-length roughly doubles the time.")
```

Checkpoint 2. For a 256-bit scalar, double-and-add needs $\approx 256$ doublings and $\approx 128$ additions, for a total of $\approx 384$ point operations. Naive would need $\approx 2^{256}$ additions. The speedup factor is $\frac{2^{256}}{384} \approx 10^{74}$. This is the difference between "impossible" and "milliseconds."
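The arithmetic behind this speedup estimate, in plain Python:

```python
naive_ops = 2**256 - 1           # one point addition per increment
daa_ops = 256 + 128              # ≈ doublings + average additions
speedup = float(naive_ops) / daa_ops
print(f"speedup ≈ {speedup:.2e}")  # on the order of 10^74
```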

4. The One-Way Function

Scalar multiplication is the one-way function of elliptic curve cryptography:

| Direction | Operation | Complexity |
|---|---|---|
| Forward (easy) | Given $P, k$, compute $Q = kP$ | $O(\log k)$ via double-and-add |
| Backward (hard) | Given $P, Q$, find $k$ such that $Q = kP$ | Best known: $O(\sqrt{n})$ (Pollard's rho), where $n$ is the order of $P$ |

This asymmetry is the Elliptic Curve Discrete Logarithm Problem (ECDLP).
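A rough back-of-envelope comparison of the two directions (the operation counts are order-of-magnitude estimates, not measurements):

```python
import math

n_bits = 256                          # group order around 2^256
forward_ops = 1.5 * n_bits            # double-and-add: ~256 doublings + ~128 additions
backward_ops = math.sqrt(2**n_bits)   # Pollard's rho: O(sqrt(n)) group operations
print(f"forward:  ~{forward_ops:.0f} point operations")
print(f"backward: ~2^{n_bits // 2} ≈ {backward_ops:.1e} point operations")
```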


Bridge from Module 05. In Module 05, the one-way function was $g^k \bmod p$ (fast via square-and-multiply, hard to invert). Here the one-way function is $kP$ (fast via double-and-add, hard to invert). The mathematical structure is different, but the cryptographic paradigm is identical.
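To make the analogy concrete, here is square-and-multiply written so that each line has a double-and-add counterpart (plain Python; the built-in `pow` is used only to check the result):

```python
def square_and_multiply(g, k, p):
    """Compute g^k mod p — the multiplicative twin of double-and-add."""
    result = 1    # multiplicative identity (additive analogue: R = O)
    base = g % p  # analogue of Q = P
    while k > 0:
        if k & 1:                     # bit is 1: multiply (analogue: R = R + Q)
            result = (result * base) % p
        base = (base * base) % p      # square (analogue: Q = 2Q)
        k >>= 1
    return result

p, g, k = 1000003, 5, 73
assert square_and_multiply(g, k, p) == pow(g, k, p)
print(f"{g}^{k} mod {p} = {square_and_multiply(g, k, p)}")
```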

```python
# Demonstrate the asymmetry: fast forward, slower backward
p = next_prime(2^40)
E = EllipticCurve(GF(p), [1, 1])
G = E.gens()[0]
n = G.order()
k_secret = randint(2, n - 2)

# Forward: fast
start = walltime()
Q = k_secret * G
forward_time = (walltime() - start) * 1000

# Backward: Sage's generic discrete log (Pohlig-Hellman + baby-step
# giant-step), O(sqrt(n)) — already far slower than the forward direction
start = walltime()
k_found = discrete_log(Q, G, n, operation='+')
backward_time = (walltime() - start) * 1000

print(f"Curve over F_p where p ≈ 2^40; ord(G) has {n.ndigits()} digits")
print(f"\nForward (compute kG):      {forward_time:.2f} ms")
print(f"Backward (find k from Q):  {backward_time:.1f} ms")
print(f"Ratio: backward is {backward_time/forward_time:.0f}x slower")
print(f"Correct? {k_found == k_secret}")
```

5. Windowed and Montgomery Methods (Overview)

Double-and-add is the basic algorithm, but real implementations use optimizations:

| Method | Idea | Speed-up |
|---|---|---|
| Double-and-add (basic) | Process 1 bit at a time | Baseline |
| $w$-ary method | Process $w$ bits at a time, precompute $P, 2P, \ldots, (2^w - 1)P$ | Fewer additions |
| NAF (Non-Adjacent Form) | Use digits $\{-1, 0, 1\}$ to reduce additions | ~33% fewer additions |
| Montgomery ladder | Always do the same operations regardless of bit value | Side-channel resistant |
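As an illustration of the NAF row, here is a minimal plain-Python sketch of the signed-digit recoding (on a curve, digits of $-1$ are cheap because negating a point only flips the $y$-coordinate):

```python
def naf(k):
    """Non-adjacent form of k: digits in {-1, 0, 1}, least-significant first,
    with no two adjacent nonzero digits."""
    digits = []
    while k > 0:
        if k % 2 == 1:
            d = 2 - (k % 4)  # 1 if k ≡ 1 (mod 4), else -1
            k -= d
        else:
            d = 0
        digits.append(d)
        k //= 2
    return digits

k = 255  # 2^8 - 1: eight 1-bits in binary, but only two nonzero NAF digits
d = naf(k)
assert sum(di * 2**i for i, di in enumerate(d)) == k
print(f"binary weight of {k}: {bin(k).count('1')}, NAF weight: {sum(1 for x in d if x)}")
```

Fewer nonzero digits means fewer point additions, since each nonzero digit costs one addition (or subtraction) of a precomputed point.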

The Montgomery ladder is particularly important for security: it prevents attackers from learning secret bits by measuring timing or power consumption.

```python
def montgomery_ladder(P, k):
    """
    Compute kP using the Montgomery ladder.
    Constant-time: always performs both an addition and a doubling per bit.
    """
    E = P.curve()
    R0 = E(0)  # will hold floor(k/2^i) * P at each step
    R1 = P     # will hold (floor(k/2^i) + 1) * P

    # Process bits from MSB to LSB
    for i in range(k.bit_length() - 1, -1, -1):
        bit = (k >> i) & 1
        if bit == 0:
            R1 = R0 + R1
            R0 = R0 + R0  # double
        else:
            R0 = R0 + R1
            R1 = R1 + R1  # double
    return R0

# Test Montgomery ladder
E = EllipticCurve(GF(101), [1, 1])
P = E(0, 1)
for k in [1, 2, 7, 43, 73, 97]:
    result = montgomery_ladder(P, k)
    expected = k * P
    status = "✓" if result == expected else "✗"
    print(f"{k}·P: Montgomery = ({int(result[0])}, {int(result[1])}), "
          f"SageMath = ({int(expected[0])}, {int(expected[1])}) {status}")
```

Checkpoint 3. Why is constant-time execution important? If double-and-add takes different amounts of time for bits 0 vs 1, an attacker measuring execution time can deduce the secret scalar $k$ bit by bit. This is a timing side-channel attack. The Montgomery ladder defeats this by performing the same operations regardless of the bit value.
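The ladder's fixed operation pattern can be checked in plain Python, again using the integers under addition as a stand-in group (note this sketch still branches on the bit; hardened implementations replace the branch with a constant-time conditional swap):

```python
def ladder(P, k):
    """Montgomery ladder in a stand-in additive group (the integers):
    every bit costs exactly one addition and one doubling."""
    R0, R1 = 0, P  # identity and P; invariant: R1 = R0 + P
    for i in range(k.bit_length() - 1, -1, -1):
        if (k >> i) & 1:
            R0, R1 = R0 + R1, R1 + R1
        else:
            R1, R0 = R0 + R1, R0 + R0
    return R0

assert ladder(5, 7) == 35          # 7·P with P = 5
assert all(ladder(3, k) == 3 * k for k in range(1, 50))
print("ladder matches k·P for all tested scalars")
```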

Crypto foreshadowing. In the next notebook, we will use scalar multiplication as the core operation in ECDH (key exchange) and ECDSA (digital signatures). The security of both protocols rests on the ECDLP: computing $kP$ is easy, but recovering $k$ from $P$ and $kP$ is believed infeasible for properly chosen curves.

6. Exercises

Exercise 1 (Worked): Tracing Double-and-Add

Problem. On $E: y^2 = x^3 + x + 1$ over $\mathbb{F}_{23}$, compute $11P$ where $P = (0, 1)$ using double-and-add. Show each step.

Solution. $11 = 1011_2$, so we process bits right-to-left:

| Step | Bit | Action | $R$ | $Q$ |
|---|---|---|---|---|
| 0 | 1 | $R = R + Q$ | $P$ | $P$ |
| 1 | 1 | $R = R + Q$ | $P + 2P = 3P$ | $2P$ |
| 2 | 0 | (skip) | $3P$ | $4P$ |
| 3 | 1 | $R = R + Q$ | $3P + 8P = 11P$ | $8P$ |

Result: $R = 11P$. Let us verify:

```python
# Exercise 1: verification
E = EllipticCurve(GF(23), [1, 1])
P = E(0, 1)

# Step by step
print(f"P = {P}")
print(f"2P = {2*P}")
print(f"3P = P + 2P = {3*P}")
print(f"4P = {4*P}")
print(f"8P = {8*P}")
print(f"11P = 3P + 8P = {11*P}")
print()

# Using our double-and-add
result, adds, dbls = double_and_add(P, 11)
print(f"double_and_add(P, 11) = {result}")
print(f"Additions: {adds}, Doublings: {dbls}")
print(f"Match? {result == 11 * P}")
```

Exercise 2 (Guided): Operation Count

Problem. For a 256-bit scalar $k$:

  1. How many doublings does double-and-add perform?

  2. On average, how many additions? (Hint: each bit is 1 with probability 1/2.)

  3. Compare the total operation count with naive scalar multiplication.

Fill in the TODO cells below.

```python
# Exercise 2: fill in the TODOs
bits = 256

# TODO 1: doublings = ?
# doublings = ???

# TODO 2: expected additions = ?
# avg_additions = ???

# TODO 3: Compare with naive
# naive_ops = 2^bits - 1
# daa_ops = doublings + avg_additions
# speedup = naive_ops / daa_ops
# print(f"Naive:   {naive_ops:.2e} operations")
# print(f"D&A:     {daa_ops} operations")
# print(f"Speedup: {speedup:.2e}x")
```

Exercise 3 (Independent): ECDLP Brute Force

Problem.

  1. On $E: y^2 = x^3 + x + 1$ over $\mathbb{F}_{101}$, pick a generator $G$ and compute $Q = 73G$ using SageMath.

  2. Write a brute-force function ecdlp_brute(G, Q) that finds $k$ by computing $G, 2G, 3G, \ldots$ and comparing with $Q$.

  3. Time your brute-force ECDLP solver for curves over $\mathbb{F}_p$ with $p$ having 10, 15, and 20 bits. Plot the time vs $p$ (or vs $\sqrt{|E|}$).

  4. Based on the trend, estimate how long brute force would take for a 256-bit curve.

# Exercise 3: write your solution here

Summary

| Concept | Key Fact |
|---|---|
| Naive scalar multiplication | $kP$ by repeated addition: $O(k)$, infeasible for large $k$ |
| Double-and-add | Process bits of $k$: $O(\log k)$ doublings + additions |
| Operation count | $\approx \log_2 k$ doublings + $\frac{1}{2}\log_2 k$ additions on average |
| Analogy | Double-and-add is to EC what square-and-multiply is to $(\mathbb{Z}/p\mathbb{Z})^*$ |
| Montgomery ladder | Constant-time variant; resists timing side-channel attacks |
| One-way function | Computing $kP$ is fast; recovering $k$ from $P, kP$ is the ECDLP |

Scalar multiplication is the workhorse operation of elliptic curve cryptography. Every key generation, key exchange, and signature computation reduces to one or a few scalar multiplications. In the final notebook, we put it all together with ECDH and ECDSA.


Next: 06f: ECDH and ECDSA