CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutSign UpSign In

Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place. Commercial Alternative to JupyterHub.

| Download

Demonstration of Rabin's root-finding algorithm and the Cantor-Zassenhaus algorithm for factoring polynomials in finite fields, as presented in Lecture 3 of 18.783

Views: 1122
License: AGPL3
Image: ubuntu2004
Kernel: SageMath 9.2
Problem: Given a polynomial fFq[x]f\in \mathbb F_q[x], determine the Fq\mathbb F_q-rational roots of ff.
Solution: Rabin's algorithm!
p=61; F=GF(p); R.<x>=PolynomialRing(F); f = x^8-2*x+5
f.roots()
[(44, 1), (39, 1), (15, 1)]
f.factor()
(x + 17) * (x + 22) * (x + 46) * (x^2 + 46*x + 1) * (x^3 + 52*x^2 + 41*x + 33)
f.change_ring(GF(p^6)).roots()
[(44, 1), (39, 1), (15, 1), (46*z6^5 + 32*z6^4 + 56*z6^3 + 22*z6^2 + 48*z6 + 44, 1), (7*z6^5 + 35*z6^4 + 2*z6^3 + 4*z6 + 37, 1), (15*z6^5 + 29*z6^4 + 5*z6^3 + 39*z6^2 + 13*z6 + 32, 1), (28*z6^5 + 30*z6^4 + 20*z6^3 + 26*z6^2 + 21*z6 + 30, 1), (26*z6^5 + 57*z6^4 + 39*z6^3 + 35*z6^2 + 36*z6 + 3, 1)]
Theorem: Fpn={αFp:αpn=α}\mathbb F_{p^n} = \{\alpha\in \overline{\mathbb F}_p: \alpha^{p^n}=\alpha\}, in other words, we have xpnx=αFpn(xα)x^{p^n}-x=\prod_{\alpha\in \mathbb F_{p^n}}(x-\alpha)
Proof: For n=1n=1 this follows from Fermat's little theorem. If we identify FpZ/pZ\mathbb F_p\simeq \mathbb Z/p\mathbb Z, we have ap1=1modpa^{p-1}=1\bmod p for a[0,p1]a\in [0,p-1], so the pp elements of Z/pZ\mathbb Z/p\mathbb Z are precisely the roots of xpxx^p-x.
For n>1n>1 this is true by definition: Fpn:=Fpααpn\mathbb F_{p^n}:=\overline{\mathbb{F}}_{p}^{\langle \alpha \mapsto \alpha^{p^n}\rangle} is the fixed field of the pnp^n-power Frobenius automorphism of Fp\overline{\mathbb F}_p, equivalently, the splitting field of xpnxx^{p^n}-x over Fp\mathbb F_p.
Corollary: For any prime power qq we have Fqn={αFq:αqn=α}\mathbb F_{q^n} = \{\alpha\in \overline{\mathbb F}_q: \alpha^{q^n}=\alpha\} and xqnx=αFqn(xα)x^{q^n}-x=\prod_{\alpha\in \mathbb F_{q^n}}(x-\alpha).
product([x-i for i in F]) == x^p-x
True
product([x.change_ring(GF(p^2))-i for i in GF(p^2)]) == x^(p^2)-x
True
Corollary: Let fFq[x]f\in \mathbb F_q[x] be nonzero and let S:={αFqn:f(α)=0}S:=\{\alpha\in \mathbb F_{q^n}:f(\alpha)=0\}. Then gcd(f(x),xqnx)=αS(xα).\gcd(f(x),x^{q^n}-x)=\prod_{\alpha\in S}(x-\alpha). In particular, #S=deg(gcd(f(x),xqnx)\#S=\deg(\gcd(f(x),x^{q^n}-x).
As usual, gcd(a,b)\gcd(a,b) denotes the unique monic representative of the nonzero principal ideal (a,b)(a,b) of Fq[x]\mathbb F_q[x].
%time gcd(f,x^p-x).degree()
CPU times: user 57 µs, sys: 0 ns, total: 57 µs Wall time: 60.6 µs
3
%time gcd(f,x^(p^2)-x).degree()
CPU times: user 465 µs, sys: 90 µs, total: 555 µs Wall time: 562 µs
5
%time gcd(f,x^(p^3)-x).degree()
CPU times: user 42.7 ms, sys: 330 µs, total: 43 ms Wall time: 43 ms
6
%time gcd(f,x^(p^6)-x).degree()
Key point:Make sure you exponentiate in the right ring!

The first step in computing gcd(f(x),xqx)\gcd(f(x),x^q-x) is to compute xqxmodfx^q-x\bmod f (note that qq may be exponentially large, but degf\deg f certainly won't be).
If πf ⁣:Fq[x]Fq[x]/(f)\pi_f\colon \mathbb{F}_q[x]\to\mathbb{F}_q[x]/(f) is the ring homomorphism defined by xxmodfx\mapsto x\bmod f, then πf(xqx)=πf(xq)πf(x)=πf(x)qπf(x),\pi_f(x^q-x)=\pi_f(x^q)-\pi_f(x)=\pi_f(x)^q-\pi_f(x), so the order of operations makes no mathematical difference, but the RHS can be computed exponentially faster than either the middle or the LHS!

The complexity of computing xqmodfx^q\bmod f with Euclidean division is quasi-linear in qq,
The complexity of computing πf(x)q\pi_f(x)^q using binary exponentiation is quasi-linear in logq\log q.
pif=R.quotient(f)
%time print(pif(x)^(p^6))
p=2^255-19; F=GF(p); R.<x>=PolynomialRing(F); f=x^8-2*x+5; pif=R.quotient(f); %time print(f.factor())
%time print(pif(x)^p) %time print(pif(x)^(p^2)) %time print(pif(x)^(p^3)) %time print(pif(x)^(p^4)) %time print(pif(x)^(p^10))
def distinct_root_poly(f): q = f.base_ring().cardinality() pif = f.parent().quotient(f) return gcd(f,(pif(x)^q).lift()-x) def count_distinct_roots(f): return distinct_root_poly(f).degree()
p=61; F=GF(p); R.<x>=PolynomialRing(F); f=x^8-2*x+5; pif=R.quotient(f) for n in range(1,7): print("%d distinct roots over F_{%d^%d}" % (count_distinct_roots(f.change_ring(GF(p^n))),p,n))

Okay, we can count (distinct) roots, but what if we actually want to know what they are?

gcd(f,(pif(x)^p-x).lift())
s = (p-1) // 2 # we assume p (or more generally q) is odd gcd(f,(pif(x)^s-1).lift())
print(gcd(f,(pif(x)^p-x).lift())/gcd(f,(pif(x)^s-1).lift()))

Excellent, we managed to split the roots of ff by picking out two that are roots of xs1x^s-1, where s=(q1)/2s=(q-1)/2 (we are assuming qq is odd).
What are the roots of xs1x^s-1 anyway?

print(sorted((x^s-1).roots()))

My keen sense of pattern recognition (or a recollection of Euler's criterion) tells me these are the roots of xs1x^s-1 are precisely the squares (quadratic residues) in Fp×.\mathbb F_p^\times.

product([x-a^2 for a in F if a < p-a]) == (x^s-1)

But if ff has multiple roots that are quadratic residues (or maybe none), then what?

Rabin: Instead of xs1x^s-1, use (x+δ)s1(x+\delta)^s-1 for a random δFq\delta\in \mathbb F_q.

Definition: Let us say that α,βFq\alpha,\beta\in \mathbb F_q are of different type if both are nonzero and αsβs\alpha^s\ne \beta^s, in other words, one is a quadratic residue and one is not.
Theorem: For every pair of distinct α,βFq\alpha,\beta\in \mathbb F_q we have #{δFq:α+δ and β+δ are of different type }=q12.\#\{\delta\in \mathbb F_q:\alpha+\delta \text{ and } \beta+\delta \text{ are of different type }\} = \frac{q-1}{2}.
Proof: See Section 3.11 in the lecture notes for a short easy proof.
Corollary: Let fFq[x]f\in \mathbb F_q[x], let g(x)=gcd(f,xqx)g(x)=\gcd(f,x^q-x), let s=(q1)/2Zs=(q-1)/2\in \Z, and let δ\delta be a uniformly random element of Fq\mathbb F_q.
If gg has more than one nonzero root, then Pr[0<deggcd(g,(x+δ)s1)<degg]12.\Pr[0<\deg\gcd(g,(x+\delta)^s-1)<\deg g] \ge \frac{1}{2}.
def rational_root(f): """Returns a rational root of f or None if f has no rational roots.""" q = f.base_ring().cardinality() assert q % 2 == 1 s = (q-1) // 2 # Note // yields an integer (whereas / would yield a rational) x = f.parent().gen() if f[0]==0: return f.base_ring()(0) g = distinct_root_poly(f) print("Initial g = gcd(f,x^q-x) = %s" % g) pig = g.parent().quotient(g) while g.degree() > 1: delta = g.base_ring().random_element() print("delta = %s" % delta) h = gcd(g, ((pig(x)-delta)^s-1).lift()) if 0 < h.degree() < g.degree(): g = h if 2*h.degree() <= g.degree() else g // h # as above, we use // to get a polynomial pig = g.parent().quotient(g) print("Better g = %s" % g) return -g[0] if g.degree() == 1 else None
Theorem: Let n=logqn=\log q and d=degfd=\deg f. The expected running time of rational_root\texttt{rational\_root} is O~(n2d)\tilde O(n^2d) bit operations.

Here the soft-O notation means up to a poly-logarithmic factors. See the lecture notes for a more precise bound and a proof.

for i in range(5): r = rational_root(f) print("Found root %s\n" % r) assert f(r) == 0
h = product([x-i for i in range(1,20)]) for i in range(5): r = rational_root(h) print("Found root %s\n" % r) assert h(r) == 0
p=2^61-1; F=GF(p); R.<x>=PolynomialRing(F); f=x^8-2*x+5; h = product([x-i for i in range(1,20)]) for i in range(3): %time print("Found root %s\n" % rational_root(h))

What if we want all the rational roots?

def distinct_rational_roots(f,recurse=False): """Returns a list of all the distinct rational roots of f.""" q = f.base_ring().cardinality() assert q % 2 == 1 s = (q-1) // 2 x = f.parent().gen() roots = [] if recurse: g = f else: g = distinct_root_poly(f) if g(0) == 0: roots = [g.base_ring()(0)] g = g // x if g.degree() <= 1: return [-g[0]] if g.degree() == 1 else [] while True: delta = g.base_ring().random_element() pig = g.parent().quotient(g) h = gcd(g, ((pig(x)-delta)^s-1).lift()) if 0 < h.degree() < g.degree(): roots += distinct_rational_roots(h, True) roots += distinct_rational_roots(g//h, True) return roots
Theorem: Let n=logqn=\log q and d=degfd=\deg f. The expected running time of distinct_rational_root\texttt{distinct\_rational\_root} is O~(n2d)\tilde O(n^2d) bit operations.

This looks the same as the bound for rational_root\texttt{rational\_root}, but in fact there is an extra factor of logd\log d hidden in the soft-O notation.

%time distinct_rational_roots(f)
%time distinct_rational_roots(product([x-F.random_element() for i in range(10)]))

What if we want all the rational roots and their multiplicities?

Yun: Compute the squarefree factorization!

Definition: The squarefree factorization of a monic fFq[x]f\in \mathbb F_q[x] is the unique list of monic squarefree polynomials g1,,gmg_1,\ldots,g_m with gm1g_m\ne 1 for which f=g1g22gmmf=g_1g_2^2\cdots g_m^m.
Key fact: If f=f1fnf=f_1\cdots f_n is the factorization of a monic fFq[x]f\in \mathbb F_q[x] into irreducibles then fi=fjf_i=f_j for some iji\ne j if and only if figcd(f,f)f_i|\gcd(f,f').
This holds in any perfect field.
def squarefree_factorization(f): """Yun's algorithm to compute a list of polynomials g_1,..,g_m such that f=lc(f)g_1g_2^2...g_m^m with g_m != 1, assuming deg(f) < char(Fq). We include g_0=1 for convenience.""" assert f.degree() < f.base_ring().characteristic() if not f.is_monic: f = f.monic() df = f.derivative() u = gcd(f,df) v, w = f//u, df//u gs =[f.parent()(1)] while True: g = gcd(v, w-v.derivative()); gs.append(g) if v.degree() == g.degree(): return gs v, w = v//g, (w-v.derivative())//g
Theorem: Let n=logqn=\log q and d=degfd=\deg f. The running time of squarefree_factorization\texttt{squarefree\_factorization} is O~(nd)\tilde O(nd) bit operations.

This is negligible compared to the complexity of distinct_rational_roots\texttt{distinct\_rational\_roots}, so we might as well call squarefree_factorization\texttt{squarefree\_factorization} first.
Even if we only want distinct roots, this will actually save time if ff is not squarefree.

h = product([(x^2+F.random_element()*x+F.random_element())^i for i in range(1,10)]) %time gs = squarefree_factorization(h) assert h == product([gs[i]^i for i in range(len(gs))]) gs
def rational_roots(f): """Returns a list of tuples identifying the rational roots of f and their multiplicities.""" gs = squarefree_factorization(f) return reduce(lambda x, y: x+y,[[(a,i) for a in distinct_rational_roots(gs[i])] for i in range(1,len(gs))])
h = product([(x-F.random_element())^i for i in range(1,6)]) %time rational_roots(h)

What if we want to factor ff into irreducibles?

Rabin: You can do that with my root-finding algorithm.

Cantor-Zassenhaus: Yes, but that requires working in extension fields, and with our algorithm there is no need!

The first step is to compute the distinct degree factorization

Definition: The distinct degree factorization of a monic fFq[x]f\in \mathbb F_q[x] is the unique list of polynomials g1,,gdg_1,\ldots,g_d with f=g1gdf=g_1\cdots g_d such that each gig_i is a (possibly empty) product of distinct monic irreducible polynomials of degree ii, for 1id=degf1\le i\le d=\deg f.
Key point: This can be done deterministically by taking succesive gcds with xqixx^{q^i}-x.
def distinct_degree_poly(f,i): q = f.base_ring().cardinality() pif = f.parent().quotient(f) return gcd(f,(pif(x)^(q^i)).lift()-x) def distinct_degree_factorization(f, squarefree=False): """ Computes the distinct degree factorization of f as list gs with gs[0]=1 and gs[i]=g_i such that f=prod_i gs[i]^i. The optional parameter squarefree inidicates whether f is known to be squarefree (usually one computes the squarefree factorization first). """ if not squarefree: return naive_distinct_degree_factorization(f) d = f.degree() gs = [f.parent()(1) for i in range(d+1)] for i in range(1,d+1): if 2*i > f.degree(): gs[f.degree()] = f break gs[i] = distinct_degree_poly(f,i) f = f // gs[i] return gs def naive_distinct_degree_factorization(f): """ Computed distinct degree factorization of arbitrary f by combining distinct degree factorization of polys in square free factorization. There is no reason to ever use this, it is provided just for the sake of completeness """ d = f.degree() gs = [f.parent()(1) for i in range(d+1)] hs = [distinct_degree_factorization(g, squarefree=True) for g in squarefree_factorization(f)] # combine results (this is silly, we are undoing work here) for i in range(1,len(hs)): for j in range(1,len(hs[i])): gs[j] *= hs[i][j]^i return gs def facpat(f): """Returns a dictionary {d:i} whose keys are the degrees of the irreducible factors of f and whose values are the number of irreducible factors of that degree.""" X = {} gs = squarefree_factorization(f) for i in range(1,len(gs)): gis = distinct_degree_factorization(gs[i],squarefree=True) for j in range(1,len(gis)): if gis[j].degree() > 0: X[j] = X.get(j,0) + i * (gis[j].degree() // j) return X
%time facpat(f)
h = (x^9-1)^2*(x^32-1) assert h == product(distinct_degree_factorization(h)) %time print(facpat(h)) print([(r[0].degree(),r[1]) for r in h.factor()])

The key innovation introduced by Cantor-Zassenhaus is that when searching for irreducible factors gg of ff of degree jj, rather than computing gcd(f,(x+δ)s1))\gcd(f,(x+\delta)^s-1)) using a random δFqj\delta\in \mathbb F_{q^j} and computing the gcd in Fqj[x]\mathbb F_{q^j}[x] (which is how one would go about finding a root of gg using Rabin's algorithm), it is better to compute gcd(f,us1)\gcd(f,u^s-1) in Fq[x]\mathbb F_q[x] using a random uFq[x]u\in \mathbb F_q[x] coprime to ff with degu<degf\deg u < \deg f. This faster because we are working the ring Fq[x]/(f)\mathbb F_q[x]/(f) rather than Fqj[x]/(f)\mathbb F_{q^j}[x]/(f) (so the bit-size of each element is smaller by a factor of jj), and the probability of success is the same.

Theorem: Let fFq[x]f\in \mathbb F_q[x] be the product of r2r\ge 2 distinct monic irreducible polynomials of degree jj and let uFq[j]u\in \mathbb F_q[j] be a random polynomial coprime to ff with degu<degf\deg u < \deg f. Then Pr[0<deggcd(f,us1)<degf]=21r12.\Pr[0<\deg\gcd(f,u^s-1)<\deg f] = 2^{1-r} \ge \frac{1}{2}.
Proof: This follows from applying the Chinese Remainder Theorem to Fq[x]/(f)Fqjr\mathbb F_q[x]/(f)\simeq \mathbb F_{q^j}^r. See Section 3.12 of the lecture notes for details.
def equal_degree_factorization(f,j): """Completely factors a monic polynomial f in Fq[x] that is known to be the product of distinct irreducible polynomials of degree j""" d = f.degree() assert d % j == 0 if d == 0: return [] elif d == j: return [f] Fq = f.base_ring(); q = Fq.cardinality() assert q%2 == 1 s = (q^j-1) // 2 x = f.parent().gen() pif = f.parent().quotient(f) while True: # generate a random polynomial of degree < deg(f), note f.parent()(v) converts a vector of Fq coefficients to a polynomial u = f.parent()([Fq.random_element() for i in range(d)]) h = gcd(f,u) if 0 < h.degree() < d: return equal_degree_factorization(h,j) + equal_degree_factorization(f//h,j) # at this point we know u is coprime to f h = gcd(f,(pif(u)^s-1).lift()) if 0 < h.degree() < d: return equal_degree_factorization(h,j) + equal_degree_factorization(f//h,j) def factorization(f): """Computes the complete factorization of a non-constant f in Fq[x] using the Cantor-Zassenhaaus algorithm (assumes q is odd).""" assert f.degree() > 0 if not f.is_monic: f = f.monic() x = f.parent().gen() res = [] gs = squarefree_factorization(f) for i in range(1,len(gs)): gis = distinct_degree_factorization(gs[i],squarefree=True) for j in range(1,len(gis)): res += [(g,i) for g in equal_degree_factorization(gis[j],j)] # sort output to make it canonical (independent of random choices) return sorted(res)
assert factorization(f) == sorted(list(f.factor())) print(factorization(f)) %timeit -n 10 factorization(f) %timeit -n 10 f.factor()
assert factorization(h) == sorted(list((h).factor())) print(factorization(h)) %timeit -n 1 -r 5 factorization(h) %timeit -n 1 -r 5 h.factor()
Theorem: Let n=logqn=\log q and d=degfd=\deg f. The expected running time of factorization\texttt{factorization} is O~(n2d2)\tilde O(n^2d^2) bit operations.

There are other factorization methods based on linear algebra and modular composition that improve the dependence on dd.
Currently the best asymptotic bound is achieved by the Kedlaya-Umans algorithm with an expected running time of O~(d3/2n+dn2)\tilde O(d^{3/2}n + dn^2).

p=2^255-19; F=GF(p); R.<x>=PolynomialRing(F); f=x^8-2*x+5; h = product([x-i for i in range(1,20)]) assert factorization(f^3+f^2+1) == sorted(list((f^3+f^2+1).factor())) %time print(factorization(f))