Path: blob/main/Modules/_decimal/libmpdec/literature/bignum.txt
12 views
12Bignum support (Fast Number Theoretic Transform or FNT):3========================================================45Bignum arithmetic in libmpdec uses the scheme for fast convolution6of integer sequences from:78J. M. Pollard: The fast Fourier transform in a finite field9http://www.ams.org/journals/mcom/1971-25-114/S0025-5718-1971-0301966-0/home.html101112The transform in a finite field can be used for convolution in the same13way as the Fourier Transform. The main advantages of the Number Theoretic14Transform are that it is both exact and very memory efficient.151617Convolution in pseudo-code:18~~~~~~~~~~~~~~~~~~~~~~~~~~~1920fnt_convolute(a, b):21x = fnt(a) # forward transform of a22y = fnt(b) # forward transform of b23z = pairwise multiply x[i] and y[i]24result = inv_fnt(z) # backward transform of z.252627Extending the maximum transform length (Chinese Remainder Theorem):28-------------------------------------------------------------------2930The maximum transform length is quite limited when using a single31prime field. However, it is possible to use multiple primes and32recover the result using the Chinese Remainder Theorem.333435Multiplication in pseudo-code:36~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~3738_mpd_fntmul(u, v):39c1 = fnt_convolute(u, v, P1) # convolute modulo prime140c2 = fnt_convolute(u, v, P2) # convolute modulo prime241c3 = fnt_convolute(u, v, P3) # convolute modulo prime342result = crt3(c1, c2, c3) # Chinese Remainder Theorem434445Optimized transform functions:46------------------------------4748There are three different fnt() functions:4950std_fnt: "standard" decimation in frequency transform for array lengths51of 2**n. Performs well up to 1024 words.5253sixstep: Cache-friendly algorithm for array lengths of 2**n. Outperforms54std_fnt for large arrays.5556fourstep: Algorithm for array lengths of 3 * 2**n. Also cache friendly57in large parts.585960List of bignum-only files:61--------------------------6263Functions from these files are only used in _mpd_fntmul().6465umodarith.h -> fast low level routines for unsigned modular arithmetic66numbertheory.c -> routines for setting up the FNT67difradix2.c -> decimation in frequency transform, used as the68"base case" by the following three files:6970fnt.c -> standard transform for smaller arrays71sixstep.c -> transform large arrays of length 2**n72fourstep.c -> transform arrays of length 3 * 2**n7374convolute.c -> do the actual fast convolution, using one of75the three transform functions.76transpose.c -> transpositions needed for the sixstep algorithm.77crt.c -> Chinese Remainder Theorem: use information from three78transforms modulo three different primes to get the79final result.808182838485