17db96d56Sopenharmony_ci
27db96d56Sopenharmony_ci
37db96d56Sopenharmony_ciBignum support (Fast Number Theoretic Transform or FNT):
47db96d56Sopenharmony_ci========================================================
57db96d56Sopenharmony_ci
67db96d56Sopenharmony_ciBignum arithmetic in libmpdec uses the scheme for fast convolution
77db96d56Sopenharmony_ciof integer sequences from:
87db96d56Sopenharmony_ci
97db96d56Sopenharmony_ciJ. M. Pollard: The fast Fourier transform in a finite field
107db96d56Sopenharmony_cihttp://www.ams.org/journals/mcom/1971-25-114/S0025-5718-1971-0301966-0/home.html
117db96d56Sopenharmony_ci
127db96d56Sopenharmony_ci
137db96d56Sopenharmony_ciThe transform in a finite field can be used for convolution in the same
147db96d56Sopenharmony_ciway as the Fourier Transform. The main advantages of the Number Theoretic
157db96d56Sopenharmony_ciTransform are that it is both exact and very memory efficient.
167db96d56Sopenharmony_ci
177db96d56Sopenharmony_ci
187db96d56Sopenharmony_ciConvolution in pseudo-code:
197db96d56Sopenharmony_ci~~~~~~~~~~~~~~~~~~~~~~~~~~~
207db96d56Sopenharmony_ci
217db96d56Sopenharmony_ci  fnt_convolute(a, b):
227db96d56Sopenharmony_ci    x = fnt(a)                          # forward transform of a
237db96d56Sopenharmony_ci    y = fnt(b)                          # forward transform of b
247db96d56Sopenharmony_ci    z = pairwise multiply x[i] and y[i]
257db96d56Sopenharmony_ci    result = inv_fnt(z)                 # backward transform of z.
267db96d56Sopenharmony_ci
277db96d56Sopenharmony_ci
287db96d56Sopenharmony_ciExtending the maximum transform length (Chinese Remainder Theorem):
297db96d56Sopenharmony_ci-------------------------------------------------------------------
307db96d56Sopenharmony_ci
317db96d56Sopenharmony_ciThe maximum transform length is quite limited when using a single
327db96d56Sopenharmony_ciprime field. However, it is possible to use multiple primes and
337db96d56Sopenharmony_cirecover the result using the Chinese Remainder Theorem.
347db96d56Sopenharmony_ci
357db96d56Sopenharmony_ci
367db96d56Sopenharmony_ciMultiplication in pseudo-code:
377db96d56Sopenharmony_ci~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
387db96d56Sopenharmony_ci
397db96d56Sopenharmony_ci  _mpd_fntmul(u, v):
407db96d56Sopenharmony_ci    c1 = fnt_convolute(u, v, P1)  # convolute modulo prime1
417db96d56Sopenharmony_ci    c2 = fnt_convolute(u, v, P2)  # convolute modulo prime2
427db96d56Sopenharmony_ci    c3 = fnt_convolute(u, v, P3)  # convolute modulo prime3
437db96d56Sopenharmony_ci    result = crt3(c1, c2, c3)     # Chinese Remainder Theorem
447db96d56Sopenharmony_ci
457db96d56Sopenharmony_ci
467db96d56Sopenharmony_ciOptimized transform functions:
477db96d56Sopenharmony_ci------------------------------
487db96d56Sopenharmony_ci
497db96d56Sopenharmony_ciThere are three different fnt() functions:
507db96d56Sopenharmony_ci
517db96d56Sopenharmony_ci   std_fnt: "standard" decimation in frequency transform for array lengths
527db96d56Sopenharmony_ci            of 2**n. Performs well up to 1024 words.
537db96d56Sopenharmony_ci
547db96d56Sopenharmony_ci   sixstep: Cache-friendly algorithm for array lengths of 2**n. Outperforms
557db96d56Sopenharmony_ci            std_fnt for large arrays.
567db96d56Sopenharmony_ci
577db96d56Sopenharmony_ci   fourstep: Algorithm for array lengths of 3 * 2**n. Also cache friendly
587db96d56Sopenharmony_ci             in large parts.
597db96d56Sopenharmony_ci
607db96d56Sopenharmony_ci
617db96d56Sopenharmony_ciList of bignum-only files:
627db96d56Sopenharmony_ci--------------------------
637db96d56Sopenharmony_ci
647db96d56Sopenharmony_ciFunctions from these files are only used in _mpd_fntmul().
657db96d56Sopenharmony_ci
667db96d56Sopenharmony_ci  umodarith.h    -> fast low level routines for unsigned modular arithmetic
677db96d56Sopenharmony_ci  numbertheory.c -> routines for setting up the FNT
687db96d56Sopenharmony_ci  difradix2.c    -> decimation in frequency transform, used as the
697db96d56Sopenharmony_ci                    "base case" by the following three files:
707db96d56Sopenharmony_ci
717db96d56Sopenharmony_ci      fnt.c          -> standard transform for smaller arrays
727db96d56Sopenharmony_ci      sixstep.c      -> transform large arrays of length 2**n
737db96d56Sopenharmony_ci      fourstep.c     -> transform arrays of length 3 * 2**n
747db96d56Sopenharmony_ci
757db96d56Sopenharmony_ci  convolute.c    -> do the actual fast convolution, using one of
767db96d56Sopenharmony_ci                    the three transform functions.
777db96d56Sopenharmony_ci  transpose.c    -> transpositions needed for the sixstep algorithm.
787db96d56Sopenharmony_ci  crt.c          -> Chinese Remainder Theorem: use information from three
797db96d56Sopenharmony_ci                    transforms modulo three different primes to get the
807db96d56Sopenharmony_ci                    final result.
817db96d56Sopenharmony_ci
827db96d56Sopenharmony_ci
837db96d56Sopenharmony_ci
84