17db96d56Sopenharmony_ci
27db96d56Sopenharmony_ci
37db96d56Sopenharmony_ci(* Copyright (c) 2011-2020 Stefan Krah. All rights reserved. *)
47db96d56Sopenharmony_ci
57db96d56Sopenharmony_ci
67db96d56Sopenharmony_ciThe Matrix Fourier Transform:
77db96d56Sopenharmony_ci=============================
87db96d56Sopenharmony_ci
97db96d56Sopenharmony_ciIn libmpdec, the Matrix Fourier Transform [1] is called four-step transform
107db96d56Sopenharmony_ciafter a variant that appears in [2]. The algorithm requires that the input
117db96d56Sopenharmony_ciarray can be viewed as an R*C matrix.
127db96d56Sopenharmony_ci
137db96d56Sopenharmony_ciAll operations are done modulo p. For readability, the proofs drop all
147db96d56Sopenharmony_ciinstances of (mod p).
157db96d56Sopenharmony_ci
167db96d56Sopenharmony_ci
177db96d56Sopenharmony_ciAlgorithm four-step (forward transform):
187db96d56Sopenharmony_ci----------------------------------------
197db96d56Sopenharmony_ci
207db96d56Sopenharmony_ci  a := input array
217db96d56Sopenharmony_ci  d := len(a) = R * C
227db96d56Sopenharmony_ci  p := prime
237db96d56Sopenharmony_ci  w := primitive root of unity of the prime field
247db96d56Sopenharmony_ci  r := w**((p-1)/d)
257db96d56Sopenharmony_ci  A := output array
267db96d56Sopenharmony_ci
277db96d56Sopenharmony_ci  1) Apply a length R FNT to each column.
287db96d56Sopenharmony_ci
297db96d56Sopenharmony_ci  2) Multiply each matrix element (addressed by j*C+m) by r**(j*m).
307db96d56Sopenharmony_ci
317db96d56Sopenharmony_ci  3) Apply a length C FNT to each row.
327db96d56Sopenharmony_ci
337db96d56Sopenharmony_ci  4) Transpose the matrix.
347db96d56Sopenharmony_ci
357db96d56Sopenharmony_ci
367db96d56Sopenharmony_ciProof (forward transform):
377db96d56Sopenharmony_ci--------------------------
387db96d56Sopenharmony_ci
397db96d56Sopenharmony_ci  The algorithm can be derived starting from the regular definition of
407db96d56Sopenharmony_ci  the finite-field transform of length d:
417db96d56Sopenharmony_ci
427db96d56Sopenharmony_ci            d-1
437db96d56Sopenharmony_ci           ,----
447db96d56Sopenharmony_ci           \
457db96d56Sopenharmony_ci    A[k] =  |  a[l]  * r**(k * l)
467db96d56Sopenharmony_ci           /
477db96d56Sopenharmony_ci           `----
487db96d56Sopenharmony_ci           l = 0
497db96d56Sopenharmony_ci
507db96d56Sopenharmony_ci
517db96d56Sopenharmony_ci  The sum can be rearranged into the sum of the sums of columns:
527db96d56Sopenharmony_ci
537db96d56Sopenharmony_ci            C-1     R-1
547db96d56Sopenharmony_ci           ,----   ,----
557db96d56Sopenharmony_ci           \       \
567db96d56Sopenharmony_ci         =  |       |  a[i * C + j] * r**(k * (i * C + j))
577db96d56Sopenharmony_ci           /       /
587db96d56Sopenharmony_ci           `----   `----
597db96d56Sopenharmony_ci           j = 0   i = 0
607db96d56Sopenharmony_ci
617db96d56Sopenharmony_ci
627db96d56Sopenharmony_ci  Extracting a constant from the inner sum:
637db96d56Sopenharmony_ci
647db96d56Sopenharmony_ci            C-1           R-1
657db96d56Sopenharmony_ci           ,----         ,----
667db96d56Sopenharmony_ci           \             \
677db96d56Sopenharmony_ci         =  |  r**k*j  *  |  a[i * C + j] * r**(k * i * C)
687db96d56Sopenharmony_ci           /             /
697db96d56Sopenharmony_ci           `----         `----
707db96d56Sopenharmony_ci           j = 0         i = 0
717db96d56Sopenharmony_ci
727db96d56Sopenharmony_ci
737db96d56Sopenharmony_ci  Without any loss of generality, let k = n * R + m,
747db96d56Sopenharmony_ci  where n < C and m < R:
757db96d56Sopenharmony_ci
767db96d56Sopenharmony_ci                C-1                          R-1
777db96d56Sopenharmony_ci               ,----                        ,----
787db96d56Sopenharmony_ci               \                            \
797db96d56Sopenharmony_ci    A[n*R+m] =  |  r**(R*n*j) * r**(m*j)  *  |  a[i*C+j] * r**(R*C*n*i) * r**(C*m*i)
807db96d56Sopenharmony_ci               /                            /
817db96d56Sopenharmony_ci               `----                        `----
827db96d56Sopenharmony_ci               j = 0                        i = 0
837db96d56Sopenharmony_ci
847db96d56Sopenharmony_ci
857db96d56Sopenharmony_ci  Since r = w ** ((p-1) / (R*C)):
867db96d56Sopenharmony_ci
877db96d56Sopenharmony_ci     a) r**(R*C*n*i) = w**((p-1)*n*i) = 1
887db96d56Sopenharmony_ci
897db96d56Sopenharmony_ci     b) r**(C*m*i) = w**((p-1) / R) ** (m*i) = r_R ** (m*i)
907db96d56Sopenharmony_ci
917db96d56Sopenharmony_ci     c) r**(R*n*j) = w**((p-1) / C) ** (n*j) = r_C ** (n*j)
927db96d56Sopenharmony_ci
937db96d56Sopenharmony_ci     r_R := root of the subfield of length R.
947db96d56Sopenharmony_ci     r_C := root of the subfield of length C.
957db96d56Sopenharmony_ci
967db96d56Sopenharmony_ci
977db96d56Sopenharmony_ci                C-1                             R-1
987db96d56Sopenharmony_ci               ,----                           ,----
997db96d56Sopenharmony_ci               \                               \
1007db96d56Sopenharmony_ci    A[n*R+m] =  |  r_C**(n*j) * [ r**(m*j)  *   |  a[i*C+j] * r_R**(m*i) ]
1017db96d56Sopenharmony_ci               /                     ^         /
1027db96d56Sopenharmony_ci               `----                 |         `----    1) transform the columns
1037db96d56Sopenharmony_ci               j = 0                 |         i = 0
1047db96d56Sopenharmony_ci                 ^                   |
1057db96d56Sopenharmony_ci                 |                   `-- 2) multiply
1067db96d56Sopenharmony_ci                 |
1077db96d56Sopenharmony_ci                 `-- 3) transform the rows
1087db96d56Sopenharmony_ci
1097db96d56Sopenharmony_ci
1107db96d56Sopenharmony_ci   Note that the entire RHS is a function of n and m and that the results
1117db96d56Sopenharmony_ci   for each pair (n, m) are stored in Fortran order.
1127db96d56Sopenharmony_ci
1137db96d56Sopenharmony_ci   Let the term in square brackets be f(m, j). Step 1) and 2) precalculate
1147db96d56Sopenharmony_ci   the term for all (m, j). After that, the original matrix is now a lookup
1157db96d56Sopenharmony_ci   table with the mth element in the jth column at location m * C + j.
1167db96d56Sopenharmony_ci
1177db96d56Sopenharmony_ci   Let the complete RHS be g(m, n). Step 3) does an in-place transform of
1187db96d56Sopenharmony_ci   length n on all rows. After that, the original matrix is now a lookup
1197db96d56Sopenharmony_ci   table with the mth element in the nth column at location m * C + n.
1207db96d56Sopenharmony_ci
1217db96d56Sopenharmony_ci   But each (m, n) pair should be written to location n * R + m. Therefore,
1227db96d56Sopenharmony_ci   step 4) transposes the result of step 3).
1237db96d56Sopenharmony_ci
1247db96d56Sopenharmony_ci
1257db96d56Sopenharmony_ci
1267db96d56Sopenharmony_ciAlgorithm four-step (inverse transform):
1277db96d56Sopenharmony_ci----------------------------------------
1287db96d56Sopenharmony_ci
1297db96d56Sopenharmony_ci  A  := input array
1307db96d56Sopenharmony_ci  d  := len(A) = R * C
1317db96d56Sopenharmony_ci  p  := prime
1327db96d56Sopenharmony_ci  d' := d**(p-2)             # inverse of d
1337db96d56Sopenharmony_ci  w  := primitive root of unity of the prime field
1347db96d56Sopenharmony_ci  r  := w**((p-1)/d)         # root of the subfield
1357db96d56Sopenharmony_ci  r' := w**((p-1) - (p-1)/d) # inverse of r
1367db96d56Sopenharmony_ci  a  := output array
1377db96d56Sopenharmony_ci
1387db96d56Sopenharmony_ci  0) View the matrix as a C*R matrix.
1397db96d56Sopenharmony_ci
1407db96d56Sopenharmony_ci  1) Transpose the matrix, producing an R*C matrix.
1417db96d56Sopenharmony_ci
1427db96d56Sopenharmony_ci  2) Apply a length C FNT to each row.
1437db96d56Sopenharmony_ci
1447db96d56Sopenharmony_ci  3) Multiply each matrix element (addressed by i*C+n) by r**(i*n).
1457db96d56Sopenharmony_ci
1467db96d56Sopenharmony_ci  4) Apply a length R FNT to each column.
1477db96d56Sopenharmony_ci
1487db96d56Sopenharmony_ci
1497db96d56Sopenharmony_ciProof (inverse transform):
1507db96d56Sopenharmony_ci--------------------------
1517db96d56Sopenharmony_ci
1527db96d56Sopenharmony_ci  The algorithm can be derived starting from the regular definition of
1537db96d56Sopenharmony_ci  the finite-field inverse transform of length d:
1547db96d56Sopenharmony_ci
1557db96d56Sopenharmony_ci                  d-1
1567db96d56Sopenharmony_ci                 ,----
1577db96d56Sopenharmony_ci                 \
1587db96d56Sopenharmony_ci    a[k] =  d' *  |  A[l]  * r' ** (k * l)
1597db96d56Sopenharmony_ci                 /
1607db96d56Sopenharmony_ci                 `----
1617db96d56Sopenharmony_ci                 l = 0
1627db96d56Sopenharmony_ci
1637db96d56Sopenharmony_ci
1647db96d56Sopenharmony_ci  The sum can be rearranged into the sum of the sums of columns. Note
1657db96d56Sopenharmony_ci  that at this stage we still have a C*R matrix, so C denotes the number
1667db96d56Sopenharmony_ci  of rows:
1677db96d56Sopenharmony_ci
1687db96d56Sopenharmony_ci                  R-1     C-1
1697db96d56Sopenharmony_ci                 ,----   ,----
1707db96d56Sopenharmony_ci                 \       \
1717db96d56Sopenharmony_ci         =  d' *  |       |  a[j * R + i] * r' ** (k * (j * R + i))
1727db96d56Sopenharmony_ci                 /       /
1737db96d56Sopenharmony_ci                 `----   `----
1747db96d56Sopenharmony_ci                 i = 0   j = 0
1757db96d56Sopenharmony_ci
1767db96d56Sopenharmony_ci
1777db96d56Sopenharmony_ci  Extracting a constant from the inner sum:
1787db96d56Sopenharmony_ci
1797db96d56Sopenharmony_ci                  R-1                C-1
1807db96d56Sopenharmony_ci                 ,----              ,----
1817db96d56Sopenharmony_ci                 \                  \
1827db96d56Sopenharmony_ci         =  d' *  |  r' ** (k*i)  *  |  a[j * R + i] * r' ** (k * j * R)
1837db96d56Sopenharmony_ci                 /                  /
1847db96d56Sopenharmony_ci                 `----              `----
1857db96d56Sopenharmony_ci                 i = 0              j = 0
1867db96d56Sopenharmony_ci
1877db96d56Sopenharmony_ci
1887db96d56Sopenharmony_ci  Without any loss of generality, let k = m * C + n,
1897db96d56Sopenharmony_ci  where m < R and n < C:
1907db96d56Sopenharmony_ci
1917db96d56Sopenharmony_ci                     R-1                                  C-1
1927db96d56Sopenharmony_ci                    ,----                                ,----
1937db96d56Sopenharmony_ci                    \                                    \
1947db96d56Sopenharmony_ci    A[m*C+n] = d' *  |  r' ** (C*m*i) *  r' ** (n*i)   *  |  a[j*R+i] * r' ** (R*C*m*j) * r' ** (R*n*j)
1957db96d56Sopenharmony_ci                    /                                    /
1967db96d56Sopenharmony_ci                    `----                                `----
1977db96d56Sopenharmony_ci                    i = 0                                j = 0
1987db96d56Sopenharmony_ci
1997db96d56Sopenharmony_ci
2007db96d56Sopenharmony_ci  Since r' = w**((p-1) - (p-1)/d) and d = R*C:
2017db96d56Sopenharmony_ci
2027db96d56Sopenharmony_ci     a) r' ** (R*C*m*j) = w**((p-1)*R*C*m*j - (p-1)*m*j) = 1
2037db96d56Sopenharmony_ci
2047db96d56Sopenharmony_ci     b) r' ** (C*m*i) = w**((p-1)*C - (p-1)/R) ** (m*i) = r_R' ** (m*i)
2057db96d56Sopenharmony_ci
2067db96d56Sopenharmony_ci     c) r' ** (R*n*j) = r_C' ** (n*j)
2077db96d56Sopenharmony_ci
2087db96d56Sopenharmony_ci     d) d' = d**(p-2) = (R*C) ** (p-2) = R**(p-2) * C**(p-2) = R' * C'
2097db96d56Sopenharmony_ci
2107db96d56Sopenharmony_ci     r_R' := inverse of the root of the subfield of length R.
2117db96d56Sopenharmony_ci     r_C' := inverse of the root of the subfield of length C.
2127db96d56Sopenharmony_ci     R'   := inverse of R
2137db96d56Sopenharmony_ci     C'   := inverse of C
2147db96d56Sopenharmony_ci
2157db96d56Sopenharmony_ci
2167db96d56Sopenharmony_ci                     R-1                                      C-1
2177db96d56Sopenharmony_ci                    ,----                                    ,----  2) transform the rows of a^T
2187db96d56Sopenharmony_ci                    \                                        \
2197db96d56Sopenharmony_ci    A[m*C+n] = R' *  |  r_R' ** (m*i) * [ r' ** (n*i) * C' *  |  a[j*R+i] * r_C' ** (n*j) ]
2207db96d56Sopenharmony_ci                    /                           ^            /       ^
2217db96d56Sopenharmony_ci                    `----                       |            `----   |
2227db96d56Sopenharmony_ci                    i = 0                       |            j = 0   |
2237db96d56Sopenharmony_ci                      ^                         |                    `-- 1) Transpose input matrix
2247db96d56Sopenharmony_ci                      |                         `-- 3) multiply             to address elements by
2257db96d56Sopenharmony_ci                      |                                                     i * C + j
2267db96d56Sopenharmony_ci                      `-- 3) transform the columns
2277db96d56Sopenharmony_ci
2287db96d56Sopenharmony_ci
2297db96d56Sopenharmony_ci
2307db96d56Sopenharmony_ci   Note that the entire RHS is a function of m and n and that the results
2317db96d56Sopenharmony_ci   for each pair (m, n) are stored in C order.
2327db96d56Sopenharmony_ci
2337db96d56Sopenharmony_ci   Let the term in square brackets be f(n, i). Without step 1), the sum
2347db96d56Sopenharmony_ci   would perform a length C transform on the columns of the input matrix.
2357db96d56Sopenharmony_ci   This is a) inefficient and b) the results are needed in C order, so
2367db96d56Sopenharmony_ci   step 1) exchanges rows and columns.
2377db96d56Sopenharmony_ci
2387db96d56Sopenharmony_ci   Step 2) and 3) precalculate f(n, i) for all (n, i). After that, the
2397db96d56Sopenharmony_ci   original matrix is now a lookup table with the ith element in the nth
2407db96d56Sopenharmony_ci   column at location i * C + n.
2417db96d56Sopenharmony_ci
2427db96d56Sopenharmony_ci   Let the complete RHS be g(m, n). Step 4) does an in-place transform of
2437db96d56Sopenharmony_ci   length m on all columns. After that, the original matrix is now a lookup
2447db96d56Sopenharmony_ci   table with the mth element in the nth column at location m * C + n,
2457db96d56Sopenharmony_ci   which means that all A[k] = A[m * C + n] are in the correct order.
2467db96d56Sopenharmony_ci
2477db96d56Sopenharmony_ci
2487db96d56Sopenharmony_ci--
2497db96d56Sopenharmony_ci
2507db96d56Sopenharmony_ci  [1] Joerg Arndt: "Matters Computational"
2517db96d56Sopenharmony_ci      http://www.jjj.de/fxt/
2527db96d56Sopenharmony_ci  [2] David H. Bailey: FFTs in External or Hierarchical Memory
2537db96d56Sopenharmony_ci      http://crd.lbl.gov/~dhbailey/dhbpapers/
2547db96d56Sopenharmony_ci
2557db96d56Sopenharmony_ci
2567db96d56Sopenharmony_ci
257