|
|	stan.sa 3.3 7/29/91
|
|	The entry point stan computes the tangent of
|	an input argument;
|	stand does the same except for denormalized input.
|
|	Input: Double-extended number X in location pointed to
|		by address register a0.
|
|	Output: The value tan(X) returned in floating-point register fp0.
|
|	Accuracy and Monotonicity: The returned result is within 3 ulp in
|		64 significant bits, i.e. within 0.5001 ulp to 53 bits if the
|		result is subsequently rounded to double precision. The
|		result is provably monotonic in double precision.
|
|	Speed: The program sTAN takes approximately 170 cycles for
|		input argument X such that |X| < 15Pi, which is the usual
|		situation.
|
|	Algorithm:
|
|	1. If |X| >= 15Pi or |X| < 2**(-40), go to 6.
|
|	2. Decompose X as X = N(Pi/2) + r where |r| <= Pi/4. Let
|		k = N mod 2, so in particular, k = 0 or 1.
|
|	3. If k is odd, go to 5.
|
|	4. (k is even) Tan(X) = tan(r) and tan(r) is approximated by a
|		rational function U/V where
|		U = r + r*s*(P1 + s*(P2 + s*P3)), and
|		V = 1 + s*(Q1 + s*(Q2 + s*(Q3 + s*Q4))),  s = r*r.
|		Exit.
|
|	5. (k is odd) Tan(X) = -cot(r). Since tan(r) is approximated by a
|		rational function U/V where
|		U = r + r*s*(P1 + s*(P2 + s*P3)), and
|		V = 1 + s*(Q1 + s*(Q2 + s*(Q3 + s*Q4))), s = r*r,
|		-Cot(r) = -V/U. Exit.
|
|	6. If |X| > 1, go to 8.
|
|	7. (|X| < 2**(-40)) Tan(X) = X. Exit.
|
|	8. Overwrite X by X := X rem 2Pi. Now that |X| <= Pi, go back to 2.
|

|		Copyright (C) Motorola, Inc. 1990
|			All Rights Reserved
|
|	For details on the license for this file, please see the
|	file, README, in this same directory.

|STAN	idnt	2,1 | Motorola 040 Floating Point Software Package

	|section	8

#include "fpsp.h"

BOUNDS1:	.long 0x3FD78000,0x4004BC7E
TWOBYPI:	.long 0x3FE45F30,0x6DC9C883

TANQ4:	.long 0x3EA0B759,0xF50F8688
TANP3:	.long 0xBEF2BAA5,0xA8924F04

TANQ3:	.long 0xBF346F59,0xB39BA65F,0x00000000,0x00000000

TANP2:	.long 0x3FF60000,0xE073D3FC,0x199C4A00,0x00000000

TANQ2:	.long 0x3FF90000,0xD23CD684,0x15D95FA1,0x00000000

TANP1:	.long 0xBFFC0000,0x8895A6C5,0xFB423BCA,0x00000000

TANQ1:	.long 0xBFFD0000,0xEEF57E0D,0xA84BC8CE,0x00000000

INVTWOPI: .long 0x3FFC0000,0xA2F9836E,0x4E44152A,0x00000000

TWOPI1:	.long 0x40010000,0xC90FDAA2,0x00000000,0x00000000
TWOPI2:	.long 0x3FDF0000,0x85A308D4,0x00000000,0x00000000

|--N*PI/2, -32 <= N <= 32, IN A LEADING TERM IN EXT. AND TRAILING
|--TERM IN SGL. NOTE THAT PI IS 64-BIT LONG, THUS N*PI/2 IS AT
|--MOST 69 BITS LONG.
	.global	PITBL
PITBL:
	.long 0xC0040000,0xC90FDAA2,0x2168C235,0x21800000
	.long 0xC0040000,0xC2C75BCD,0x105D7C23,0xA0D00000
	.long 0xC0040000,0xBC7EDCF7,0xFF523611,0xA1E80000
	.long 0xC0040000,0xB6365E22,0xEE46F000,0x21480000
	.long 0xC0040000,0xAFEDDF4D,0xDD3BA9EE,0xA1200000
	.long 0xC0040000,0xA9A56078,0xCC3063DD,0x21FC0000
	.long 0xC0040000,0xA35CE1A3,0xBB251DCB,0x21100000
	.long 0xC0040000,0x9D1462CE,0xAA19D7B9,0xA1580000
	.long 0xC0040000,0x96CBE3F9,0x990E91A8,0x21E00000
	.long 0xC0040000,0x90836524,0x88034B96,0x20B00000
	.long 0xC0040000,0x8A3AE64F,0x76F80584,0xA1880000
	.long 0xC0040000,0x83F2677A,0x65ECBF73,0x21C40000
	.long 0xC0030000,0xFB53D14A,0xA9C2F2C2,0x20000000
	.long 0xC0030000,0xEEC2D3A0,0x87AC669F,0x21380000
	.long 0xC0030000,0xE231D5F6,0x6595DA7B,0xA1300000
	.long 0xC0030000,0xD5A0D84C,0x437F4E58,0x9FC00000
	.long 0xC0030000,0xC90FDAA2,0x2168C235,0x21000000
	.long 0xC0030000,0xBC7EDCF7,0xFF523611,0xA1680000
	.long 0xC0030000,0xAFEDDF4D,0xDD3BA9EE,0xA0A00000
	.long 0xC0030000,0xA35CE1A3,0xBB251DCB,0x20900000
	.long 0xC0030000,0x96CBE3F9,0x990E91A8,0x21600000
	.long 0xC0030000,0x8A3AE64F,0x76F80584,0xA1080000
	.long 0xC0020000,0xFB53D14A,0xA9C2F2C2,0x1F800000
	.long 0xC0020000,0xE231D5F6,0x6595DA7B,0xA0B00000
	.long 0xC0020000,0xC90FDAA2,0x2168C235,0x20800000
	.long 0xC0020000,0xAFEDDF4D,0xDD3BA9EE,0xA0200000
	.long 0xC0020000,0x96CBE3F9,0x990E91A8,0x20E00000
	.long 0xC0010000,0xFB53D14A,0xA9C2F2C2,0x1F000000
	.long 0xC0010000,0xC90FDAA2,0x2168C235,0x20000000
	.long 0xC0010000,0x96CBE3F9,0x990E91A8,0x20600000
	.long 0xC0000000,0xC90FDAA2,0x2168C235,0x1F800000
	.long 0xBFFF0000,0xC90FDAA2,0x2168C235,0x1F000000
	.long 0x00000000,0x00000000,0x00000000,0x00000000
	.long 0x3FFF0000,0xC90FDAA2,0x2168C235,0x9F000000
	.long 0x40000000,0xC90FDAA2,0x2168C235,0x9F800000
	.long 0x40010000,0x96CBE3F9,0x990E91A8,0xA0600000
	.long 0x40010000,0xC90FDAA2,0x2168C235,0xA0000000
	.long 0x40010000,0xFB53D14A,0xA9C2F2C2,0x9F000000
	.long 0x40020000,0x96CBE3F9,0x990E91A8,0xA0E00000
	.long 0x40020000,0xAFEDDF4D,0xDD3BA9EE,0x20200000
	.long 0x40020000,0xC90FDAA2,0x2168C235,0xA0800000
	.long 0x40020000,0xE231D5F6,0x6595DA7B,0x20B00000
	.long 0x40020000,0xFB53D14A,0xA9C2F2C2,0x9F800000
	.long 0x40030000,0x8A3AE64F,0x76F80584,0x21080000
	.long 0x40030000,0x96CBE3F9,0x990E91A8,0xA1600000
	.long 0x40030000,0xA35CE1A3,0xBB251DCB,0xA0900000
	.long 0x40030000,0xAFEDDF4D,0xDD3BA9EE,0x20A00000
	.long 0x40030000,0xBC7EDCF7,0xFF523611,0x21680000
	.long 0x40030000,0xC90FDAA2,0x2168C235,0xA1000000
	.long 0x40030000,0xD5A0D84C,0x437F4E58,0x1FC00000
	.long 0x40030000,0xE231D5F6,0x6595DA7B,0x21300000
	.long 0x40030000,0xEEC2D3A0,0x87AC669F,0xA1380000
	.long 0x40030000,0xFB53D14A,0xA9C2F2C2,0xA0000000
	.long 0x40040000,0x83F2677A,0x65ECBF73,0xA1C40000
	.long 0x40040000,0x8A3AE64F,0x76F80584,0x21880000
	.long 0x40040000,0x90836524,0x88034B96,0xA0B00000
	.long 0x40040000,0x96CBE3F9,0x990E91A8,0xA1E00000
	.long 0x40040000,0x9D1462CE,0xAA19D7B9,0x21580000
	.long 0x40040000,0xA35CE1A3,0xBB251DCB,0xA1100000
	.long 0x40040000,0xA9A56078,0xCC3063DD,0xA1FC0000
	.long 0x40040000,0xAFEDDF4D,0xDD3BA9EE,0x21200000
	.long 0x40040000,0xB6365E22,0xEE46F000,0xA1480000
	.long 0x40040000,0xBC7EDCF7,0xFF523611,0x21E80000
	.long 0x40040000,0xC2C75BCD,0x105D7C23,0x20D00000
	.long 0x40040000,0xC90FDAA2,0x2168C235,0xA1800000

	.set	INARG,FP_SCR4

	.set	TWOTO63,L_SCR1
	.set	ENDFLAG,L_SCR2
	.set	N,L_SCR3

	| xref	t_frcinx
	|xref	t_extdnrm

	.global	stand
stand:
|--TAN(X) = X FOR DENORMALIZED X

	bra		t_extdnrm

	.global	stan
stan:
	fmovex		(%a0),%fp0		| ...LOAD INPUT

	movel		(%a0),%d0		| ...SIGN AND EXPONENT IN UPPER WORD
	movew		4(%a0),%d0		| ...TOP 16 MANTISSA BITS IN LOWER WORD
	andil		#0x7FFFFFFF,%d0		| ...D0 IS COMPACT FORM OF |X|

	cmpil		#0x3FD78000,%d0		| ...|X| >= 2**(-40)?
	bges		TANOK1
	bra		TANSM
TANOK1:
	cmpil		#0x4004BC7E,%d0		| ...|X| < 15 PI?
	blts		TANMAIN
	bra		REDUCEX


TANMAIN:
|--THIS IS THE USUAL CASE, 2**(-40) <= |X| < 15 PI.
|--THE ARGUMENT REDUCTION IS DONE BY TABLE LOOK UP.
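|--ADDED SKETCH (NOT PART OF THE MOTOROLA SOURCE) OF THE LOOK UP THAT
|--FOLLOWS. IT ASSUMES THE FPCR SELECTS ROUND-TO-NEAREST ON ENTRY, WHICH
|--THIS FILE DOES NOT STATE:
|--	N  := ROUND( X * (2/PI) ), SO |N| <= 30 WHEN |X| < 15 PI
|--	Y  := PITBL + 0x200 + 16*N	(0x200 = 32*16 SKIPS N = -32,...,-1)
|--	Y1 := EXT. AT Y (12 BYTES), Y2 := SGL. AT Y+12 (4 BYTES)
|--	R  := (X - Y1) - Y2
|--BIT 0 OF N IS THEN ROTATED INTO THE SIGN BIT OF D0 FOR TANCONT.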
	fmovex		%fp0,%fp1
	fmuld		TWOBYPI,%fp1		| ...X*2/PI

|--HIDE THE NEXT TWO INSTRUCTIONS
	leal		PITBL+0x200,%a1		| ...TABLE OF N*PI/2, N = -32,...,32

|--FP1 IS NOW READY
	fmovel		%fp1,%d0		| ...CONVERT TO INTEGER

	asll		#4,%d0			| ...16 BYTES PER TABLE ENTRY
	addal		%d0,%a1			| ...ADDRESS N*PIBY2 IN Y1, Y2

	fsubx		(%a1)+,%fp0		| ...X-Y1
|--HIDE THE NEXT ONE

	fsubs		(%a1),%fp0		| ...FP0 IS R = (X-Y1)-Y2

	rorl		#5,%d0			| ...BIT 0 OF N INTO SIGN BIT
	andil		#0x80000000,%d0		| ...D0 WAS ODD IFF D0 < 0

TANCONT:

	cmpil		#0,%d0
	blt		NODD

	fmovex		%fp0,%fp1
	fmulx		%fp1,%fp1		| ...S = R*R

	fmoved		TANQ4,%fp3
	fmoved		TANP3,%fp2

	fmulx		%fp1,%fp3		| ...SQ4
	fmulx		%fp1,%fp2		| ...SP3

	faddd		TANQ3,%fp3		| ...Q3+SQ4
	faddx		TANP2,%fp2		| ...P2+SP3

	fmulx		%fp1,%fp3		| ...S(Q3+SQ4)
	fmulx		%fp1,%fp2		| ...S(P2+SP3)

	faddx		TANQ2,%fp3		| ...Q2+S(Q3+SQ4)
	faddx		TANP1,%fp2		| ...P1+S(P2+SP3)

	fmulx		%fp1,%fp3		| ...S(Q2+S(Q3+SQ4))
	fmulx		%fp1,%fp2		| ...S(P1+S(P2+SP3))

	faddx		TANQ1,%fp3		| ...Q1+S(Q2+S(Q3+SQ4))
	fmulx		%fp0,%fp2		| ...RS(P1+S(P2+SP3))

	fmulx		%fp3,%fp1		| ...S(Q1+S(Q2+S(Q3+SQ4)))


	faddx		%fp2,%fp0		| ...R+RS(P1+S(P2+SP3))


	fadds		#0x3F800000,%fp1	| ...1+S(Q1+...)

	fmovel		%d1,%fpcr		|restore user's exceptions
	fdivx		%fp1,%fp0		|last inst - possible exception set

	bra		t_frcinx

NODD:
	fmovex		%fp0,%fp1
	fmulx		%fp0,%fp0		| ...S = R*R

	fmoved		TANQ4,%fp3
	fmoved		TANP3,%fp2

	fmulx		%fp0,%fp3		| ...SQ4
	fmulx		%fp0,%fp2		| ...SP3

	faddd		TANQ3,%fp3		| ...Q3+SQ4
	faddx		TANP2,%fp2		| ...P2+SP3

	fmulx		%fp0,%fp3		| ...S(Q3+SQ4)
	fmulx		%fp0,%fp2		| ...S(P2+SP3)

	faddx		TANQ2,%fp3		| ...Q2+S(Q3+SQ4)
	faddx		TANP1,%fp2		| ...P1+S(P2+SP3)

	fmulx		%fp0,%fp3		| ...S(Q2+S(Q3+SQ4))
	fmulx		%fp0,%fp2		| ...S(P1+S(P2+SP3))

	faddx		TANQ1,%fp3		| ...Q1+S(Q2+S(Q3+SQ4))
	fmulx		%fp1,%fp2		| ...RS(P1+S(P2+SP3))

	fmulx		%fp3,%fp0		| ...S(Q1+S(Q2+S(Q3+SQ4)))


	faddx		%fp2,%fp1		| ...R+RS(P1+S(P2+SP3))
	fadds		#0x3F800000,%fp0	| ...1+S(Q1+...)


	fmovex		%fp1,-(%sp)
	eoril		#0x80000000,(%sp)	| ...NEGATE U ON THE STACK

	fmovel		%d1,%fpcr		|restore user's exceptions
	fdivx		(%sp)+,%fp0		|last inst - possible exception set

	bra		t_frcinx

TANBORS:
|--IF |X| >= 15PI, WE USE THE GENERAL ARGUMENT REDUCTION.
|--IF |X| < 2**(-40), RETURN X.
	cmpil		#0x3FFF8000,%d0
	bgts		REDUCEX

TANSM:

	fmovex		%fp0,-(%sp)
	fmovel		%d1,%fpcr		|restore user's exceptions
	fmovex		(%sp)+,%fp0		|last inst - possible exception set

	bra		t_frcinx


REDUCEX:
|--WHEN REDUCEX IS USED, THE CODE WILL INEVITABLY BE SLOW.
|--THIS REDUCTION METHOD, HOWEVER, IS MUCH FASTER THAN USING
|--THE REMAINDER INSTRUCTION WHICH IS NOW IN SOFTWARE.
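|--ADDED SKETCH (NOT PART OF THE MOTOROLA SOURCE) OF ONE PASS OF THE
|--REDUCTION LOOP, WITH X HELD AS THE PAIR (R,r) AND K THE UNBIASED
|--EXPONENT OF R. "ROUND" ASSUMES THE ROUNDING MODE IN EFFECT IS
|--ROUND-TO-NEAREST, WHICH THIS FILE DOES NOT STATE:
|--	L := K-27 IF K > 28, ELSE L := 0 (LAST PASS)
|--	N := ROUND( R * 2**(-L) * (2/PI) )
|--	(R,r) := (R,r) - N * 2**L * (Piby2_1 + Piby2_2)
|--PASSES REPEAT UNTIL THE LAST ONE; BIT 0 OF THE FINAL N GIVES k.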

	fmovemx	%fp2-%fp5,-(%a7)	| ...save FP2 through FP5
	movel		%d2,-(%a7)
	fmoves		#0x00000000,%fp1

|--If compact form of abs(arg) in d0 = 0x7ffeffff, argument is so large that
|--there is a danger of unwanted overflow in first LOOP iteration. In this
|--case, reduce argument by one remainder step to make subsequent reduction
|--safe.
	cmpil		#0x7ffeffff,%d0		|is argument dangerously large?
	bnes		LOOP
	movel		#0x7ffe0000,FP_SCR2(%a6)	|yes
|					;create 2**16383*PI/2
	movel		#0xc90fdaa2,FP_SCR2+4(%a6)
	clrl		FP_SCR2+8(%a6)
	ftstx		%fp0			|test sign of argument
	movel		#0x7fdc0000,FP_SCR3(%a6)	|create low half of 2**16383*
|					;PI/2 at FP_SCR3
	movel		#0x85a308d3,FP_SCR3+4(%a6)
	clrl		FP_SCR3+8(%a6)
	fblt		red_neg
	orw		#0x8000,FP_SCR2(%a6)	|positive arg
	orw		#0x8000,FP_SCR3(%a6)
red_neg:
	faddx		FP_SCR2(%a6),%fp0	|high part of reduction is exact
	fmovex		%fp0,%fp1		|save high result in fp1
	faddx		FP_SCR3(%a6),%fp0	|low part of reduction
	fsubx		%fp0,%fp1		|determine low component of result
	faddx		FP_SCR3(%a6),%fp1	|fp0/fp1 are reduced argument.

|--ON ENTRY, FP0 IS X, ON RETURN, FP0 IS X REM PI/2, |X| <= PI/4.
|--integer quotient will be stored in N
|--Intermediate remainder is 66-bit long; (R,r) in (FP0,FP1)

LOOP:
	fmovex		%fp0,INARG(%a6)	| ...+-2**K * F, 1 <= F < 2
	movew		INARG(%a6),%d0
	movel		%d0,%a1		| ...save a copy of D0
	andil		#0x00007FFF,%d0
	subil		#0x00003FFF,%d0	| ...D0 IS K
	cmpil		#28,%d0
	bles		LASTLOOP
CONTLOOP:
	subil		#27,%d0		| ...D0 IS L := K-27
	movel		#0,ENDFLAG(%a6)
	bras		WORK
LASTLOOP:
	clrl		%d0		| ...D0 IS L := 0
	movel		#1,ENDFLAG(%a6)

WORK:
|--FIND THE REMAINDER OF (R,r) W.R.T. 2**L * (PI/2). L IS SO CHOSEN
|--THAT INT( X * (2/PI) / 2**(L) ) < 2**29.

|--CREATE 2**(-L) * (2/PI), SIGN(INARG)*2**(63),
|--2**L * (PIby2_1), 2**L * (PIby2_2)

	movel		#0x00003FFE,%d2	| ...BIASED EXPO OF 2/PI
	subl		%d0,%d2		| ...BIASED EXPO OF 2**(-L)*(2/PI)

	movel		#0xA2F9836E,FP_SCR1+4(%a6)
	movel		#0x4E44152A,FP_SCR1+8(%a6)
	movew		%d2,FP_SCR1(%a6)	| ...FP_SCR1 is 2**(-L)*(2/PI)

	fmovex		%fp0,%fp2
	fmulx		FP_SCR1(%a6),%fp2
|--WE MUST NOW FIND INT(FP2).
|--SINCE WE NEED THIS VALUE IN
|--FLOATING POINT FORMAT, THE TWO FMOVE'S FMOVE.L FP <--> N
|--WILL BE TOO INEFFICIENT. THE WAY AROUND IT IS THAT
|--(SIGN(INARG)*2**63 + FP2) - SIGN(INARG)*2**63 WILL GIVE
|--US THE DESIRED VALUE IN FLOATING POINT.

|--HIDE SIX CYCLES OF INSTRUCTION
	movel		%a1,%d2
	swap		%d2
	andil		#0x80000000,%d2
	oril		#0x5F000000,%d2	| ...D2 IS SIGN(INARG)*2**63 IN SGL
	movel		%d2,TWOTO63(%a6)

	movel		%d0,%d2
	addil		#0x00003FFF,%d2	| ...BIASED EXPO OF 2**L * (PI/2)

|--FP2 IS READY
	fadds		TWOTO63(%a6),%fp2	| ...THE FRACTIONAL PART OF FP2 IS ROUNDED

|--HIDE 4 CYCLES OF INSTRUCTION; creating 2**(L)*Piby2_1 and 2**(L)*Piby2_2
	movew		%d2,FP_SCR2(%a6)
	clrw		FP_SCR2+2(%a6)
	movel		#0xC90FDAA2,FP_SCR2+4(%a6)
	clrl		FP_SCR2+8(%a6)	| ...FP_SCR2 is 2**(L) * Piby2_1

|--FP2 IS READY
	fsubs		TWOTO63(%a6),%fp2	| ...FP2 is N

	addil		#0x00003FDD,%d0
	movew		%d0,FP_SCR3(%a6)
	clrw		FP_SCR3+2(%a6)
	movel		#0x85A308D3,FP_SCR3+4(%a6)
	clrl		FP_SCR3+8(%a6)	| ...FP_SCR3 is 2**(L) * Piby2_2

	movel		ENDFLAG(%a6),%d0

|--We are now ready to perform (R+r) - N*P1 -
|--N*P2, P1 = 2**(L) * Piby2_1 and
|--P2 = 2**(L) * Piby2_2
	fmovex		%fp2,%fp4
	fmulx		FP_SCR2(%a6),%fp4	| ...W = N*P1
	fmovex		%fp2,%fp5
	fmulx		FP_SCR3(%a6),%fp5	| ...w = N*P2
	fmovex		%fp4,%fp3
|--we want P+p = W+w but |p| <= half ulp of P
|--Then, we need to compute A := R-P and a := r-p
	faddx		%fp5,%fp3		| ...FP3 is P
	fsubx		%fp3,%fp4		| ...W-P

	fsubx		%fp3,%fp0		| ...FP0 is A := R - P
	faddx		%fp5,%fp4		| ...FP4 is p = (W-P)+w

	fmovex		%fp0,%fp3		| ...FP3 is A
	fsubx		%fp4,%fp1		| ...FP1 is a := r - p

|--Now we need to normalize (A,a) to "new (R,r)" where R+r = A+a but
|--|r| <= half ulp of R.
	faddx		%fp1,%fp0		| ...FP0 is R := A+a
|--No need to calculate r if this is the last loop
	cmpil		#0,%d0
	bgt		RESTORE

|--Need to calculate r
	fsubx		%fp0,%fp3		| ...A-R
	faddx		%fp3,%fp1		| ...FP1 is r := (A-R)+a
	bra		LOOP

RESTORE:
	fmovel		%fp2,N(%a6)
	movel		(%a7)+,%d2
	fmovemx	(%a7)+,%fp2-%fp5


	movel		N(%a6),%d0
	rorl		#1,%d0


	bra		TANCONT

	|end