|
|	stan.sa 3.3 7/29/91
|
|	The entry point stan computes the tangent of
|	an input argument;
|	stand does the same except for denormalized input.
|
|	Input: Double-extended number X in location pointed to
|		by address register a0.
|
|	Output: The value tan(X) returned in floating-point register Fp0.
|
|	Accuracy and Monotonicity: The returned result is within 3 ulp in
|		64 significant bits, i.e. within 0.5001 ulp to 53 bits if the
|		result is subsequently rounded to double precision. The
|		result is provably monotonic in double precision.
|
|	Speed: The program sTAN takes approximately 170 cycles for
|		input argument X such that |X| < 15Pi, which is the usual
|		situation.
|
|	Algorithm:
|
|	1. If |X| >= 15Pi or |X| < 2**(-40), go to 6.
|
|	2. Decompose X as X = N(Pi/2) + r where |r| <= Pi/4. Let
|		k = N mod 2, so in particular, k = 0 or 1.
|
|	3. If k is odd, go to 5.
|
|	4. (k is even) Tan(X) = tan(r) and tan(r) is approximated by a
|		rational function U/V where
|		U = r + r*s*(P1 + s*(P2 + s*P3)), and
|		V = 1 + s*(Q1 + s*(Q2 + s*(Q3 + s*Q4))), s = r*r.
|		Exit.
|
|	5. (k is odd) Tan(X) = -cot(r). Since tan(r) is approximated by a
|		rational function U/V where
|		U = r + r*s*(P1 + s*(P2 + s*P3)), and
|		V = 1 + s*(Q1 + s*(Q2 + s*(Q3 + s*Q4))), s = r*r,
|		-Cot(r) = -V/U. Exit.
|
|	6. If |X| > 1, go to 8.
|
|	7. (|X|<2**(-40)) Tan(X) = X. Exit.
|
|	8. Overwrite X by X := X rem 2Pi. Now that |X| <= Pi, go back to 2.
|

|		Copyright (C) Motorola, Inc. 1990
|			All Rights Reserved
|
|	For details on the license for this file, please see the
|	file, README, in this same directory.

|STAN	idnt	2,1 | Motorola 040 Floating Point Software Package

	|section	8

#include "fpsp.h"

BOUNDS1:	.long 0x3FD78000,0x4004BC7E
TWOBYPI:	.long 0x3FE45F30,0x6DC9C883

TANQ4:	.long 0x3EA0B759,0xF50F8688
TANP3:	.long 0xBEF2BAA5,0xA8924F04

TANQ3:	.long 0xBF346F59,0xB39BA65F,0x00000000,0x00000000

TANP2:	.long 0x3FF60000,0xE073D3FC,0x199C4A00,0x00000000

TANQ2:	.long 0x3FF90000,0xD23CD684,0x15D95FA1,0x00000000

TANP1:	.long 0xBFFC0000,0x8895A6C5,0xFB423BCA,0x00000000

TANQ1:	.long 0xBFFD0000,0xEEF57E0D,0xA84BC8CE,0x00000000

INVTWOPI: .long 0x3FFC0000,0xA2F9836E,0x4E44152A,0x00000000

TWOPI1:	.long 0x40010000,0xC90FDAA2,0x00000000,0x00000000
TWOPI2:	.long 0x3FDF0000,0x85A308D4,0x00000000,0x00000000

|--N*PI/2, -32 <= N <= 32, IN A LEADING TERM IN EXT. AND TRAILING
|--TERM IN SGL. NOTE THAT PI IS 64-BIT LONG, THUS N*PI/2 IS AT
|--MOST 69 BITS LONG.
	.global	PITBL
PITBL:
	.long 0xC0040000,0xC90FDAA2,0x2168C235,0x21800000
	.long 0xC0040000,0xC2C75BCD,0x105D7C23,0xA0D00000
	.long 0xC0040000,0xBC7EDCF7,0xFF523611,0xA1E80000
	.long 0xC0040000,0xB6365E22,0xEE46F000,0x21480000
	.long 0xC0040000,0xAFEDDF4D,0xDD3BA9EE,0xA1200000
	.long 0xC0040000,0xA9A56078,0xCC3063DD,0x21FC0000
	.long 0xC0040000,0xA35CE1A3,0xBB251DCB,0x21100000
	.long 0xC0040000,0x9D1462CE,0xAA19D7B9,0xA1580000
	.long 0xC0040000,0x96CBE3F9,0x990E91A8,0x21E00000
	.long 0xC0040000,0x90836524,0x88034B96,0x20B00000
	.long 0xC0040000,0x8A3AE64F,0x76F80584,0xA1880000
	.long 0xC0040000,0x83F2677A,0x65ECBF73,0x21C40000
	.long 0xC0030000,0xFB53D14A,0xA9C2F2C2,0x20000000
	.long 0xC0030000,0xEEC2D3A0,0x87AC669F,0x21380000
	.long 0xC0030000,0xE231D5F6,0x6595DA7B,0xA1300000
	.long 0xC0030000,0xD5A0D84C,0x437F4E58,0x9FC00000
	.long 0xC0030000,0xC90FDAA2,0x2168C235,0x21000000
	.long 0xC0030000,0xBC7EDCF7,0xFF523611,0xA1680000
	.long 0xC0030000,0xAFEDDF4D,0xDD3BA9EE,0xA0A00000
	.long 0xC0030000,0xA35CE1A3,0xBB251DCB,0x20900000
	.long 0xC0030000,0x96CBE3F9,0x990E91A8,0x21600000
	.long 0xC0030000,0x8A3AE64F,0x76F80584,0xA1080000
	.long 0xC0020000,0xFB53D14A,0xA9C2F2C2,0x1F800000
	.long 0xC0020000,0xE231D5F6,0x6595DA7B,0xA0B00000
	.long 0xC0020000,0xC90FDAA2,0x2168C235,0x20800000
	.long 0xC0020000,0xAFEDDF4D,0xDD3BA9EE,0xA0200000
	.long 0xC0020000,0x96CBE3F9,0x990E91A8,0x20E00000
	.long 0xC0010000,0xFB53D14A,0xA9C2F2C2,0x1F000000
	.long 0xC0010000,0xC90FDAA2,0x2168C235,0x20000000
	.long 0xC0010000,0x96CBE3F9,0x990E91A8,0x20600000
	.long 0xC0000000,0xC90FDAA2,0x2168C235,0x1F800000
	.long 0xBFFF0000,0xC90FDAA2,0x2168C235,0x1F000000
	.long 0x00000000,0x00000000,0x00000000,0x00000000
	.long 0x3FFF0000,0xC90FDAA2,0x2168C235,0x9F000000
	.long 0x40000000,0xC90FDAA2,0x2168C235,0x9F800000
	.long 0x40010000,0x96CBE3F9,0x990E91A8,0xA0600000
	.long 0x40010000,0xC90FDAA2,0x2168C235,0xA0000000
	.long 0x40010000,0xFB53D14A,0xA9C2F2C2,0x9F000000
	.long 0x40020000,0x96CBE3F9,0x990E91A8,0xA0E00000
	.long 0x40020000,0xAFEDDF4D,0xDD3BA9EE,0x20200000
	.long 0x40020000,0xC90FDAA2,0x2168C235,0xA0800000
	.long 0x40020000,0xE231D5F6,0x6595DA7B,0x20B00000
	.long 0x40020000,0xFB53D14A,0xA9C2F2C2,0x9F800000
	.long 0x40030000,0x8A3AE64F,0x76F80584,0x21080000
	.long 0x40030000,0x96CBE3F9,0x990E91A8,0xA1600000
	.long 0x40030000,0xA35CE1A3,0xBB251DCB,0xA0900000
	.long 0x40030000,0xAFEDDF4D,0xDD3BA9EE,0x20A00000
	.long 0x40030000,0xBC7EDCF7,0xFF523611,0x21680000
	.long 0x40030000,0xC90FDAA2,0x2168C235,0xA1000000
	.long 0x40030000,0xD5A0D84C,0x437F4E58,0x1FC00000
	.long 0x40030000,0xE231D5F6,0x6595DA7B,0x21300000
	.long 0x40030000,0xEEC2D3A0,0x87AC669F,0xA1380000
	.long 0x40030000,0xFB53D14A,0xA9C2F2C2,0xA0000000
	.long 0x40040000,0x83F2677A,0x65ECBF73,0xA1C40000
	.long 0x40040000,0x8A3AE64F,0x76F80584,0x21880000
	.long 0x40040000,0x90836524,0x88034B96,0xA0B00000
	.long 0x40040000,0x96CBE3F9,0x990E91A8,0xA1E00000
	.long 0x40040000,0x9D1462CE,0xAA19D7B9,0x21580000
	.long 0x40040000,0xA35CE1A3,0xBB251DCB,0xA1100000
	.long 0x40040000,0xA9A56078,0xCC3063DD,0xA1FC0000
	.long 0x40040000,0xAFEDDF4D,0xDD3BA9EE,0x21200000
	.long 0x40040000,0xB6365E22,0xEE46F000,0xA1480000
	.long 0x40040000,0xBC7EDCF7,0xFF523611,0x21E80000
	.long 0x40040000,0xC2C75BCD,0x105D7C23,0x20D00000
	.long 0x40040000,0xC90FDAA2,0x2168C235,0xA1800000

	.set	INARG,FP_SCR4

	.set	TWOTO63,L_SCR1
	.set	ENDFLAG,L_SCR2
	.set	N,L_SCR3

	| xref	t_frcinx
	|xref	t_extdnrm

	.global	stand
stand:
|--TAN(X) = X FOR DENORMALIZED X

	bra		t_extdnrm

	.global	stan
stan:
	fmovex		(%a0),%fp0	| ...LOAD INPUT

	movel		(%a0),%d0
	movew		4(%a0),%d0
	andil		#0x7FFFFFFF,%d0

	cmpil		#0x3FD78000,%d0		| ...|X| >= 2**(-40)?
	bges		TANOK1
	bra		TANSM
TANOK1:
	cmpil		#0x4004BC7E,%d0		| ...|X| < 15 PI?
	blts		TANMAIN
	bra		REDUCEX


TANMAIN:
|--THIS IS THE USUAL CASE, |X| < 15 PI.
|--THE ARGUMENT REDUCTION IS DONE BY TABLE LOOK UP.
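|--A MINIMAL C SKETCH OF THIS TABLE LOOK-UP REDUCTION, UNDER STATED
|--ASSUMPTIONS: reduce_small, pitbl_hi AND pitbl_lo ARE HYPOTHETICAL NAMES.
|--pitbl_hi[N+32] AND pitbl_lo[N+32] STAND IN FOR THE LEADING (EXT.) AND
|--TRAILING (SGL.) TERMS OF PITBL, two_by_pi STANDS IN FOR TWOBYPI, AND
|--lrintl() FROM <math.h> STANDS IN FOR THE FMOVE.L CONVERSION, WHICH
|--ROUNDS ACCORDING TO THE CURRENT ROUNDING MODE.
|--
|--	static long double reduce_small(long double x,
|--					const long double pitbl_hi[65],
|--					const float pitbl_lo[65],
|--					double two_by_pi, int *odd)
|--	{
|--		long n = lrintl(x * two_by_pi);
|--		long double r = (x - pitbl_hi[n + 32]) - pitbl_lo[n + 32];
|--
|--		*odd = (int)(n & 1);
|--		return r;
|--	}
|--
|--THE SKETCH RETURNS R = (X-Y1)-Y2 AND REPORTS WHETHER N IS ODD, WHICH IS
|--WHAT SELECTS THE NODD PATH.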
	fmovex		%fp0,%fp1
	fmuld		TWOBYPI,%fp1	| ...X*2/PI

|--HIDE THE NEXT TWO INSTRUCTIONS
	leal		PITBL+0x200,%a1 | ...TABLE OF N*PI/2, N = -32,...,32

|--FP1 IS NOW READY
	fmovel		%fp1,%d0		| ...CONVERT TO INTEGER

	asll		#4,%d0
	addal		%d0,%a1		| ...ADDRESS N*PIBY2 IN Y1, Y2

	fsubx		(%a1)+,%fp0	| ...X-Y1
|--HIDE THE NEXT ONE

	fsubs		(%a1),%fp0	| ...FP0 IS R = (X-Y1)-Y2

	rorl		#5,%d0
	andil		#0x80000000,%d0	| ...D0 WAS ODD IFF D0 < 0

TANCONT:

	cmpil		#0,%d0
	blt		NODD

	fmovex		%fp0,%fp1
	fmulx		%fp1,%fp1		| ...S = R*R

	fmoved		TANQ4,%fp3
	fmoved		TANP3,%fp2

	fmulx		%fp1,%fp3		| ...SQ4
	fmulx		%fp1,%fp2		| ...SP3

	faddd		TANQ3,%fp3	| ...Q3+SQ4
	faddx		TANP2,%fp2	| ...P2+SP3

	fmulx		%fp1,%fp3		| ...S(Q3+SQ4)
	fmulx		%fp1,%fp2		| ...S(P2+SP3)

	faddx		TANQ2,%fp3	| ...Q2+S(Q3+SQ4)
	faddx		TANP1,%fp2	| ...P1+S(P2+SP3)

	fmulx		%fp1,%fp3		| ...S(Q2+S(Q3+SQ4))
	fmulx		%fp1,%fp2		| ...S(P1+S(P2+SP3))

	faddx		TANQ1,%fp3	| ...Q1+S(Q2+S(Q3+SQ4))
	fmulx		%fp0,%fp2		| ...RS(P1+S(P2+SP3))

	fmulx		%fp3,%fp1		| ...S(Q1+S(Q2+S(Q3+SQ4)))


	faddx		%fp2,%fp0		| ...R+RS(P1+S(P2+SP3))


	fadds		#0x3F800000,%fp1	| ...1+S(Q1+...)

	fmovel		%d1,%fpcr		|restore users exceptions
	fdivx		%fp1,%fp0		|last inst - possible exception set

	bra		t_frcinx

NODD:
	fmovex		%fp0,%fp1
	fmulx		%fp0,%fp0		| ...S = R*R

	fmoved		TANQ4,%fp3
	fmoved		TANP3,%fp2

	fmulx		%fp0,%fp3		| ...SQ4
	fmulx		%fp0,%fp2		| ...SP3

	faddd		TANQ3,%fp3	| ...Q3+SQ4
	faddx		TANP2,%fp2	| ...P2+SP3

	fmulx		%fp0,%fp3		| ...S(Q3+SQ4)
	fmulx		%fp0,%fp2		| ...S(P2+SP3)

	faddx		TANQ2,%fp3	| ...Q2+S(Q3+SQ4)
	faddx		TANP1,%fp2	| ...P1+S(P2+SP3)

	fmulx		%fp0,%fp3		| ...S(Q2+S(Q3+SQ4))
	fmulx		%fp0,%fp2		| ...S(P1+S(P2+SP3))

	faddx		TANQ1,%fp3	| ...Q1+S(Q2+S(Q3+SQ4))
	fmulx		%fp1,%fp2		| ...RS(P1+S(P2+SP3))

	fmulx		%fp3,%fp0		| ...S(Q1+S(Q2+S(Q3+SQ4)))


	faddx		%fp2,%fp1		| ...R+RS(P1+S(P2+SP3))
	fadds		#0x3F800000,%fp0	| ...1+S(Q1+...)


	fmovex		%fp1,-(%sp)
	eoril		#0x80000000,(%sp)

	fmovel		%d1,%fpcr		|restore users exceptions
	fdivx		(%sp)+,%fp0	|last inst - possible exception set

	bra		t_frcinx

TANBORS:
|--IF |X| > 15PI, WE USE THE GENERAL ARGUMENT REDUCTION.
|--IF |X| < 2**(-40), RETURN X.
	cmpil		#0x3FFF8000,%d0
	bgts		REDUCEX

TANSM:

	fmovex		%fp0,-(%sp)
	fmovel		%d1,%fpcr		|restore users exceptions
	fmovex		(%sp)+,%fp0	|last inst - possible exception set

	bra		t_frcinx


REDUCEX:
|--WHEN REDUCEX IS USED, THE CODE WILL INEVITABLY BE SLOW.
|--THIS REDUCTION METHOD, HOWEVER, IS MUCH FASTER THAN USING
|--THE REMAINDER INSTRUCTION WHICH IS NOW IN SOFTWARE.

	fmovemx	%fp2-%fp5,-(%a7)	| ...save FP2 through FP5
	movel		%d2,-(%a7)
	fmoves		#0x00000000,%fp1

|--If compact form of abs(arg) in d0=$7ffeffff, argument is so large that
|--there is a danger of unwanted overflow in first LOOP iteration. In this
|--case, reduce argument by one remainder step to make subsequent reduction
|--safe.
	cmpil		#0x7ffeffff,%d0		|is argument dangerously large?
	bnes		LOOP
	movel		#0x7ffe0000,FP_SCR2(%a6)	|yes
|					;create 2**16383*PI/2
	movel		#0xc90fdaa2,FP_SCR2+4(%a6)
	clrl		FP_SCR2+8(%a6)
	ftstx		%fp0			|test sign of argument
	movel		#0x7fdc0000,FP_SCR3(%a6)	|create low half of 2**16383*
|					;PI/2 at FP_SCR3
	movel		#0x85a308d3,FP_SCR3+4(%a6)
	clrl		FP_SCR3+8(%a6)
	fblt		red_neg
	orw		#0x8000,FP_SCR2(%a6)	|positive arg
	orw		#0x8000,FP_SCR3(%a6)
red_neg:
	faddx		FP_SCR2(%a6),%fp0		|high part of reduction is exact
	fmovex		%fp0,%fp1		|save high result in fp1
	faddx		FP_SCR3(%a6),%fp0		|low part of reduction
	fsubx		%fp0,%fp1			|determine low component of result
	faddx		FP_SCR3(%a6),%fp1		|fp0/fp1 are reduced argument.

|--ON ENTRY, FP0 IS X, ON RETURN, FP0 IS X REM PI/2, |X| <= PI/4.
|--integer quotient will be stored in N
|--Intermediate remainder is 66-bit long; (R,r) in (FP0,FP1)

LOOP:
	fmovex		%fp0,INARG(%a6)	| ...+-2**K * F, 1 <= F < 2
	movew		INARG(%a6),%d0
	movel		%d0,%a1		| ...save a copy of D0
	andil		#0x00007FFF,%d0
	subil		#0x00003FFF,%d0	| ...D0 IS K
	cmpil		#28,%d0
	bles		LASTLOOP
CONTLOOP:
	subil		#27,%d0		| ...D0 IS L := K-27
	movel		#0,ENDFLAG(%a6)
	bras		WORK
LASTLOOP:
	clrl		%d0		| ...D0 IS L := 0
	movel		#1,ENDFLAG(%a6)

WORK:
|--FIND THE REMAINDER OF (R,r) W.R.T. 2**L * (PI/2). L IS SO CHOSEN
|--THAT INT( X * (2/PI) / 2**(L) ) < 2**29.

|--CREATE 2**(-L) * (2/PI), SIGN(INARG)*2**(63),
|--2**L * (PIby2_1), 2**L * (PIby2_2)

	movel		#0x00003FFE,%d2	| ...BIASED EXPO OF 2/PI
	subl		%d0,%d2		| ...BIASED EXPO OF 2**(-L)*(2/PI)

	movel		#0xA2F9836E,FP_SCR1+4(%a6)
	movel		#0x4E44152A,FP_SCR1+8(%a6)
	movew		%d2,FP_SCR1(%a6)	| ...FP_SCR1 is 2**(-L)*(2/PI)

	fmovex		%fp0,%fp2
	fmulx		FP_SCR1(%a6),%fp2
|--WE MUST NOW FIND INT(FP2). SINCE WE NEED THIS VALUE IN
|--FLOATING POINT FORMAT, THE TWO FMOVE'S FMOVE.L FP <--> N
|--WILL BE TOO INEFFICIENT. THE WAY AROUND IT IS THAT
|--(SIGN(INARG)*2**63 + FP2) - SIGN(INARG)*2**63 WILL GIVE
|--US THE DESIRED VALUE IN FLOATING POINT.

|--HIDE SIX CYCLES OF INSTRUCTION
	movel		%a1,%d2
	swap		%d2
	andil		#0x80000000,%d2
	oril		#0x5F000000,%d2	| ...D2 IS SIGN(INARG)*2**63 IN SGL
	movel		%d2,TWOTO63(%a6)

	movel		%d0,%d2
	addil		#0x00003FFF,%d2	| ...BIASED EXPO OF 2**L * (PI/2)

|--FP2 IS READY
	fadds		TWOTO63(%a6),%fp2	| ...THE FRACTIONAL PART OF FP2 IS ROUNDED

|--HIDE 4 CYCLES OF INSTRUCTION; creating 2**(L)*Piby2_1 and 2**(L)*Piby2_2
	movew		%d2,FP_SCR2(%a6)
	clrw		FP_SCR2+2(%a6)
	movel		#0xC90FDAA2,FP_SCR2+4(%a6)
	clrl		FP_SCR2+8(%a6)		| ...FP_SCR2 is 2**(L) * Piby2_1

|--FP2 IS READY
	fsubs		TWOTO63(%a6),%fp2		| ...FP2 is N

	addil		#0x00003FDD,%d0
	movew		%d0,FP_SCR3(%a6)
	clrw		FP_SCR3+2(%a6)
	movel		#0x85A308D3,FP_SCR3+4(%a6)
	clrl		FP_SCR3+8(%a6)		| ...FP_SCR3 is 2**(L) * Piby2_2

	movel		ENDFLAG(%a6),%d0

|--We are now ready to perform (R+r) - N*P1 - N*P2, P1 = 2**(L) * Piby2_1 and
|--P2 = 2**(L) * Piby2_2
	fmovex		%fp2,%fp4
	fmulx		FP_SCR2(%a6),%fp4		| ...W = N*P1
	fmovex		%fp2,%fp5
	fmulx		FP_SCR3(%a6),%fp5		| ...w = N*P2
	fmovex		%fp4,%fp3
|--we want P+p = W+w but |p| <= half ulp of P
|--Then, we need to compute A := R-P and a := r-p
	faddx		%fp5,%fp3			| ...FP3 is P
	fsubx		%fp3,%fp4			| ...W-P

	fsubx		%fp3,%fp0			| ...FP0 is A := R - P
	faddx		%fp5,%fp4			| ...FP4 is p = (W-P)+w

	fmovex		%fp0,%fp3			| ...FP3 is A
	fsubx		%fp4,%fp1			| ...FP1 is a := r - p

|--Now we need to normalize (A,a) to "new (R,r)" where R+r = A+a but
|--|r| <= half ulp of R.
	faddx		%fp1,%fp0			| ...FP0 is R := A+a
|--No need to calculate r if this is the last loop
	cmpil		#0,%d0
	bgt		RESTORE

|--Need to calculate r
	fsubx		%fp0,%fp3			| ...A-R
	faddx		%fp3,%fp1			| ...FP1 is r := (A-R)+a
	bra		LOOP

RESTORE:
	fmovel		%fp2,N(%a6)
	movel		(%a7)+,%d2
	fmovemx	(%a7)+,%fp2-%fp5


	movel		N(%a6),%d0
	rorl		#1,%d0


	bra		TANCONT

	|end
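
|	A minimal C sketch of the rational approximation used once the
|	argument has been reduced (U/V for even k, -V/U for odd k), under
|	stated assumptions: the name tan_kernel and the arrays p[] and q[]
|	are hypothetical. p holds P1..P3 and q holds Q1..Q4; their values
|	(TANP1..TANP3 and TANQ1..TANQ4) are not decoded here. The sketch
|	works in double, not in the extended precision used by this file,
|	so it illustrates the evaluation order only, not the accuracy.
|
|	static double tan_kernel(double r, int k,
|				 const double p[3], const double q[4])
|	{
|		double s = r * r;
|		double u = r + r * s * (p[0] + s * (p[1] + s * p[2]));
|		double v = 1.0 + s * (q[0] + s * (q[1] + s * (q[2] + s * q[3])));
|
|		return (k & 1) ? -v / u : u / v;
|	}
|
|	A caller would pass the reduced argument r with |r| <= Pi/4 and the
|	parity k = N mod 2 produced by either reduction path.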