 +---------------------------------------------------------------------------+
 |  wm-FPU-emu   an FPU emulator for 80386 and 80486SX microprocessors.      |
 |                                                                           |
 | Copyright (C) 1992,1993,1994,1995,1996,1997,1999                          |
 |                       W. Metzenthen, 22 Parker St, Ormond, Vic 3163,      |
 |                       Australia.  E-mail   billm@melbpc.org.au            |
 |                                                                           |
 |    This program is free software; you can redistribute it and/or modify   |
 |    it under the terms of the GNU General Public License version 2 as      |
 |    published by the Free Software Foundation.                             |
 |                                                                           |
 |    This program is distributed in the hope that it will be useful,        |
 |    but WITHOUT ANY WARRANTY; without even the implied warranty of         |
 |    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the          |
 |    GNU General Public License for more details.                           |
 |                                                                           |
 |    You should have received a copy of the GNU General Public License      |
 |    along with this program; if not, write to the Free Software            |
 |    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.              |
 |                                                                           |
 +---------------------------------------------------------------------------+



wm-FPU-emu is an FPU emulator for Linux. It is derived from wm-emu387
which was my 80387 emulator for early versions of djgpp (gcc under
msdos); wm-emu387 was in turn based upon emu387 which was written by
DJ Delorie for djgpp. The interface to the Linux kernel is based upon
the original Linux math emulator by Linus Torvalds.

My target FPU for wm-FPU-emu is that described in the Intel486
Programmer's Reference Manual (1992 edition). Unfortunately, numerous
facets of the functioning of the FPU are not well covered in the
Reference Manual. The information in the manual has been supplemented
with measurements on real 80486's. Unfortunately, it is simply not
possible to be sure that all of the peculiarities of the 80486 have
been discovered, so there are always likely to be obscure differences
in the detailed behaviour of the emulator and a real 80486.

wm-FPU-emu does not implement all of the behaviour of the 80486 FPU,
but is very close. See "Limitations" later in this file for a list of
some differences.

Please report bugs, etc to me at:
       billm@melbpc.org.au
or     b.metzenthen@medoto.unimelb.edu.au

For more information on the emulator and on floating point topics, see
my web pages, currently at  http://www.suburbia.net/~billm/


--Bill Metzenthen
  December 1999


----------------------- Internals of wm-FPU-emu -----------------------

Numeric algorithms:
(1) Add, subtract, and multiply. Nothing remarkable in these.
(2) Divide has been tuned to get reasonable performance. The algorithm
    is not the obvious one which most people seem to use, but is designed
    to take advantage of the characteristics of the 80386. I expect that
    it has been invented many times before I discovered it, but I have not
    seen it. It is based upon one of those ideas which one carries around
    for years without ever bothering to check it out.
(3) The sqrt function has been tuned to get good performance. It is based
    upon Newton's classic method. Performance was improved by capitalizing
    upon the properties of Newton's method, and the code is once again
    structured taking account of the 80386 characteristics.
(4) The trig, log, and exp functions are based in each case upon quasi-
    "optimal" polynomial approximations. My definition of "optimal" was
    based upon getting good accuracy with reasonable speed.
(5) The argument reducing code for the trig function effectively uses
    a value of pi which is accurate to more than 128 bits. As a consequence,
    the reduced argument is accurate to more than 64 bits for arguments up
    to a few pi, and accurate to more than 64 bits for most arguments,
    even for arguments approaching 2^63. This is far superior to an
    80486, which uses a value of pi which is accurate to 66 bits.

The code of the emulator is complicated slightly by the need to
account for a limited form of re-entrancy. Normally, the emulator will
emulate each FPU instruction to completion without interruption.
However, it may happen that when the emulator is accessing the user
memory space, swapping may be needed. In this case the emulator may be
temporarily suspended while disk i/o takes place. During this time
another process may use the emulator, thereby perhaps changing static
variables. The code which accesses user memory is confined to five
files:
    fpu_entry.c
    reg_ld_str.c
    load_store.c
    get_address.c
    errors.c
As of version 1.12 of the emulator, no static variables are used
(apart from those in the kernel's per-process tables). The emulator is
therefore now fully re-entrant, rather than having just the restricted
form of re-entrancy which is required by the Linux kernel.

----------------------- Limitations of wm-FPU-emu -----------------------

There are a number of differences between the current wm-FPU-emu
(version 2.01) and the 80486 FPU (apart from bugs). The differences
are fewer than those which applied to the 1.xx series of the emulator.
Some of the more important differences are listed below:

The Roundup flag does not have much meaning for the transcendental
functions and its 80486 value with these functions is likely to differ
from its emulator value.

In a few rare cases the Underflow flag obtained with the emulator will
be different from that obtained with an 80486. This occurs when the
following conditions apply simultaneously:
(a) the operands have a higher precision than the current setting of the
    precision control (PC) flags.
(b) the underflow exception is masked.
(c) the magnitude of the exact result (before rounding) is less than 2^-16382.
(d) the magnitude of the final result (after rounding) is exactly 2^-16382.
(e) the magnitude of the exact result would be exactly 2^-16382 if the
    operands were rounded to the current precision before the arithmetic
    operation was performed.
If all of these apply, the emulator will set the Underflow flag but a real
80486 will not.

NOTE: Certain formats of Extended Real are UNSUPPORTED. They are
unsupported by the 80486. They are the Pseudo-NaNs, Pseudoinfinities,
and Unnormals. None of these will be generated by an 80486 or by the
emulator. Do not use them. The emulator treats them differently in
detail from the way an 80486 does.

Self modifying code can cause the emulator to fail. An example of such
code is:
          movl %esp,[%ebx]
          fld1
The FPU instruction may be (usually will be) loaded into the pre-fetch
queue of the CPU before the movl instruction is executed. If the
destination of the 'movl' overlaps the FPU instruction then the bytes
in the prefetch queue and memory will be inconsistent when the FPU
instruction is executed. The emulator will be invoked but will not be
able to find the instruction which caused the device-not-present
exception. For this case, the emulator cannot emulate the behaviour of
an 80486DX.

Handling of the address size override prefix byte (0x67) has not been
extensively tested yet. A major problem exists because using it in
vm86 mode can cause a general protection fault. Address offsets
greater than 0xffff appear to be illegal in vm86 mode but are quite
acceptable (and work) in real mode. A small test program developed to
check the addressing, and which runs successfully in real mode,
crashes dosemu under Linux and also brings Windows down with a general
protection fault message when run under the MS-DOS prompt of Windows
3.1. (The program simply reads data from a valid address).

The emulator supports 16-bit protected mode, with one difference from
an 80486DX. An 80486DX will allow some floating point instructions to
write a few bytes below the lowest address of the stack. The emulator
will not allow this in 16-bit protected mode: no instructions are
allowed to write outside the bounds set by the protection.

----------------------- Performance of wm-FPU-emu -----------------------

Speed.
-----

The speed of floating point computation with the emulator will depend
upon instruction mix. Relative performance is best for the instructions
which require most computation. The simple instructions are adversely
affected by the FPU instruction trap overhead.


Timing: Some simple timing tests have been made on the emulator functions.
The times include load/store instructions. All times are in microseconds
measured on a 33MHz 386 with 64k cache. The Turbo C tests were under
ms-dos, the next two columns are for emulators running with the djgpp
ms-dos extender. The final column is for wm-FPU-emu in Linux 0.97,
using libm4.0 (hard).

function    Turbo C         djgpp 1.06      WM-emu387       wm-FPU-emu

   +        60.5            154.8           76.5            139.4
   -        61.1-65.5       157.3-160.8     76.2-79.5       142.9-144.7
   *        71.0            190.8           79.6            146.6
   /        61.2-75.0       261.4-266.9     75.3-91.6       142.2-158.1

 sin()      310.8           4692.0          319.0           398.5
 cos()      284.4           4855.2          308.0           388.7
 tan()      495.0           8807.1          394.9           504.7
 atan()     328.9           4866.4          601.1           419.5-491.9

 sqrt()     128.7           crashed         145.2           227.0
 log()      413.1-419.1     5103.4-5354.21  254.7-282.2     409.4-437.1
 exp()      479.1           6619.2          469.1           850.8


The performance under Linux is improved by the use of look-ahead code.
The following results show the improvement which is obtained under
Linux due to the look-ahead code. Also given are the times for the
original Linux emulator with the 4.1 'soft' lib.

 [ Linus' note: I changed look-ahead to be the default under linux, as
   there was no reason not to use it after I had edited it to be
   disabled during tracing ]

            wm-FPU-emu w    original w
            look-ahead      'soft' lib
   +        106.4           190.2
   -        108.6-111.6     192.4-216.2
   *        113.4           193.1
   /        108.8-124.4     700.1-706.2

 sin()      390.5           2642.0
 cos()      381.5           2767.4
 tan()      496.5           3153.3
 atan()     367.2-435.5     2439.4-3396.8

 sqrt()     195.1           4732.5
 log()      358.0-387.5     3359.2-3390.3
 exp()      619.3           4046.4


These figures are now somewhat out-of-date. The emulator has become
progressively slower for most functions as more of the 80486 features
have been implemented.


----------------------- Accuracy of wm-FPU-emu -----------------------


The accuracy of the emulator is in almost all cases equal to or better
than that of an Intel 80486 FPU.

The results of the basic arithmetic functions (+,-,*,/), and fsqrt
match those of an 80486 FPU. They are the best possible; the error for
these never exceeds 1/2 an lsb. The fprem and fprem1 instructions
return exact results; they have no error.


The following table compares the emulator accuracy for the sqrt(),
trig and log functions against the Turbo C "emulator". For this table,
each function was tested at about 400 points. Ideal worst-case results
would be 64 bits. The reduced Turbo C accuracy of cos() and tan() for
arguments greater than pi/4 can be thought of as being related to the
precision of the argument x; e.g. an argument of pi/2-(1e-10) which is
accurate to 64 bits can result in a relative accuracy in cos() of
about 64 + log2(cos(x)) = 31 bits.


Function    Tested x range          Worst result                Turbo C
                                    (relative bits)

sqrt(x)     1 .. 2                  64.1                        63.2
atan(x)     1e-10 .. 200            64.2                        62.8
cos(x)      0 .. pi/2-(1e-10)       64.4 (x <= pi/4)            62.4
                                    64.1 (x = pi/2-(1e-10))     31.9
sin(x)      1e-10 .. pi/2           64.0                        62.8
tan(x)      1e-10 .. pi/2-(1e-10)   64.0 (x <= pi/4)            62.1
                                    64.1 (x = pi/2-(1e-10))     31.9
exp(x)      0 .. 1                  63.1 **                     62.9
log(x)      1+1e-6 .. 2             63.8 **                     62.1

** The accuracy for exp() and log() is low because the FPU (emulator)
does not compute them directly; two operations are required.
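
The 31-bit estimate quoted for cos() at x = pi/2-(1e-10) can be checked
with a few lines of arithmetic. The following is only a sketch in
Python, not part of the emulator; the function name cos_relative_bits
and its parameters are made up for illustration. It evaluates the
expression 64 + log2(cos(x)) given in the text, using the identity
cos(pi/2 - d) = sin(d) so that the check itself does not suffer from
cancellation in double precision.

```python
import math


def cos_relative_bits(delta, argument_bits=64):
    """Evaluate argument_bits + log2(cos(x)) for x = pi/2 - delta.

    cos(pi/2 - delta) equals sin(delta); sin(delta) is used so that
    the subtraction from pi/2 is never carried out in double precision.
    The name and signature are illustrative assumptions only.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    return argument_bits + math.log2(math.sin(delta))


bits = cos_relative_bits(1e-10)
print(f"{bits:.2f}")  # prints 30.78, i.e. about 31 bits
```

Since log2(1e-10) is about -33.2, roughly 33 of the 64 bits are lost,
which agrees with the 31.9 bits measured for Turbo C at that argument.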


The emulator passes the "paranoia" tests (compiled with gcc 2.3.3 or
later) for 'float' variables (24 bit precision numbers) when precision
control is set to 24, 53 or 64 bits, and for 'double' variables (53
bit precision numbers) when precision control is set to 53 bits (a
properly performing FPU cannot pass the 'paranoia' tests for 'double'
variables when precision control is set to 64 bits).

The code for reducing the argument for the trig functions (fsin, fcos,
fptan and fsincos) has been improved and now effectively uses a value
for pi which is accurate to more than 128 bits precision. As a
consequence, the accuracy of these functions for large arguments has
been dramatically improved (and is now very much better than an 80486
FPU). There is also now no degradation of accuracy for fcos and fptan
for operands close to pi/2. Measured results are (note that the
definition of accuracy has changed slightly from that used for the
above table):

Function    Tested x range          Worst result
                                    (absolute bits)

cos(x)      0 .. 9.22e+18           62.0
sin(x)      1e-16 .. 9.22e+18       62.1
tan(x)      1e-16 .. 9.22e+18       61.8

It is possible with some effort to find very large arguments which
give much degraded precision. For example, the integer number
           8227740058411162616.0
is within about 1e-7 of a multiple of pi. To find the tan (for
example) of this number to 64 bits precision it would be necessary to
have a value of pi which had about 150 bits precision. The FPU
emulator computes the result to about 42.6 bits precision (the correct
result is about -9.739715e-8). On the other hand, an 80486 FPU returns
0.01059, which in relative terms is hopelessly inaccurate.

For arguments close to critical angles (which occur at multiples of
pi/2) the emulator is more accurate than an 80486 FPU. For very large
arguments, the emulator is far more accurate.


Prior to version 1.20 of the emulator, the accuracy of the results for
the transcendental functions (in their principal range) was not as
good as the results from an 80486 FPU. From version 1.20, the accuracy
has been considerably improved and these functions now give measured
worst-case results which are better than the worst-case results given
by an 80486 FPU.

The following table gives the measured results for the emulator. The
number of randomly selected arguments in each case is about half a
million.
The group of three columns gives the frequency of the given
accuracy in number of times per million; thus the second of these
columns shows that an accuracy of between 63.80 and 63.89 bits was
found at a rate of 133 times per one million measurements for fsin.
The results show that the fsin, fcos and fptan instructions return
results which are in error (i.e. less accurate than the best possible
result (which is 64 bits)) for about one per cent of all arguments
between -pi/2 and +pi/2. The other instructions have a lower
frequency of results which are in error. The last two columns give
the worst accuracy which was found (in bits) and the approximate value
of the argument which produced it.

                                frequency (per M)
                                -------------------  ----------------
instr   arg range      # tests   63.7   63.8   63.9   worst  at arg
                                 bits   bits   bits    bits
-----   ------------   -------   ----   ----  -----   -----  --------
fsin    (0,pi/2)        547756      0    133  10673   63.89  0.451317
fcos    (0,pi/2)        547563      0    126  10532   63.85  0.700801
fptan   (0,pi/2)        536274     11    267  10059   63.74  0.784876
fpatan  4 quadrants     517087      0      8   1855   63.88  0.435121 (4q)
fyl2x   (0,20)          541861      0      0   1323   63.94  1.40923  (x)
fyl2xp1 (-.293,.414)    520256      0      0   5678   63.93  0.408542 (x)
f2xm1   (-1,1)          538847      4    481   6488   63.79  0.167709


Tests performed on an 80486 FPU showed results of lower accuracy. The
following table gives the results which were obtained with an AMD
486DX2/66 (other tests indicate that an Intel 486DX produces
identical results). The tests were basically the same as those used
to measure the emulator (the values, being random, were in general not
the same). The total number of tests for each instruction is given
at the end of the table; in each case about 100k tests were performed.
Another line of figures at the end of the table shows that most of the
instructions return results which are in error for more than 10
percent of the arguments tested.

The numbers in the body of the table give the approx number of times a
result of the given accuracy in bits (given in the left-most column)
was obtained per one million arguments. For three of the instructions,
two columns of results are given:
* The second column for f2xm1 gives the number of cases where the
  results of the first column were for a positive argument; this shows
  that this instruction gives better results for positive arguments
  than it does for negative.
* In the cases of fcos and fptan, the first column gives the results
  after all cases with arguments greater than 1.5 were removed from
  the results given in the second column.
Unlike the emulator, an 80486 FPU returns results of relatively poor
accuracy for these instructions when the argument approaches pi/2.
The table does not show those cases when the accuracy of the results
was less than 62 bits, which occurs quite often for fcos and fptan
when the argument approaches pi/2. This poor accuracy is discussed
above in relation to the Turbo C "emulator", and the accuracy of the
value of pi.


bits   f2xm1   f2xm1  fpatan    fcos    fcos   fyl2x fyl2xp1    fsin   fptan   fptan
62.0       0       0       0       0     437       0       0       0       0     925
62.1       0       0      10       0     894       0       0       0       0    1023
62.2      14       0       0       0    1033       0       0       0       0     945
62.3      57       0       0       0    1202       0       0       0       0    1023
62.4     385       0       0      10    1292       0      23       0       0    1178
62.5    1140       0       0     119    1649       0      39       0       0    1149
62.6    2037       0       0     189    1620       0      16       0       0    1169
62.7    5086      14       0     646    2315      10     101      35      39    1402
62.8    8818      86       0     984    3050      59     287     131     224    2036
62.9   11340    1355       0    2126    4153      79     605     357     321    1948
63.0   15557    4750       0    3319    5376     246    1281     862     808    2688
63.1   20016    8288       0    4620    6628     511    2569    1723    1510    3302
63.2   24945   11127      10    6588    8098    1120    4470    2968    2990    4724
63.3   25686   12382      69    8774   10682    1906    6775    4482    5474    7236
63.4   29219   14722      79   11109   12311    3094    9414    7259    8912   10587
63.5   30458   14936     393   13802   15014    5874   12666    9609   13762   15262
63.6   32439   16448    1277   17945   19028   10226   15537   14657   19158   20346
63.7   35031   16805    4067   23003   23947   18910   20116   21333   25001   26209
63.8   33251   15820    7673   24781   25675   24617   25354   24440   29433   30329
63.9   33293   16833   18529   28318   29233   31267   31470   27748   29676   30601

Per cent with error:
        30.9             3.2            18.5     9.8    13.1    11.6            17.4
Total arguments tested:
       70194   70099  101784  100641  100641  101799  128853  114893  102675  102675


------------------------- Contributors -------------------------------

A number of people have contributed to the development of the
emulator, often by just reporting bugs, sometimes with suggested
fixes, and a few kind people have provided me with access in one way
or another to an 80486 machine. Contributors include (to those people
who I may have forgotten, please forgive me):

Linus Torvalds
Tommy.Thorn@daimi.aau.dk
Andrew.Tridgell@anu.edu.au
Nick Holloway, alfie@dcs.warwick.ac.uk
Hermano Moura, moura@dcs.gla.ac.uk
Jon Jagger, J.Jagger@scp.ac.uk
Lennart Benschop
Brian Gallew, geek+@CMU.EDU
Thomas Staniszewski, ts3v+@andrew.cmu.edu
Martin Howell, mph@plasma.apana.org.au
M Saggaf, alsaggaf@athena.mit.edu
Peter Barker, PETER@socpsy.sci.fau.edu
tom@vlsivie.tuwien.ac.at
Dan Russel, russed@rpi.edu
Daniel Carosone, danielce@ee.mu.oz.au
cae@jpmorgan.com
Hamish Coleman, t933093@minyos.xx.rmit.oz.au
Bruce Evans, bde@kralizec.zeta.org.au
Timo Korvola, Timo.Korvola@hut.fi
Rick Lyons, rick@razorback.brisnet.org.au
Rick, jrs@world.std.com

...and numerous others who responded to my request for help with
a real 80486.