

============================================================================
     This file describes the algorithms used in MAPM.         July 24, 2003
============================================================================
  (If this file doesn't look right, set your tab-stops to 8.)
============================================================================

ADD:            There is nothing really interesting about the ADD function.
                If the signs of the numbers are the same, continue with
                the add routine. If they are opposite, change one of them
                and call the subtract function. In order to perform the
                add, the number with the smaller exponent is scaled so
                the exponents match, then the byte-wise add is performed.

SUBTRACT:       There is nothing really interesting about the SUBTRACT
                function either. If the signs of the numbers are the same,
                continue with the subtract routine. If they are opposite,
                change one of them and call the add function. In order to
                perform the subtraction, the number with the smaller
                exponent is scaled so the exponents match, then the
                byte-wise subtract is performed.

MULTIPLY:       Three multiply functions are used :

                The multiply function uses the simple O(n^2) model for
                small input numbers.

                An FFT based multiplication is used next. The FFT used
                is from Takuya OOURA (email: ooura@mmm.t.u-tokyo.ac.jp).
                This method results in only O(N * Log2(N)) growth. This
                method is used up to approx 512K digits.

                For numbers > 512K digits, the following algorithm is used
                (sometimes called the divide-and-conquer technique).
                When the divide-and-conquer has divided down to 512K
                digits, the FFT algorithm is used to finish up.

                The divide-and-conquer method results in approx
                O(N ^ 1.585) growth.

                Assume we have 2 numbers (a & b) with 2N digits.

                let : a = (2^N) * A1 + A0 ,  b = (2^N) * B1 + B0

                where 'A1' is the 'most significant half' of 'a' and
                'A0' is the 'least significant half' of 'a'. Same for
                B1 and B0.

                Now use the identity :

                ab = (2^2N + 2^N) * A1B1 + 2^N * (A1-A0)(B0-B1) + (2^N + 1) * A0B0

                The original problem of multiplying 2 (2N) digit numbers
                has been reduced to 3 multiplications of N digit numbers
                plus some additions, subtractions, and shifts.

                This multiplication algorithm is used recursively.

RECIPROCAL:     Uses Newton's method with the following iteration :

                X     =  X * [ 2 - N * X ]
                 n+1

DIVIDE:         Two division functions are used :

                For dividing numbers < 250 digits :

                Knuth's division algorithm from 'The Art of Computer
                Programming, Volume 2' is used, with a slight modification.
                I determine right in step D3 whether q-hat is too big, so
                step D6 is unnecessary. I use the first 3 (base 100) digits
                of the numerator and the first 2 digits of the denominator
                to determine the trial divisor, instead of Knuth's 2 & 1.

                For dividing numbers >= 250 digits :

                Use the iterative reciprocal function from above and
                then multiply.

                NOTE: There is a subtle operational difference between
                these two division functions. The first one truncates the
                divide process at the specified digit (the result is not
                *rounded*). The second function, using the reciprocal,
                returns a rounded result. For example, when dividing
                (2 / 3) :

                the first function will return   -->  0.66.......66
                the second function will return  -->  0.66.......67

                The first function is used for the integer_divide function
                (see below); correct results there require that the result
                remain un-rounded. It is also used for < 250 digits because
                that algorithm is faster when the number of digits is small.
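
                As an illustration of the reciprocal iteration and of
                dividing by multiplying with the reciprocal, here is a
                minimal sketch in ordinary double-precision C. It is a toy
                demonstration only (the value 2/3 and the crude starting
                guess are arbitrary), not the MAPM arbitrary-precision
                code itself :

                    #include <stdio.h>
                    #include <math.h>

                    /* Newton's reciprocal iteration  x <- x * (2 - n*x)
                     * converges to 1/n, roughly doubling the number of
                     * correct digits on every pass.  A division is then
                     * just one multiplication by the reciprocal.        */
                    int main(void)
                    {
                        double n = 3.0;
                        double x = 0.2;   /* deliberately crude guess at 1/3 */
                        int    i;

                        for (i = 1; i <= 7; i++)
                        {
                            x = x * (2.0 - n * x);
                            printf("pass %d: err = %.3e\n", i, fabs(x - 1.0/3.0));
                        }
                        printf("2 / 3 = %.15f\n", 2.0 * x);
                        return 0;
                    }
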
INTEGER_DIVIDE: Calls Knuth's divide function with a few extra decimal
                places of precision and then truncates the result to an
                integer.

FACTORIAL:      Compute a series of partial products so that the number of
                significant digits is nearly the same for all partial
                products and is also near a power of 2. This is the
                optimum setup for the fast multiplication algorithm.
                See mapmfact.c for all the details.

RANDOM NUMBERS: Uses Knuth's 'The Art of Computer Programming, Volume 2'
                as the basis. Assuming the random number is X, compute
                (using all integer math) :

                X = (a * X + c)  MOD  m

                From Knuth :

                'm' should be large, at least 2^30 : we use 1.0E+15

                'a' should be between .01m and .99m and not have a simple
                pattern. 'a' should not have any large factors in common
                with 'm' and (since 'm' is a power of 10) if 'a' MOD 200
                = 21 then all 'm' different possible values will be
                generated before 'X' starts to repeat.

                We use 'a' = 716805947629621.

                'a' MOD 200 = 21 and 'a' is also a prime number. The file
                mapm_rnd.c contains many potential multipliers commented
                out that are all prime and meet 'a' MOD 200 = 21.

                There are few restrictions on 'c' except that 'c' can have
                no factor in common with 'm', hence we set 'c' = 'a'.

                On the first call, the system time is used to initialize X.

SQUARE ROOT:    Uses Newton's method with the following iteration, which
                really finds 1/sqrt(N); multiply by N when complete to
                get the final sqrt :

                X     =  0.5 * X * [ 3 - N * X^2 ]
                 n+1

CUBE ROOT:      Uses Newton's method with the following iteration, which
                really finds 1/cbrt(N); to find the final cbrt, compute
                N * X^2 :

                X     =  [ 4 * X - N * X^4 ] / 3
                 n+1

EXP:            The exp function uses a modification of the Taylor series.
                The algorithm was borrowed from David H. Bailey's 'MPFUN',
                a multiple precision floating point package for Fortran.
                Mr. Bailey's description for EXP follows verbatim :
                "
                Exp (t) =  (1 + r + r^2 / 2! + r^3 / 3! + r^4 / 4! ...) ^ q * 2 ^ n

                where q = 256, r = t' / q, t' = t - n Log(2) and where n is chosen so
                that -0.5 Log(2) < t' <= 0.5 Log(2). Reducing t mod Log(2) and
                dividing by 256 insures that -0.001 < r <= 0.001, which accelerates
                convergence in the above series.
                "

POW:            Calls the EXP function. The formula used is :

                X^Y  =  e^A      where  A = Y * log(X)

INTEGER POW:    Compute X^N when N is an integer. Uses Knuth's algorithm
                from 'The Art of Computer Programming, Volume 2' :

                A1 :  Y = 1,  Z = X
                A2 :  is N even or odd?  and then  N = N / 2
                      if N was even, goto step A5
                A3 :  Y = Z * Y
                A4 :  if N == 0, terminate with Y as the answer
                A5 :  Z = Z * Z
                A6 :  goto A2

LOG:            If the input is very close to '1', use a series expansion.
                Calculate log (1 + x) with the following series :

                y = x / (x + 2)          ( |y| < 1 )

                log [ (1 + y) / (1 - y) ]  =  2 * [ y + y^3/3 + y^5/5 + y^7/7 ... ]

                For numbers not near '1' :

                For < 360 decimal places the following iteration is used :

                X     =  X - 2 * [ exp(X) - N ] / [ exp(X) + N ]
                 n+1

                This is a cubically convergent algorithm
                (each iteration yields 3X more digits).

                For >= 360 decimal places, use the following algorithm
                to compute log(N) :

                Let 'X' be 'close' to the solution. Use the cubically
                convergent algorithm above to get 110 decimal places.

                Let Y = ( N * exp(-X) ) - 1

                Then :

                log(N) = X + log(1 + Y)

                Since 'Y' will be small, we can use the efficient
                'log_near_1' algorithm from above.
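
                The cubically convergent iteration above can be sketched
                in ordinary double-precision C. This is an illustration
                only (the input 10 and the crude first guess are
                arbitrary); MAPM runs the same iteration with its
                arbitrary-precision exp :

                    #include <stdio.h>
                    #include <math.h>

                    /* Each pass of  x <- x - 2*(exp(x) - n)/(exp(x) + n)
                     * roughly triples the number of correct digits of
                     * log(n), so very few passes are ever needed.       */
                    int main(void)
                    {
                        double n = 10.0;
                        double x = 2.0;     /* crude first guess at log(10) */
                        int    i;

                        for (i = 1; i <= 4; i++)
                        {
                            double e = exp(x);
                            x = x - 2.0 * (e - n) / (e + n);
                            printf("pass %d: x = %.15f\n", i, x);
                        }
                        printf("library log(10) = %.15f\n", log(10.0));
                        return 0;
                    }
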
LOG10:          Calls the LOG function. The formula used is :

                log10(x) = A * log(x)     where  A = log10(e)  [0.43429448190...]

SIN:            The sin function uses the traditional series expansion,
                though not right away. Two transformations are performed
                first and the series expansion is done in the third step.
                The reason for the manipulations is to minimize the
                magnitude of the number passed to the series expansion.
                The smaller that number is, the faster the series will
                converge to the precision desired.

                Step 1: Limit the input argument to +/- PI.

                Step 2: Use the multiple angle identity for sin(5x).
                        We will use this recursively 3 times, i.e.
                        we will multiply the input angle by 1 / (5*5*5).

                        sin (5x) == 16 * sin^5 (x) - 20 * sin^3 (x) + 5 * sin(x)

                        [ Step 2 yields a worst case |x| = ~0.0251 (PI / 125) ]

                Step 3: Traditional series expansion :

                        sin(x) == x - x^3/3! + x^5/5! - x^7/7! + x^9/9! ...

COS:            The cos function is very similar to sin.

                Step 1: Limit the input argument to +/- PI.

                Step 2: Use the multiple angle identity for cos(4x).
                        We will use this recursively 3 OR 4 times, i.e.
                        we will multiply the input angle by 1 / (4 ^ 3)
                        OR 1 / (4 ^ 4). If |x| is < 1 radian, we will
                        recurse 3 times. If |x| >= 1 radian, we will
                        recurse 4 times.

                        cos (4x) == 8 * [ cos^4 (x) - cos^2 (x) ] + 1

                        [ Step 2 yields a worst case |x| = ~0.0156 ]
                        [ (1 / 64) > (PI / 256) ]

                Step 3: Traditional series expansion :

                        cos(x) == 1 - x^2/2! + x^4/4! - x^6/6! + x^8/8! ...
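
                To make the sin(5x) reduction described under SIN above
                concrete, here is a sketch in ordinary double-precision C.
                It is an illustration only, with a short fixed-length
                series and an arbitrary test angle; the helper name
                sin_series exists only in this sketch, and MAPM carries
                its series out to the requested precision :

                    #include <stdio.h>
                    #include <math.h>

                    /* Taylor series through the x^7 term -- plenty for the
                     * reduced argument, which is at most PI/125 here.     */
                    static double sin_series(double x)
                    {
                        double x2 = x * x;
                        return x * (1.0 - x2/6.0 * (1.0 - x2/20.0 * (1.0 - x2/42.0)));
                    }

                    int main(void)
                    {
                        double x = 2.5;          /* assume already in [-PI, PI] */
                        double s = sin_series(x / 125.0);
                        int    i;

                        for (i = 0; i < 3; i++)  /* undo the divide-by-5, 3 times */
                        {
                            double s2 = s * s;
                            s = s * (5.0 + s2 * (16.0 * s2 - 20.0));
                        }
                        printf("reduced sin(2.5) = %.15f\n", s);
                        printf("library sin(2.5) = %.15f\n", sin(x));
                        return 0;
                    }
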
TAN:            Compute cos(x), then compute sin(x) = sqrt(1 - cos(x) ^ 2).

                Finally, tan(x) = sin(x) / cos(x).

                Note that we need to keep track of the sign for the
                'sin' after the sqrt since that result will always be
                positive.

ARC_SIN:        Since the 'arc' family of functions all converge very
                slowly, we will call the sin/cos functions and iterate
                using Newton's method. We actually just call the cos
                function and determine the sin value, as was done for TAN.
                The following iteration is used :

                X     =  X - [ sin(X) - N ] / cos(X)
                 n+1

                If we need the arcsin(x) and the magnitude of
                'x' is > 0.85, use the following identities :

                For x > 0 :  arcsin(x) =  arccos(sqrt(1 - x^2))
                For x < 0 :  arcsin(x) = -arccos(sqrt(1 - x^2))

                If the magnitude of 'x' is close to 0 (< 1.0E-4),
                use the following identity together with the
                arctan-near-0 series :

                arcsin (x) == arctan [ x / sqrt(1 - x^2) ]

ARC_COS:        The arccos function is similar to the arcsin for the same
                reasons. The following iteration is used :

                X     =  X + [ cos(X) - N ] / sin(X)
                 n+1

                If we need the arccos(x) and the magnitude of
                'x' is > 0.85, use the following identities :

                For x > 0 :  arccos(x) = arcsin(sqrt(1 - x^2))
                For x < 0 :  arccos(x) = PI - arcsin(sqrt(1 - x^2))

                If the magnitude of 'x' is close to 0 (< 1.0E-4),
                use the following identity together with the
                arcsin-near-0 treatment :

                arccos (x) == PI / 2 - arcsin (x)

ARC_TAN:        Use the identity :

                arctan (x) == arcsin [ x / sqrt(1 + x^2) ]

                If the magnitude of 'x' is large (> 1000), then use :

                arctan(x) = (PI / 2) - arctan(1 / |x|)

                and sign of result = sign of original input.

                If the magnitude of 'x' is close to 0 (< 1.0E-4),
                use the following series :

                arctan (x) == x - x^3/3 + x^5/5 - x^7/7 + x^9/9 ...

ARC_TAN2:       Handle the special cases where x and/or y == 0.
                Call the arctan function with (y / x) as the argument.
                Adjust the returned value based on the signs of y and x.
                (A small sketch of this bookkeeping appears at the end
                of this file.)

HYPERBOLIC FUNCTIONS:   Use the following identities :

                SINH(x) == 0.5 * [ exp(x) - exp(-x) ]

                COSH(x) == 0.5 * [ exp(x) + exp(-x) ]

                TANH(x) == [ exp(x) - exp(-x) ] / [ exp(x) + exp(-x) ]

                ARCSINH(x) == log [ x + sqrt(x^2 + 1) ]    ( also use asinh(-x) = -asinh(x) )

                ARCCOSH(x) == log [ x + sqrt(x^2 - 1) ]    ( x >= 1.0 )

                ARCTANH(x) == 0.5 * log [ (1 + x) / (1 - x) ]    ( |x| < 1.0 )

============================================================================
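
                A small sketch of the ARC_TAN2 bookkeeping described above,
                in ordinary double-precision C. It is an illustration only
                (the function name my_atan2 exists only here, and returning
                0 for the (0, 0) input is just this sketch's choice); MAPM
                performs the sign and special-case handling with its own
                arbitrary-precision arctan :

                    #include <stdio.h>
                    #include <math.h>

                    #define PI 3.14159265358979323846

                    /* Quadrant logic for arctan2 built on a one-argument
                     * arctan: handle x == 0 and y == 0 first, then call
                     * arctan(y / x) and adjust by +/- PI when x < 0.    */
                    static double my_atan2(double y, double x)
                    {
                        if (x == 0.0)      /* on the y axis */
                            return (y > 0.0) ?  PI / 2.0
                                 : (y < 0.0) ? -PI / 2.0 : 0.0;

                        if (y == 0.0)      /* on the x axis */
                            return (x > 0.0) ? 0.0 : PI;

                        if (x > 0.0)       /* quadrants I and IV */
                            return atan(y / x);

                        return (y > 0.0) ? atan(y / x) + PI   /* quadrant II  */
                                         : atan(y / x) - PI;  /* quadrant III */
                    }

                    int main(void)
                    {
                        printf("%.10f  %.10f\n",
                               my_atan2( 1.0, -1.0), atan2( 1.0, -1.0));
                        printf("%.10f  %.10f\n",
                               my_atan2(-1.0, -1.0), atan2(-1.0, -1.0));
                        return 0;
                    }

============================================================================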