"Fossies" - the Fresh Open Source Software Archive 
Member "mapm_4.9.5a/DOCS/algorithms.used" (21 Feb 2010, 11900 Bytes) of package /linux/misc/old/mapm-4.9.5a.tar.gz:
As a special service "Fossies" has tried to format the requested text file into HTML format (style:
standard) with prefixed line numbers.
Alternatively you can here
view or
download the uninterpreted source code file.

============================================================================
This file describes the algorithms used in MAPM.            July 24, 2003
============================================================================
(If this file doesn't look right, set your tab-stops to 8.)
============================================================================

ADD:            There is nothing really interesting about the ADD
                function. If the signs of the numbers are the same,
                continue with the add routine. If they are opposite,
                change one of them and call the subtract function.
                In order to perform the add, the number with the
                smaller exponent is scaled so the exponents match,
                then the byte-wise add is performed.

SUBTRACT:       There is nothing really interesting about the SUBTRACT
                function either. If the signs of the numbers are the
                same, continue with the subtract routine. If they are
                opposite, change one of them and call the add function.
                In order to perform the subtraction, the number with
                the smaller exponent is scaled so the exponents match,
                then the byte-wise subtract is performed.

MULTIPLY:       Three multiply functions are used:

                The multiply function uses the simple O(n^2) model for
                small input numbers.

                It uses an FFT based multiplication next. The FFT used
                is from Takuya OOURA (email: ooura@mmm.t.u-tokyo.ac.jp).
                This method results in only O(N * log2(N)) growth. This
                method is used up to approx 512K digits.

                For numbers > 512K digits, the following algorithm is
                used (sometimes called the divide-and-conquer
                technique). When the divide-and-conquer has divided
                down to 512K digits, the FFT algorithm is used to
                finish up.

                The divide-and-conquer method results in approx
                O(N^1.585) growth.

                Assume we have 2 numbers (a & b) with 2N digits.

                Let:  a = (2^N) * A1 + A0 ,  b = (2^N) * B1 + B0

                where 'A1' is the 'most significant half' of 'a' and
                'A0' is the 'least significant half' of 'a'. Same for
                B1 and B0.

                Now use the identity:

                ab = (2^(2N) + 2^N) * A1*B1
                     + 2^N * (A1 - A0) * (B0 - B1)
                     + (2^N + 1) * A0*B0

                The original problem of multiplying 2 (2N) digit
                numbers has been reduced to 3 multiplications of N
                digit numbers plus some additions, subtractions,
                and shifts.

                This multiplication algorithm is used recursively.
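                As an illustration, the identity above can be sketched
                in Python on machine integers. This is a hypothetical
                toy version: MAPM splits arrays of base-100 digits,
                while this sketch splits at a binary digit boundary,
                and the 2^64 cutoff is an arbitrary stand-in for the
                FFT handoff.

```python
def karatsuba(a, b):
    """Divide-and-conquer multiply using the identity:
    ab = (2^(2N) + 2^N)*A1*B1 + 2^N*(A1-A0)*(B0-B1) + (2^N + 1)*A0*B0"""
    sign = 1
    if a < 0: a, sign = -a, -sign
    if b < 0: b, sign = -b, -sign
    if a < (1 << 64) or b < (1 << 64):   # small: plain multiply
        return sign * (a * b)
    n = (max(a.bit_length(), b.bit_length()) + 1) // 2
    mask = (1 << n) - 1
    a1, a0 = a >> n, a & mask            # most / least significant halves
    b1, b0 = b >> n, b & mask
    hi  = karatsuba(a1, b1)              # A1*B1
    lo  = karatsuba(a0, b0)              # A0*B0
    mid = karatsuba(a1 - a0, b0 - b1)    # (A1-A0)*(B0-B1)
    return sign * ((hi << (2 * n)) + ((hi + mid + lo) << n) + lo)
```

                Note that only three recursive multiplies are issued
                per level; the shifts and adds are the cheap part.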

RECIPROCAL:     Uses Newton's method with the following iteration:

                X[n+1] = X[n] * (2 - N * X[n])
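                A double-precision sketch of the iteration (the
                power-of-two seed is an illustrative choice; MAPM
                starts from its own low-precision estimate and widens
                the working precision as the iteration converges):

```python
import math

def reciprocal(n, iters=6):
    """Newton iteration X[n+1] = X[n] * (2 - N * X[n]) toward 1/N.
    Each pass roughly doubles the number of correct digits."""
    if n == 0:
        raise ZeroDivisionError("reciprocal of zero")
    sign = -1.0 if n < 0 else 1.0
    n = abs(n)
    m, e = math.frexp(n)          # n = m * 2**e with 0.5 <= m < 1
    x = math.ldexp(1.0, -e)       # seed: 1/2**e, so N*X starts in [0.5, 1)
    for _ in range(iters):
        x = x * (2.0 - n * x)     # the error squares on every iteration
    return sign * x
```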

DIVIDE:         Two division functions are used:

                For dividing numbers < 250 digits:

                Used Knuth's division algorithm from 'The Art of
                Computer Programming, Volume 2' with a slight
                modification. I determine right in step D3 whether
                q-hat is too big, so step D6 is unnecessary. I use the
                first 3 (base 100) digits of the numerator and the
                first 2 digits of the denominator to determine our
                trial divisor, instead of Knuth's 2 & 1.

                For dividing numbers >= 250 digits:

                Use the iterative reciprocal function from above and
                then multiply.


                NOTE: There is a subtle operational difference between
                these two division functions. The first one truncates
                the divide process at the specified digit (the result
                is not *rounded*). The second function, using the
                reciprocal, returns a rounded result. For example,
                when dividing (2 / 3),

                the first function will return  --> 0.66.......66
                the second function will return --> 0.66.......67

                The first function is used for the integer_divide
                function (see below); correct results require that the
                result remain un-rounded. It is also used for < 250
                digits because that algorithm is faster for fewer
                digits.

INTEGER_DIVIDE: Calls Knuth's divide function with a few extra decimal
                places of precision and then truncates the result to
                an integer.

FACTORIAL:      Computes a series of partial products so the number of
                significant digits is nearly the same for all partial
                products and is also near a power of 2. This is the
                optimum setup for the fast multiplication algorithm.
                See mapmfact.c for all the details.
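                The general idea can be sketched as a balanced
                binary-splitting product (a sketch only; the exact
                grouping used in mapmfact.c differs):

```python
def balanced_factorial(n):
    """Factorial via balanced partial products: split the range in half
    recursively so the two factors of every multiply have roughly the
    same number of digits -- the shape fast multiplication likes best."""
    def prod(lo, hi):
        if hi - lo < 8:                  # small range: multiply directly
            result = 1
            for k in range(lo, hi + 1):
                result *= k
            return result
        mid = (lo + hi) // 2
        return prod(lo, mid) * prod(mid + 1, hi)
    return prod(2, n) if n > 1 else 1
```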

RANDOM NUMBERS: Used Knuth's The Art of Computer Programming, Volume 2
                as the basis. Assuming the random number is X, compute
                (using all integer math):

                X = (a * X + c) MOD m

                From Knuth:

                'm' should be large, at least 2^30 : we use 1.0E+15

                'a' should be between .01m and .99m and not have a
                simple pattern. 'a' should not have any large factors
                in common with 'm', and (since 'm' is a power of 10)
                if 'a' MOD 200 = 21 then all 'm' different possible
                values will be generated before 'X' starts to repeat.

                We use 'a' = 716805947629621.

                'a' MOD 200 = 21 and 'a' is also a prime number. The
                file mapm_rnd.c contains many potential multipliers,
                commented out, that are all prime and meet
                'a' MOD 200 = 21.

                There are few restrictions on 'c' except that 'c' can
                have no factor in common with 'm'; hence we set
                'c' = 'a'.

                On the first call, the system time is used to
                initialize X.
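                The recurrence with these constants is a one-liner
                (the function name lcg_step is mine, not MAPM's):

```python
A = 716805947629621     # multiplier from the text: prime, A mod 200 == 21
M = 10 ** 15            # modulus 1.0E+15
C = A                   # increment shares no factor with M

def lcg_step(x):
    """One step of the linear congruential generator X = (a*X + c) MOD m."""
    return (A * x + C) % M
```

                With 'm' a power of 10 and 'a' MOD 200 = 21, the
                generator visits all 'm' values before repeating.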

SQUARE ROOT:    Uses Newton's method with the following iteration
                (which really finds 1/sqrt(N); multiply by N when
                complete to get the final sqrt):

                X[n+1] = 0.5 * X[n] * (3 - N * X[n]^2)

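                A double-precision sketch of the same scheme (the
                power-of-two seed is an illustrative choice, close
                enough for the iteration to converge):

```python
import math

def mapm_sqrt(n, iters=8):
    """Iterate X[n+1] = 0.5 * X[n] * (3 - N * X[n]^2) toward 1/sqrt(N),
    then multiply by N to recover sqrt(N)."""
    if n < 0:
        raise ValueError("negative input")
    if n == 0:
        return 0.0
    m, e = math.frexp(n)                      # n = m * 2**e
    x = math.ldexp(1.0, -((e + 1) // 2))      # rough guess for 1/sqrt(n)
    for _ in range(iters):
        x = 0.5 * x * (3.0 - n * x * x)       # quadratic convergence
    return n * x
```

                Note the loop body contains no division, which is the
                point of iterating on 1/sqrt(N) instead of sqrt(N).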

CUBE ROOT:      Uses Newton's method with the following iteration
                (which really finds 1/cbrt(N); to find the final
                cbrt, compute N * X^2):

                X[n+1] = (4 * X[n] - N * X[n]^4) / 3

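                Sketched in double precision (seed choice is again
                mine; any modest under-estimate of 1/cbrt(N) converges
                monotonically):

```python
import math

def mapm_cbrt(n, iters=12):
    """Iterate X[n+1] = (4 * X[n] - N * X[n]^4) / 3 toward 1/cbrt(N),
    then N * X^2 gives cbrt(N)."""
    if n == 0:
        return 0.0
    sign = -1.0 if n < 0 else 1.0
    n = abs(n)
    m, e = math.frexp(n)                      # n = m * 2**e
    x = math.ldexp(1.0, -(e // 3) - 1)        # rough guess for 1/cbrt(n)
    for _ in range(iters):
        x = (4.0 * x - n * x ** 4) / 3.0
    return sign * n * x * x
```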

EXP:            The exp function uses a modification of the Taylor
                series. The algorithm was borrowed from David H.
                Bailey's 'MPFUN' package, a multiple precision
                floating point package for Fortran. Mr. Bailey's
                description for EXP follows verbatim:
                "
                Exp (t) = (1 + r + r^2 / 2! + r^3 / 3! + r^4 / 4! ...) ^ q * 2 ^ n

                where q = 256, r = t' / q, t' = t - n Log(2) and where n is chosen so
                that -0.5 Log(2) < t' <= 0.5 Log(2). Reducing t mod Log(2) and
                dividing by 256 insures that -0.001 < r <= 0.001, which accelerates
                convergence in the above series.
                "

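                The reduction translates directly to double precision
                (a sketch under the quoted scheme; the term count is
                sized for doubles, not for arbitrary precision):

```python
import math

def mapm_exp(t, q=256, terms=12):
    """Bailey-style reduction: pick n so that t' = t - n*log(2) lies in
    (-0.5*log(2), 0.5*log(2)], set r = t'/q, sum the Taylor series in
    the tiny r, then undo the reduction with ()**q * 2**n."""
    n = round(t / math.log(2.0))
    r = (t - n * math.log(2.0)) / q
    s, term = 1.0, 1.0
    for k in range(1, terms):
        term *= r / k                 # term = r^k / k!
        s += term
    return s ** q * 2.0 ** n
```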

POW:            Calls the EXP function. The formula used is:

                X^Y = e^A   where   A = Y * log(X)

INTEGER POW:    Computes X^N when N is an integer, using Knuth's
                algorithm from The Art of Computer Programming,
                Volume 2.

                A1 : Y = 1, Z = X
                A2 : is N even/odd? and then N = N / 2 (truncated)
                     if N was even, goto step A5
                A3 : Y = Z * Y
                A4 : if N == 0, terminate with Y as the answer
                A5 : Z = Z * Z
                A6 : goto A2

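                Steps A1-A6 transcribe directly to Python (for N >= 1):

```python
def int_pow(x, n):
    """Knuth's binary powering, following steps A1-A6 above."""
    y, z = 1, x                     # A1
    while True:
        n, was_odd = n // 2, n % 2  # A2: halve N, remembering even/odd
        if was_odd:
            y = z * y               # A3
            if n == 0:              # A4
                return y
        z = z * z                   # A5  (A6: loop back to A2)
```

                Only about log2(N) squarings are performed instead of
                N - 1 multiplies.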

LOG:            If the input is very close to '1', use a series
                expansion. Calculate log(1 + x) with the following
                series:

                y = x / (x + 2)      ( |y| < 1 )

                log[ (1 + y) / (1 - y) ] = 2 * [ y + y^3/3 + y^5/5 + y^7/7 ... ]

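                Since (1 + y)/(1 - y) with y = x/(x + 2) simplifies to
                exactly 1 + x, the series computes log(1 + x). A
                double-precision sketch (term count sized for doubles):

```python
import math

def log_near_1(x, terms=25):
    """log(1 + x) via y = x/(x + 2) and the odd series
    log((1+y)/(1-y)) = 2*(y + y^3/3 + y^5/5 + ...)."""
    y = x / (x + 2.0)
    y2 = y * y
    s, p = 0.0, y
    for k in range(terms):
        s += p / (2 * k + 1)   # add y^(2k+1) / (2k+1)
        p *= y2
    return 2.0 * s
```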

                For numbers not near '1':

                For < 360 decimal places the following iteration is
                used:

                X[n+1] = X[n] - 2 * (exp(X[n]) - N) / (exp(X[n]) + N)

                This is a cubically convergent algorithm
                (each iteration yields 3X more digits).
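                A double-precision sketch of the iteration (x0 is any
                rough guess at log(N); MAPM bootstraps its own
                low-precision estimate):

```python
import math

def log_iterate(n, x0, iters=8):
    """Cubically convergent iteration for log(N):
    X[n+1] = X[n] - 2*(exp(X[n]) - N)/(exp(X[n]) + N)."""
    x = x0
    for _ in range(iters):
        e = math.exp(x)
        x = x - 2.0 * (e - n) / (e + n)
    return x
```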

                For >= 360 decimal places, use the following algorithm
                to compute log(N):

                Let 'X' be 'close' to the solution. Use the cubically
                convergent algorithm above to get 110 decimal places.

                Let Y = ( N * exp(-X) ) - 1

                Then:

                log(N) = X + log(1 + Y)

                Since 'Y' will be small, we can use the efficient
                'log_near_1' algorithm from above.


LOG10:          Calls the LOG function. The formula used is:

                log10(x) = A * log(x)   where   A = log10(e)   [0.43429448190...]

SIN:            The sin function uses the traditional series expansion,
                though not right away. Two translations are performed
                first and the series expansion is done in the third
                step. The reason for the manipulations is to minimize
                the magnitude of the number passed to the series
                expansion. The smaller that number is, the faster the
                series will converge to the precision desired.

                Step 1: Limit the input argument to +/- PI.

                Step 2: Use the multiple angle identity for sin(5x),
                        applied recursively 3 times, i.e. multiply
                        the input angle by 1 / (5*5*5):

                        sin(5x) == 16 * sin^5(x) - 20 * sin^3(x) + 5 * sin(x)

                        [ Step 2 yields a worst case |x| = ~0.0251 (PI / 125) ]

                Step 3: Traditional series expansion:

                        sin(x) == x - x^3/3! + x^5/5! - x^7/7! + x^9/9! ...

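                The three steps above can be sketched in double
                precision (math.remainder stands in for MAPM's
                arbitrary-precision PI reduction):

```python
import math

def mapm_sin(x, terms=10):
    """Reduce to [-pi, pi], scale the angle by 1/5^3, run the Taylor
    series on the small angle, then apply sin(5x) three times."""
    x = math.remainder(x, 2.0 * math.pi)      # step 1: |x| <= pi
    r = x / 125.0                             # step 2: |r| <= ~0.0251
    s, term = r, r                            # step 3: Taylor series
    for k in range(1, terms):
        term *= -r * r / ((2 * k) * (2 * k + 1))
        s += term
    for _ in range(3):   # undo step 2: sin(5x) = 16s^5 - 20s^3 + 5s
        s = s * (5.0 + s * s * (16.0 * s * s - 20.0))
    return s
```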

COS:            The cos function is very similar to sin.

                Step 1: Limit the input argument to +/- PI.

                Step 2: Use the multiple angle identity for cos(4x),
                        applied recursively 3 OR 4 times, i.e. multiply
                        the input angle by 1 / (4^3) OR 1 / (4^4).
                        If |x| is < 1 radian, recurse 3 times; if
                        |x| >= 1 radian, recurse 4 times.

                        cos(4x) == 8 * [ cos^4(x) - cos^2(x) ] + 1

                        [ Step 2 yields a worst case |x| = ~0.0156 ]
                        [ (1 / 64) > (PI / 256) ]

                Step 3: Traditional series expansion:

                        cos(x) == 1 - x^2/2! + x^4/4! - x^6/6! + x^8/8! ...

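                The same plan in double precision (again with
                math.remainder standing in for the PI reduction):

```python
import math

def mapm_cos(x, terms=10):
    """Reduce to [-pi, pi], scale by 1/4^k with k = 3 (|x| < 1) or
    k = 4, run the series, then apply cos(4x) exactly k times."""
    x = math.remainder(x, 2.0 * math.pi)      # step 1: |x| <= pi
    k = 3 if abs(x) < 1.0 else 4
    r = x / 4.0 ** k                          # step 2
    s, term = 1.0, 1.0                        # step 3: Taylor series
    for j in range(1, terms):
        term *= -r * r / ((2 * j - 1) * (2 * j))
        s += term
    for _ in range(k):   # undo step 2: cos(4x) = 8*(c^4 - c^2) + 1
        c2 = s * s
        s = 8.0 * (c2 * c2 - c2) + 1.0
    return s
```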

TAN:            Compute cos(x), then compute sin(x) = sqrt(1 - cos(x)^2).

                Finally, tan(x) = sin(x) / cos(x).

                Note that we need to keep track of the sign of the
                'sin' after the sqrt, since that result will always
                be positive.

ARC_SIN:        Since the 'arc' family of functions all converge very
                slowly, we call the sin/cos functions and iterate
                using Newton's method. We actually just call the cos
                function and determine the sin value, as was done for
                TAN. The following iteration is used:

                X[n+1] = X[n] - (sin(X[n]) - N) / cos(X[n])

                If we need arcsin(x) and the magnitude of 'x'
                is > 0.85, use the following identities:

                For x > 0 : arcsin(x) =  arccos(sqrt(1 - x^2))
                For x < 0 : arcsin(x) = -arccos(sqrt(1 - x^2))

                If the magnitude of 'x' is close to 0 (< 1.0E-4),
                use the following identity using arc-tan-near-0:

                arcsin(x) == arctan[ x / sqrt(1 - x^2) ]

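                The Newton step is short enough to show directly
                (a double-precision sketch; the seed X = N and the
                iteration count are my choices, and only the
                |N| <= 0.85 path from the text is covered):

```python
import math

def mapm_asin(n, iters=25):
    """Newton iteration X[n+1] = X[n] - (sin(X[n]) - N)/cos(X[n]),
    seeded with N itself.  Larger magnitudes (|N| > 0.85) would go
    through the arccos identities described in the text."""
    if abs(n) > 0.85:
        raise ValueError("use the arccos identities for |n| > 0.85")
    x = n
    for _ in range(iters):
        x -= (math.sin(x) - n) / math.cos(x)
    return x
```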

ARC_COS:        The arccos function is similar to arcsin, for the same
                reasons. The following iteration is used:

                X[n+1] = X[n] + (cos(X[n]) - N) / sin(X[n])

                If we need arccos(x) and the magnitude of 'x'
                is > 0.85, use the following identities:

                For x > 0 : arccos(x) = arcsin(sqrt(1 - x^2))
                For x < 0 : arccos(x) = PI - arcsin(sqrt(1 - x^2))

                If the magnitude of 'x' is close to 0 (< 1.0E-4),
                use the following identity using arc-sin-near-0:

                arccos(x) == PI / 2 - arcsin(x)


ARC_TAN:        Use the identity:

                arctan(x) == arcsin[ x / sqrt(1 + x^2) ]

                If the magnitude of 'x' is large (> 1000), then use:

                arctan(x) = (PI / 2) - arctan(1 / |x|)

                where the sign of the result = the sign of the
                original input.

                If the magnitude of 'x' is close to 0 (< 1.0E-4),
                use the following series:

                arctan(x) = x - x^3/3 + x^5/5 - x^7/7 + x^9/9 ...


ARC_TAN2:       Handle the special cases where x and/or y == 0.
                Otherwise, call the arctan function with (y / x) as
                the argument, then adjust the returned value based on
                the signs of y and x.


HYPERBOLIC FUNCTIONS: Use the following identities:

                SINH(x)    == 0.5 * [ exp(x) - exp(-x) ]

                COSH(x)    == 0.5 * [ exp(x) + exp(-x) ]

                TANH(x)    == [ exp(x) - exp(-x) ] / [ exp(x) + exp(-x) ]

                ARCSINH(x) == log[ x + sqrt(x^2 + 1) ]         ( also use asinh(-x) = -asinh(x) )

                ARCCOSH(x) == log[ x + sqrt(x^2 - 1) ]         ( x >= 1.0 )

                ARCTANH(x) == 0.5 * log[ (1 + x) / (1 - x) ]   ( |x| < 1.0 )
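                These identities translate directly; a sketch using
                the hardware exp/log/sqrt in place of MAPM's
                arbitrary-precision versions:

```python
import math

def mapm_sinh(x):
    return 0.5 * (math.exp(x) - math.exp(-x))

def mapm_cosh(x):
    return 0.5 * (math.exp(x) + math.exp(-x))

def mapm_tanh(x):
    ep, em = math.exp(x), math.exp(-x)
    return (ep - em) / (ep + em)

def mapm_asinh(x):
    # use asinh(-x) = -asinh(x) so the log argument never cancels badly
    s = math.log(abs(x) + math.sqrt(x * x + 1.0))
    return -s if x < 0 else s

def mapm_acosh(x):           # requires x >= 1.0
    return math.log(x + math.sqrt(x * x - 1.0))

def mapm_atanh(x):           # requires |x| < 1.0
    return 0.5 * math.log((1.0 + x) / (1.0 - x))
```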

============================================================================