As a special service "Fossies" has tried to format the requested text file into HTML format (style: standard) with prefixed line numbers.
Alternatively you can here view or download the uninterpreted source code file.

1 Chapter 11 2 3 TIME SERIES ANALYSIS 4 5 Summary 6 7 Time series functions support the use of models 8 in practical applications requiring analysis 9 and/or forecasting of series. The basic 10 time-domain model is a series of the 11 autoregressive moving average (ARMA) type. The 12 software can successfully deal with 13 generalizations, such as integrated (ARIMA) 14 models and factor models. Specific areas covered 15 are: 16 17 o Generation and Prediction of ARMA Time Series 18 o Estimation of ARIMA Model Parameters 19 o Estimation of Factor Model Parameters 20 o Model Identification and Evaluation 21 o Time Series Utility Operations 22 23 Factor models are a powerful generalization of 24 the Box-Jenkins approach. They support the 25 extraction of meaningful quantitative measures 26 from series with highly correlated noise. 27 28 ------------------------------------------------------------------------------- 29 30 Notes on Contents 31 32 The functions of this library segment are used for the analysis and 33 estimation of Box Jenkins time series models. These models have been 34 applied to a wide range of problems in forecasting, process control, and 35 the characterization of time series. Both standard ARIMA models and an 36 important generalization to factor models can be identified and estimated. 37 38 o Generation and Prediction of ARMA Time Series: 39 40 sarma -------- generate successive terms in an ARMA series. 41 parma -------- predict the values of an ARMA time series. 42 43 44 o Estimate the Parameters of an ARIMA Model: 45 46 evmod -------- compute the residual of an ARIMA series. 47 drmod -------- compute the model residual and the derivatives 48 with respect to model parameters. 49 seqts -------- perform a sequential update of model parameters. 50 fixts -------- perform a batch mode, Gauss-Newton update of 51 the ARIMA model parameters. 52 53 54 o Estimate the Parameters of a Factor Model: 55 56 evfmod ------- compute the residual in a time series Factor model. 57 drfmod ------- compute the model residual and the derivatives 58 with respect to model parameters for a Factor model. 59 seqtsf ------- perform a sequential update of Factor model parameters. 60 fixtsf ------- compute a batch mode, Gauss-Newton, update of Factor 61 model parameters. 62 63 64 o Model Identification and Evaluation: 65 66 sany -------- compute both direct and inverse autocorrelation 67 coefficients of a time series. 68 resid ------- analyze the residuals of a time series model. 69 70 The residuals of a model should be consistent with the assumption that 71 the series is driven by "white" noise. Several tests of this hypothesis are 72 applied in the resid() function. 73 74 75 o Time Series Utility Operations: 76 77 xmean ------- subtract the mean value of an input series from 78 each term. 79 sdiff ------- apply iterations of the sequential difference operator 80 to a series. 81 sintg ------- invert the sequential difference operation (ie. 82 integrate the series). 83 84 ------------------------------------------------------------------------------- 85 86 General Technical Comments 87 88 The fundamental ARMA model underlying such applications is a stationary 89 stochastic process driven by uncorrelated "white" noise. The model equation 90 takes the form 91 92 x[i] = e[i] + Sum(j=1 to m) a[j]*x[i-la[j]] - 93 Sum(k=1 to n) b[k]*e[k-lm[k]] , 94 95 where x[i] are the series values and e[i] are the values of the driving 96 noise. Nonstationary series can be addressed by employing differenced series, 97 with 98 99 x[i] = D^d{ y[i] }, D is the difference operator 100 D{ y[i] } = y[i] - y[i-1] , 101 102 and d is the order of difference. Models with d>0 are referred to as ARIMA 103 models, with the I standing for integrated. 104 105 Factor models provide a powerful generalization of the time series model 106 in which a series has both a value y[i] and a condition code c(i) associated 107 with the "time" index i. The basic model equation takes the form 108 109 y[i] = f[c[i]] + v[i] , 110 111 where v[i] is an ARIMA model series. The condition index c[i] specifies a 112 factor parameter f, which represents a mean value corresponding to the coded 113 condition. This condition concept is quite general. The index c[i] can be 114 used to distinguish physical conditions at different times, effects periodic 115 in time, or epochs separated by a change in the environment. Factor models 116 are used to estimate a quantitative measure of the effect of changing 117 conditions from highly correlated input series. 118 119 The estimation process for both ARIMA and Factor models employs a mix of 120 sequential and batch mode passes through the observed series. This 121 combination combines the powerful search capabilities of the sequential 122 technique with the refined estimates produced by using good initial 123 parameters in Gauss-Newton processing. Sequential estimation can also be used 124 to develop adaptive models. 125 126 ------------------------------------------------------------------------------- 127 FUNCTION SYNOPSES 128 ------------------------------------------------------------------------------- 129 130 External Variables for ARMA Models: 131 _______________________________________________________________________________ 132 133 The header files arma.h and ccmath.h contain definitions of the basic 134 model specification structure used in the time series functions. One of 135 these header files must be included. 136 137 arma.h 138 139 #ifndef NULL 140 #define NULL 0L 141 #endif 142 struct mcof {double cf; int lag;}; 143 144 The time series model's structure is specified in external arrays for 145 both autoregressive (par) and moving average (pma) parameters. Each element 146 of these structure arrays contains a model parameter (p->cf) and the lag 147 (p->lag) at which it acts. The external variables required for model 148 generation and estimation are listed below. Their declarations must appear 149 in any program that uses the ARMA model evaluation or estimation functions. 150 151 struct mcof *par,*pma; 152 int nar,nma,np; 153 154 par = pointer to store of structure array of autoregressive 155 parameters and lags (dimension=nar) 156 pma = pointer to store of structure array of moving average 157 parameters and lags (dimension=nma) 158 nar = number of autoregressive parameters 159 nma = number of moving average parameters 160 np = total number of parameters (np=nar+nma) 161 162 The estimation functions assume that parameters are stored in a single 163 structure array. Allocation of this storage is illustrated by the the 164 following sequence. 165 166 np=nar+nma; 167 par=(struct mcof *)calloc(np,sizeof(*par)); 168 pma=par+nar; 169 170 ------------------------------------------------------------------------------- 171 172 ARMA Model Generation and Prediction: 173 ------------------------------------------------------------------------------- 174 175 sarma 176 177 Generate successive terms in an autoregressive moving average (ARMA) 178 series. 179 180 double sarma(double er) 181 double er; 182 er = value of the driving error input 183 return value: y = current term of series 184 185 The model is generated by 186 187 y[k] = er[k] + Sum(i=0 to nar-1){ par[i]->cf*y[k-par[i]->lag]} 188 - Sum(j=0 to nma-1){ pma[j]->cf*er[k-pma[j]->lag]} . 189 190 The simulation must be initialized by a call of setsim. 191 192 setsim 193 194 void setsim(int k) 195 k = control flag, with 196 k=1 -> initialize simulation storage 197 k=0 -> free the storage allocated by setsim(1) 198 199 200 The storage initialized by setsim contains no history. 201 Lagged values of er and y are initially zero. The 202 generator is normally run through a number of initial 203 steps before storing simulated output. This serves to 204 eliminate transient startup effects. 205 206 ----------------------------------------------------------------- 207 208 parma 209 210 Predict values of an ARMA time series. 211 212 double parma(double *x,double *e) 213 x = pointer to storage of current series value 214 e = pointer to storage of current residual 215 return: xp = predicted value of series 216 217 218 Each call of parma generates a single step prediction. 219 Multi-step forward predictions are generated by multiple 220 calls with zeros loaded in the current residual. The 221 following code sequence generates a k-step prediction. 222 223 for(i=0; i<k ;++i){ 224 *(x+i+1)=parma(x+i,e+i,m); 225 *(e+i+1)=0.; 226 } 227 228 This loop alters "future" values of both the series 229 and the residuals. The starting point (x,e) must be 230 retained to support repeated predictions. 231 232 ------------------------------------------------------------------------------- 233 234 ARMA Model Evaluation: 235 ------------------------------------------------------------------------------- 236 237 evmod 238 239 Compute the residual in an ARMA model. 240 241 double evmod(double y) 242 y = observed value of time series 243 return value: ee = y-yp, where yp is the value predicted by the model 244 245 246 The model prediction yp is based on the lagged values of the 247 observations and estimated errors computed by evmod. 248 249 yp[k] = Sum(i=0 to nar-1) {par[i]->cf*y[k-par[i]->lag]} 250 - Sum(j=0 to nma-1) {pma[j]->cf*ee[j-pma[j]->lag]} , 251 252 where ee[k] = y[k] - yp[k] at "time" k. The lagged values of 253 observations and prediction error are automatically updated 254 by the function. 255 256 A call of setev must be used to initialize storage for evmod. 257 258 setev 259 260 void setev(int k) 261 k = control flag, with 262 k=1 -> initialize storage 263 k=0 -> free storage initialized by setev(1) 264 265 266 The storage initialized by setev contains no history. 267 Lagged values of ee and y are initially zero. 268 269 ----------------------------------------------------------------- 270 271 drmod 272 273 Compute the model residual and the derivatives of the predictor with 274 respect to model parameters (core function in model estimation). 275 276 double drmod(double y,double *dr) 277 y = observed value of the time series 278 dr = pointer to array of derivatives of the predictor yp with 279 respect to model parameters (order nar-autoregressive 280 followed by nma-moving average, dimension = np) 281 return value: ee = y-yp, where yp is the value predicted by 282 the time series model 283 284 285 The derivatives evaluated by this function are 286 287 dr[i] = Del(par[i]->cf){yp} for i=0 to nar-1 , and 288 289 dr[nar+j] = Del(pma[j]->cf){yp} for j=0 to nma-1 , 290 291 Here Del(t){x} denotes the partial derivative of x 292 with respect to the parameter t. The prediction 293 function yp is evaluated in the manner indicated 294 above (see evmod). 295 296 A call to setdr must be used to initialize internal storage. 297 298 setdr 299 300 setdr(int k) 301 k = control flag, with 302 k=1 -> initialize internal storage 303 k=0 -> free storage initialized by setdr(1) 304 305 306 The storage initialized by setdr contains no history 307 Lagged values of ee and y are initially zero. 308 309 ------------------------------------------------------------------------------- 310 311 ARMA Model Estimation: 312 ------------------------------------------------------------------------------- 313 314 The two estimation functions are complementary, since the sequential 315 form is an excellent search procedure while the Gauss-Newton algorithm 316 converges rapidly from a good estimate. In practice, the initial ARMA 317 parameters can normally be set to zero when estimation is started with 318 sequential passes. 319 320 seqts 321 322 Perform a sequential estimation update of the parameters of a 323 time series model. 324 325 double seqts(double *x,int n,double *var,int kf) 326 x = pointer to input array containing observed values of the series 327 n = length of the input series 328 var = pointer to matrix array (np by np) proportional 329 to the updated parameter variance matrix at exit 330 (the constant of proportionality = ssq/(n-np), 331 index order nar autoregressive parameters 332 followed by nma moving average parameters) 333 kf = control flag, with 334 kf=0 -> initialize var to the identity matrix 335 kf=1 -> use the input var matrix 336 return value: ssq = sum of the squares of the model residuals 337 338 339 ssq = Sum(i=0 to n-1) { (yp[i] - x[i])^2 } , 340 341 with yp[i] = prediction i-1 -> i 342 343 --------------------------------------------------------------- 344 345 fixts 346 347 Perform a fixed Gauss-Newton estimation iteration to fit time series 348 model parameters. 349 350 double fixts(double *x,int n,double *var,double *cr) 351 x = pointer to input array containing observed values of the series 352 n = length of the input series 353 var = pointer to matrix array (np by np) proportional 354 to the parameter variance matrix at exit 355 ( the constant of proportionality = ssq/(n-np) ) 356 cr = pointer to array of dimension np containing the parameter 357 corrections estimated by this function call 358 return value: ssq = sum of the squares of the model residuals 359 ssq = -1.0 -> singular var matrix 360 361 362 ssq = Sum(i=0 to n-1){ (yp[i] - x[i])^2 } 363 364 with yp[i] = prediction i-1 -> i 365 366 The index order for both cr and var is nar autoregressive 367 parameters followed by nma moving average parameters. 368 369 370 ------------------------------------------------------------------------------- 371 372 External Variables for Factor Models: 373 ------------------------------------------------------------------------------- 374 375 The basic factor model variables are delivered in a structure (of type 376 struct fmod), with components: 377 y.fac = integer encoding factor; and 378 y.val = corresponding value of series. 379 The structures employed in factor model estimation are defined in the 380 header files armaf.h and ccmath.h. One of these headers must be included. 381 382 armaf.h 383 384 #ifndef NULL 385 #define NULL 0L 386 #endif 387 struct mcof {double cf; int lag;}; 388 struct fmod {int fac; double val;}; 389 390 The structure of the model is specified in an external structure array, 391 including factor parameters (pfc), autoregressive parameters (par), and 392 moving average parameters (pma). Coefficients (p->cf) and lags (p->lag) are 393 required for autoregressive and moving average parameters. Only the 394 coefficient is employed for factors. Model parameters are delivered to 395 estimation and evaluation functions by the following external variables. 396 Their declarations must appear in any program that calls the factor model 397 evaluation and estimation functions. 398 399 struct mcof *pfc,*par,*pma; 400 int nfc,nar,nma,ndif,np; 401 402 pfc = pointer to store of structure array containing 403 factor parameters (dimension=nfc) 404 par = pointer to store of structure array of autoregressive 405 parameters and lags (dimension=nar) 406 pma = pointer to store of structure array of moving average 407 parameters and lags (dimension=nma) 408 nfc = number of factor parameters 409 nar = number of autoregressive parameters 410 nma = number of moving average parameters 411 np = total number of parameters (np=nfc+nar+nma) 412 ndif = order of difference used 413 414 The factor model estimation functions assume that the parameters are stored 415 in a single array. Allocation of external storage in the calling program 416 should employ a code sequence such as, 417 418 np=nfc+nar+nma; 419 pfc=calloc(np,sizeof(*pfc)); 420 par=pfc+nfc; pma=par+nar; 421 422 if the estimation functions are to be used. 423 424 ------------------------------------------------------------------------------- 425 426 Factor Model Evaluation: 427 ------------------------------------------------------------------------------- 428 429 evfmod 430 431 Compute the residual in a factor model. 432 433 double evfmod(struct fmod y) 434 y = structure containing: 435 y.val = difference, of order ndif, of observed values 436 the of time series, and 437 y.fac = factor corresponding to current value 438 return value: ee = y.val - yp, where yp is the value predicted 439 by the model 440 441 442 The model prediction yp is based on the lagged values of the 443 observations and estimated errors computed by evfmod. The 444 prediction is simply an ARMA model prediction with a mean, 445 computed by differencing the factor averages, imposed. 446 447 yp = D^d{f(k)} + Sum(i=0 to nar-1){ par[i]->cf*y[k-par[i]->lag] } 448 - Sum(j=0 to nma-1){ pma[j]->cf*ee[k-par[k]->lag] } . 449 450 Here D denotes the difference operator (see sdiff), 451 pcf[m(t)]->cf = f(k) , y[t]->fac = m(t) , and d = ndif . 452 453 The one-step prediction errors ee are identical to those we 454 would obtain without differencing the values of the input series. 455 456 A call to setevf must be used to initialize internal storage. 457 458 setevf 459 460 void setevf(int k) 461 k = control flag, with 462 k=1 -> initialize storage 463 k=0 -> free storage initialized by setevf(1) 464 465 The storage initialized by setevf contains no history. 466 Lagged values of er, y.val, and factor indices are 467 initially zero. 468 469 ------------------------------------------------------------------- 470 471 drfmod 472 473 Compute the model residual and the derivatives of the predictor with 474 respect to model parameters (core estimation function). 475 476 double drfmod(struct fmod y,double *dr) 477 y = structure containing: 478 y.val = difference, of order ndif, of observed values 479 the of time series, and 480 y.fac = factor corresponding to current value 481 dr = pointer to array of derivatives of the predictor yp 482 with respect to model parameters (order nfc-factor , 483 nar-autoregressive , nma-moving average) 484 return value: ee = y.val - yp, where yp is the value predicted 485 by the model 486 487 488 The derivatives computed by drfmod() are: 489 dr[m] = Del(par[m]->cf){ yp} for m=0 to nfc-1, 490 dr[nfc+i] = Del(par[i]->cf){ yp} for i=0 to nar-1, and 491 dr[nfc+nar+j] = Del(pma[j]->cf){ yp} for j=0 to nma-1 . 492 493 Here Del(t){x} denotes the partial derivative of x 494 with respect to the parameter t. The prediction yp is 495 identical to that discussed above (see evfmod). 496 497 The internal storage used by drfmod must be initialized by a 498 call of setdrf. 499 500 setdrf 501 502 setdrf(int k) 503 k = control flag, with 504 k=1 -> initialize storage 505 k=0 -> free storage initialized by setdrf(1) 506 507 508 The storage initialized by setdrf contains no history. 509 Lagged values of er, y.val and factor indices are 510 initially zero. 511 512 ------------------------------------------------------------------------------- 513 514 Factor Model Estimation: 515 ------------------------------------------------------------------------------- 516 517 The two estimation functions are complementary, since the sequential 518 form is an excellent search procedure while the Gauss-Newton algorithm 519 converges rapidly from a good estimate. In practice, the initial ARMA 520 parameters can normally be set to zero and a series mean used for initial 521 factor parameters. The estimation starts with a sequential search, and 522 fixed estimation is employed for the final passes. 523 524 525 seqtsf 526 527 Perform a sequential estimation update of the parameters of a time 528 series factor model. 529 530 double seqtsf(struct fmod *x,int n,double *var,int kf) 531 x = pointer to input structure array containing 532 x[i].val = difference, of order ndif, of observed 533 values the of time series, and 534 x[i].fac = factor corresponding to "time" i 535 n = length of the input series 536 var = pointer to matrix array (np by np) proportional to 537 the updated parameter variance matrix at exit 538 (the constant of proportionality = ssq/(n-np), 539 index order nfc factor parameters, nar 540 autoregressive parameters, nma moving 541 average parameters) 542 kf = control flag, with 543 kf=0 -> initialize var to the identity matrix 544 kf=1 -> use the input var matrix 545 return value: ssq = sum of the squares of the model residuals 546 547 548 ssq = Sum(i=0 to n-1){ (yp[i] - x[i])^2 } 549 550 yp[i] = prediction i-1 > i 551 552 If ndif>0, the variance matrix, var, is constrained to be 553 orthogonal to the vector u, defined by: 554 u[i] = 1 for i=0 to nfc-1, and u[i] = 0 for i >= nfc. 555 Thus, V*u = 0 , where V[i,j] = var[np*i+j] , and the 556 computed variance matrix V is singular when ndif>0. 557 558 --------------------------------------------------------------------- 559 560 fixtsf 561 562 Perform a fixed Gauss-Newton estimation iteration to update 563 parameters of a time series factor model. 564 565 double fixtsf(struct fmod *x,int n,double *var,double *cr) 566 x = pointer to input structure array containing 567 x[i].val = difference, of order ndif, of observed values 568 the of time series, and 569 x[i].fac = factor corresponding to "time" i 570 n = length of the input series 571 var = pointer to matrix array (np by np) proportional to the 572 updated parameter variance matrix at exit 573 (the constant of proportionality = ssq/(n-np) ) 574 cr = pointer to array of dimension np containing the parameter 575 corrections estimated by this iteration 576 return value: ssq = sum of the squares of the model residuals 577 578 579 ssq = Sum(i=0 to n-1){ (yp[i] - x[i])^2 } 580 581 yp[i] = prediction i-1 -> i 582 583 The index order for both cr and var is nfc factor parameters 584 followed by nar autoregressive parameters followed by nma moving 585 average parameters. The constraint on var, defined above (see 586 seqtsf), applies when ndif>0. 587 588 589 ------------------------------------------------------------------------------- 590 591 Model Identification and Evaluation: 592 ------------------------------------------------------------------------------- 593 594 sany 595 596 Compute the direct and inverse autocorrelation coefficients 597 of a time series to support model identification. 598 599 int sany(double *x,int n,double *pm,double *cd,double *ci, \ 600 int nd,int ms,int lag) 601 x = pointer to start of input series 602 (The series values are altered by the function, at exit the 603 power spectra is stored in the array, starting at x+ndif. ) 604 n = length of input series 605 pm = pointer to store for mean value (xm) of series 606 cd = pointer to array to be loaded with direct 607 autocorrelation coefficients (i = 1 to lag) 608 cd[0] = 2nd moment of direct series 609 ci = pointer array to be loaded with inverse 610 autocorrelation coefficients (i = 1 to lag) 611 ci[0]= 2nd moment of inverse series 612 nd = order of differencing applied to input series 613 (If nd=0, the mean is subtracted.) 614 ms = order of smoothing of spectra 615 lag = maximum lag for which direct and inverse 616 autocorrelations are computed 617 return value: n = length of series used in Fourier 618 analysis of correlations 619 620 621 The direct and inverse autocorrelations are computed by applying 622 an inverse Fourier transform to the complex series 623 624 { ps(w[j]) , 1./ps(w[j]) }, where ps(w[j]) 625 626 is the power spectra of the input series. Direct autocorrelations 627 have a simple pattern for pure moving average series, while inverse 628 autocorrelations are simple for pure autoregressive series. 629 630 ---------------------------------------------------------------------- 631 632 resid 633 634 Perform an analysis of the residuals of a time series model. 635 636 int resid(double *x,int n,int lag,double **pac,int nbin, \ 637 double xa,double xb,int **phs,int *cks) 638 x = pointer to start of input series of model residuals 639 (one-step prediction errors). (This series is altered 640 to an unsmoothed power spectra by the function.) 641 n = length of input series 642 pac = pointer to array containing: 643 *pac = sum of squares over series 644 *(pac+k) = kth autocorrelation coefficient, for k=1 to lag 645 lag = maximum autocorrelation lag desired 646 nbin = number of histogram bins ( bin size = (xb-xa)/nbin ) 647 xa = lower limit of error histogram 648 xb = upper limit of error histogram 649 phs = pointer to store of error histogram, with 650 *(phs-1) = number of residuals <xa 651 *(phs+nbin) = number of residuals >xb 652 *(phs+i) = number in ith bin i=0,1, - ,nbin-1 653 cks = pointer to array containing Kolmogorav-Smirnov statistics, 654 based on cumulative power spectra, with 655 cks[0] = points outside .25 significance threshold 656 cks[1] = points outside .05 significance threshold 657 return value: n = series length used in power spectra computation 658 659 660 The autocorrelations have an asymptotic large sample distribution 661 with variance ra2 = 1/(n-np) , where np is the number of 662 model parameters employed. The cumulative spectra is computed using 663 664 I(j) = I(j-1) + ps(w[j]) + ps(w[j-1]) , 665 666 for j = 1,2, -- ,n/2 , starting with I(0) = 0. Storage allocated 667 for autocorrelation and histograms may be released with free(pac) 668 and free(phs-1) respectively. 669 670 671 ------------------------------------------------------------------------------- 672 673 Utility Series Operations: 674 ------------------------------------------------------------------------------- 675 676 xmean 677 678 Subtract the mean value of a series from each term. 679 680 double xmean(double *x,int n) 681 x = pointer to start of input series 682 (at exit each term is modified by subtracting the mean xm) 683 n = length of the series 684 return value: xm = series mean 685 686 687 xm = { Sum(i=0 to n-1) x[i] }/n 688 689 ----------------------------------------------------------------- 690 691 sdiff 692 693 Apply iterations of the sequential difference operator to a series. 694 695 double sdiff(double y,int nd,int k) 696 y = current value of time series (in sequence) 697 nd = order of differencing (1 <= nd <= 6) 698 k = control flag, with 699 k=0 -> set initial differences to zero 700 k!=0 -> normal operation 701 return value: u = value of series with difference order nd 702 703 ---------------------------------------------------------------- 704 705 sintg 706 707 Invert the differencing of a time series (integration). 708 709 double sintg(double y,int ndint ,k) 710 y = current value of series 711 nd = order of integration (1<= nd <=6 712 inverse of difference operation) 713 k = control flag, with 714 k=0 -> set initial sums to zero 715 k!=0 -> normal operation 716 return value: u = value of series integrated to order nd 717 718 719 The use of zeros for initialization implies that the first 720 correctly differenced (integrated) term in a sequence will 721 have index nd. The action of these operators is specified by 722 723 D^m{x[j]} = Sum(i=0 to m) { [m!/i!*(m-i)!](-1)^i *x[j-i] } 724 725 I^n{x[j]} = Sum(j=0 to n) { [n!/j!(n-j)!] *x[j-i] }