\chapter{Dynamic panel models}
\label{chap:dpanel}

\newcommand{\by}{\boldsymbol{y}}
\newcommand{\bx}{\boldsymbol{x}}
\newcommand{\bv}{\boldsymbol{v}}
\newcommand{\bX}{\boldsymbol{X}}
\newcommand{\bW}{\boldsymbol{W}}
\newcommand{\bZ}{\boldsymbol{Z}}
\newcommand{\bA}{\boldsymbol{A}}
\newcommand{\biota}{\bm{\iota}}

\DefineVerbatimEnvironment%
{code}{Verbatim}
{fontsize=\small, xleftmargin=1em}

\newenvironment%
{altcode}%
{\vspace{1ex}\small\leftmargin 1em}{\vspace{1ex}}

The primary command for estimating dynamic panel models in gretl is
\texttt{dpanel}. The closely related \texttt{arbond} command predated
\texttt{dpanel}, and is still present, but whereas \texttt{arbond}
only supports the so-called ``difference'' estimator
\citep{arellano-bond91}, \texttt{dpanel} in addition offers the
``system'' estimator \citep{blundell-bond98}, which has become the
method of choice in the applied literature.

\section{Introduction}
\subsection{Notation}
\label{sec:notation}

A dynamic linear panel data model can be represented as follows (in
notation based on \cite{arellano03}):
\begin{equation}
  \label{eq:dpd-def}
  y_{it} = \alpha y_{i,t-1} + \beta'x_{it} + \eta_{i} + v_{it}
\end{equation}

The main idea behind the difference estimator is to sweep out the
individual effect via differencing. First-differencing
eq.\ (\ref{eq:dpd-def}) yields
\begin{equation}
  \label{eq:dpd-dif}
  \Delta y_{it} = \alpha \Delta y_{i,t-1} + \beta'\Delta x_{it} +
  \Delta v_{it} = \gamma' W_{it} + \Delta v_{it} ,
\end{equation}
in obvious notation. The error term of (\ref{eq:dpd-dif}) is, by
construction, autocorrelated and also correlated with the lagged
dependent variable, so an estimator that takes both issues into
account is needed.
The endogeneity issue is solved by noting that all values of
$y_{i,t-k}$ with $k>1$ can be used as instruments for $\Delta
y_{i,t-1}$: unobserved values of $y_{i,t-k}$ (whether missing or
pre-sample) can safely be substituted with 0. In the language of GMM,
this amounts to using the relation
\begin{equation}
  \label{eq:OC-dif}
  E(\Delta v_{it} \cdot y_{i,t-k}) = 0, \quad k>1
\end{equation}
as an orthogonality condition.

Autocorrelation is dealt with by noting that if $v_{it}$ is white
noise, the covariance matrix of the vector whose typical element is
$\Delta v_{it}$ is proportional to a matrix $H$ that has 2 on the main
diagonal, $-1$ on the first subdiagonals and 0 elsewhere. One-step GMM
estimation of equation (\ref{eq:dpd-dif}) amounts to computing
\begin{equation}
  \label{eq:dif-gmm}
  \hat{\gamma} = \left[
    \left( \sum_{i=1}^N \bW_i'\bZ_i \right) \bA_N
    \left( \sum_{i=1}^N \bZ_i'\bW_i \right) \right]^{-1}
  \left( \sum_{i=1}^N \bW_i'\bZ_i \right) \bA_N
  \left( \sum_{i=1}^N \bZ_i'\Delta \by_i \right)
\end{equation}
where
\begin{align*}
  \Delta \by_i & =
  \left[ \begin{array}{ccc}
      \Delta y_{i,3} & \cdots & \Delta y_{i,T}
    \end{array} \right]' \\
  \bW_i & =
  \left[ \begin{array}{ccc}
      \Delta y_{i,2} & \cdots & \Delta y_{i,T-1} \\
      \Delta x_{i,3} & \cdots & \Delta x_{i,T} \\
    \end{array} \right]' \\
  \bZ_i & =
  \left[ \begin{array}{cccccc}
      y_{i1} & 0 & 0 & \cdots & 0 & \Delta x_{i3}\\
      0 & y_{i1} & y_{i2} & \cdots & 0 & \Delta x_{i4}\\
      & & \vdots & & & \\
      0 & 0 & 0 & \cdots & y_{i, T-2} & \Delta x_{iT} \\
    \end{array} \right]' \\
\intertext{and}
  \bA_N & = \left( \sum_{i=1}^N \bZ_i' H \bZ_i \right)^{-1}
\end{align*}

Once the 1-step estimator is computed, the sample covariance matrix of
the estimated residuals can be used instead of $H$ to obtain 2-step
estimates, which are not only consistent but asymptotically
efficient.
(In principle the process may be iterated, but nobody seems to be
interested.) Standard GMM theory applies, except for one thing:
\cite{Windmeijer05} has computed finite-sample corrections to the
asymptotic covariance matrix of the parameters, which are nowadays
almost universally used.

The difference estimator is consistent, but has been shown to have
poor properties in finite samples when $\alpha$ is near one. People
these days prefer the so-called ``system'' estimator, which
complements the differenced data (with lagged levels used as
instruments) with data in levels (using lagged differences as
instruments). The system estimator relies on an extra orthogonality
condition which has to do with the earliest value of the dependent
variable, $y_{i,1}$. The interested reader is referred to \citet[pp.\
124--125]{blundell-bond98} for details, but here it suffices to say
that this condition is satisfied in mean-stationary models and brings
an improvement in efficiency that may be substantial in many cases.

The set of orthogonality conditions exploited in the system approach
is not very much larger than with the difference estimator, since most
of the possible orthogonality conditions associated with the equations
in levels are redundant, given those already used for the equations in
differences.

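To make the preceding discussion concrete: under mean stationarity the
extra conditions amount to orthogonality between lagged differences of
the dependent variable and the error term of the equations in levels,
which may be sketched (in the notation introduced above) as
\[
  E\left[ \Delta y_{i,t-1} \left( \eta_i + v_{it} \right) \right] = 0 .
\]
This is why lagged differences can serve as instruments for the data
in levels, as noted above.
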
The key equations of the system estimator can be written as
\begin{equation}
  \label{eq:sys-gmm}
  \tilde{\gamma} = \left[
    \left( \sum_{i=1}^N \tilde{\bW}_i'\tilde{\bZ}_i \right) \bA_N
    \left( \sum_{i=1}^N \tilde{\bZ}_i'\tilde{\bW}_i \right) \right]^{-1}
  \left( \sum_{i=1}^N \tilde{\bW}_i'\tilde{\bZ}_i \right) \bA_N
  \left( \sum_{i=1}^N \tilde{\bZ}_i'\Delta \tilde{\by}_i \right)
\end{equation}
where
\begin{align*}
  \Delta \tilde{\by}_i & =
  \left[ \begin{array}{cccccc}
      \Delta y_{i3} & \cdots & \Delta y_{iT} & y_{i3} & \cdots & y_{iT}
    \end{array} \right]' \\
  \tilde{\bW}_i & =
  \left[ \begin{array}{cccccc}
      \Delta y_{i2} & \cdots & \Delta y_{i,T-1} & y_{i2} & \cdots & y_{i,T-1} \\
      \Delta x_{i3} & \cdots & \Delta x_{iT} & x_{i3} & \cdots & x_{iT} \\
    \end{array} \right]' \\
  \tilde{\bZ}_i & =
  \left[ \begin{array}{ccccccccc}
      y_{i1} & 0 & 0 & \cdots & 0 & 0 & \cdots & 0 & \Delta x_{i,3}\\
      0 & y_{i1} & y_{i2} & \cdots & 0 & 0 & \cdots & 0 & \Delta x_{i,4}\\
      & & \vdots & & & & & & \\
      0 & 0 & 0 & \cdots & y_{i, T-2} & 0 & \cdots & 0 & \Delta x_{iT}\\
      & & \vdots & & & & & & \\
      0 & 0 & 0 & \cdots & 0 & \Delta y_{i2} & \cdots & 0 & x_{i3}\\
      & & \vdots & & & & & & \\
      0 & 0 & 0 & \cdots & 0 & 0 & \cdots & \Delta y_{i,T-1} & x_{iT}\\
    \end{array} \right]' \\
\intertext{and}
  \bA_N & = \left( \sum_{i=1}^N \tilde{\bZ}_i' H^* \tilde{\bZ}_i \right)^{-1}
\end{align*}

In this case choosing a precise form for the matrix $H^*$ for the
first step is no trivial matter. Its north-west block should be as
similar as possible to the covariance matrix of the vector $\Delta
v_{it}$, so the same choice as for the ``difference'' estimator is
appropriate.
Ideally, the south-east block should be proportional to the covariance
matrix of the vector $\biota \eta_i + \bv$, that is $\sigma^2_{v} I +
\sigma^2_{\eta} \biota \biota'$; but since $\sigma^2_{\eta}$ is
unknown and any positive definite matrix renders the estimator
consistent, people just use $I$. The off-diagonal blocks should, in
principle, contain the covariances between $\Delta v_{is}$ and
$v_{it}$, which would be an identity matrix if $v_{it}$ is white
noise. However, since the south-east block is typically given a
conventional value anyway, the benefit of making this choice is not
obvious. Some packages use $I$; others use a zero matrix.
Asymptotically, it should not matter, but on real datasets the
difference between the resulting estimates can be noticeable.

\subsection{Rank deficiency}
\label{sec:rankdef}

Both the difference estimator (\ref{eq:dif-gmm}) and the system
estimator (\ref{eq:sys-gmm}) depend for their existence on the
invertibility of $\bA_N$. This matrix may turn out to be singular for
several reasons. However, this does not mean that the estimator is not
computable: in some cases, adjustments are possible such that the
estimator does exist, but the user should be aware that in these cases
not all software packages use the same strategy and replication of
results may prove difficult or even impossible.

A first reason why $\bA_N$ may be singular is the unavailability of
instruments, chiefly because of missing observations. This case is
easy to handle. If a particular row of $\tilde{\bZ}_i$ is zero for all
units, the corresponding orthogonality condition (or the corresponding
instrument, if you prefer) is automatically dropped; of course, the
overidentification rank is adjusted for testing purposes.

Even if no instruments are zero, however, $\bA_N$ could be rank
deficient.
A trivial case occurs if there are collinear instruments, but a less
trivial case may arise when $T$ (the total number of time periods
available) is not much smaller than $N$ (the number of units), as, for
example, in some macro datasets where the units are countries. The
total number of potentially usable orthogonality conditions is
$O(T^2)$, which may well exceed $N$ in some cases. Of course $\bA_N$
is the sum of $N$ matrices which have, at most, rank $2T-3$ and
therefore it could well happen that the sum is singular.

In all these cases, what we consider the ``proper'' way to go is to
substitute the (Moore--Penrose) pseudo-inverse of $\bA_N$ for its
regular inverse. Again, our choice is shared by some software
packages, but not all, so replication may be hard.

\subsection{Treatment of missing values}

Textbooks seldom bother with missing values, but in some cases their
treatment may be far from obvious. This is especially true if missing
values are interspersed between valid observations. For example,
consider the plain difference estimator with one lag, so
\[
  y_t = \alpha y_{t-1} + \eta + \epsilon_t
\]
where the $i$ index is omitted for clarity. Suppose you have an
individual with $t=1\ldots5$, for which $y_3$ is missing. It may seem
that the data for this individual are unusable, because differencing
$y_t$ would produce something like
\[
  \begin{array}{c|ccccc}
    t & 1 & 2 & 3 & 4 & 5 \\
    \hline
    y_t & * & * & \circ & * & * \\
    \Delta y_t & \circ & * & \circ & \circ & *
  \end{array}
\]
where $*$ = nonmissing and $\circ$ = missing. Estimation seems to be
unfeasible, since there are no periods in which $\Delta y_t$ and
$\Delta y_{t-1}$ are both observable.

However, we can use a $k$-difference operator and get
\[
  \Delta_k y_t = \alpha \Delta_k y_{t-1} + \Delta_k \epsilon_t
\]
where $\Delta_k = 1 - L^k$ and past levels of $y_t$ are perfectly
valid instruments. In this example, we can choose $k=3$ and use $y_1$
as an instrument, so this unit is in fact perfectly usable.

Not all software packages seem to be aware of this possibility, so
replicating published results may prove tricky if your dataset
contains individuals with gaps between valid observations.

\section{Usage}

One of the concepts underlying the syntax of \texttt{dpanel} is that
you get default values for several choices you may want to make, so
that in a ``standard'' situation the command is very concise. The
simplest case of the model (\ref{eq:dpd-def}) is a plain AR(1)
process:
\begin{equation}
  \label{eq:dp1}
  y_{i,t} = \alpha y_{i,t-1} + \eta_{i} + v_{it} .
\end{equation}
If you give the command
\begin{code}
dpanel 1 ; y
\end{code}
gretl assumes that you want to estimate (\ref{eq:dp1}) via the
difference estimator (\ref{eq:dif-gmm}), using as many orthogonality
conditions as possible. The scalar \texttt{1} between \texttt{dpanel}
and the semicolon indicates that only one lag of \texttt{y} is
included as an explanatory variable; using \texttt{2} would give an
AR(2) model.
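For instance, the AR(2) counterpart of (\ref{eq:dp1}) is estimated by
\begin{code}
dpanel 2 ; y
\end{code}
which includes both $y_{i,t-1}$ and $y_{i,t-2}$ among the regressors.
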
The syntax that gretl uses for the non-seasonal AR and MA lags in an
ARMA model is also supported in this context.\footnote{This represents
an enhancement over the \texttt{arbond} command.} For example, if you
want the first and third lags of \texttt{y} (but not the second)
included as explanatory variables you can say
\begin{code}
dpanel {1 3} ; y
\end{code}
or you can use a pre-defined matrix for this purpose:
\begin{code}
matrix ylags = {1, 3}
dpanel ylags ; y
\end{code}
To use a single lag of \texttt{y} other than the first you need to
employ this mechanism:
\begin{code}
dpanel {3} ; y   # only lag 3 is included
dpanel 3 ; y     # compare: lags 1, 2 and 3 are used
\end{code}

To use the system estimator instead, you add the \verb|--system|
option, as in
\begin{code}
dpanel 1 ; y --system
\end{code}
The level orthogonality conditions and the corresponding instruments
are appended automatically (see eq.\ \ref{eq:sys-gmm}).

\subsection{Regressors}

If we want to introduce additional regressors, we list them after the
dependent variable in the same way as in other gretl commands, such as
\texttt{ols}.

For the difference orthogonality relations, \texttt{dpanel} takes care
of transforming the regressors in parallel with the dependent
variable. Note that this differs from gretl's \texttt{arbond} command,
where only the dependent variable is differenced automatically; it
brings us more in line with other software.

One case of potential ambiguity arises when an intercept is specified
but the difference-only estimator is selected, as in
\begin{code}
dpanel 1 ; y const
\end{code}
In this case the default \texttt{dpanel} behavior, which agrees with
Stata's \texttt{xtabond2}, is to drop the constant (since differencing
reduces it to nothing but zeros).
However, for compatibility with the DPD package for Ox, you can give
the option \verb|--dpdstyle|, in which case the constant is retained
(equivalent to including a linear trend in
equation~\ref{eq:dpd-def}). A similar point applies to the
period-specific dummy variables which can be added in \texttt{dpanel}
via the \verb|--time-dummies| option: in the differences-only case
these dummies are entered in differenced form by default, but when the
\verb|--dpdstyle| switch is applied they are entered in levels.

The standard gretl syntax applies if you want to use lagged
explanatory variables, so for example the command
\begin{code}
dpanel 1 ; y const x(0 to -1) --system
\end{code}
would result in estimation of the model
\[
  y_{it} = \alpha y_{i,t-1} +
  \beta_0 + \beta_1 x_{it} + \beta_2 x_{i,t-1} +
  \eta_{i} + v_{it} .
\]

\subsection{Instruments}

The default rules for instruments are:
\begin{itemize}
\item lags of the dependent variable are instrumented using all
  available orthogonality conditions; and
\item additional regressors are considered exogenous, so they are used
  as their own instruments.
\end{itemize}

If a different policy is wanted, the instruments should be specified
in an additional list, separated from the regressors list by a
semicolon. The syntax closely mirrors that of the \texttt{tsls}
command, but in this context it is necessary to distinguish between
``regular'' instruments and what are often called ``GMM-style''
instruments (that is, instruments that are handled in the same
block-diagonal manner as lags of the dependent variable, as described
above).

``Regular'' instruments are transformed in the same way as
regressors, and the contemporaneous value of the transformed variable
is used to form an orthogonality condition.
Since regressors are treated as exogenous by default, it follows that
these two commands estimate the same model:
\begin{code}
dpanel 1 ; y z
dpanel 1 ; y z ; z
\end{code}
The instrument specification in the second case simply confirms what
is implicit in the first: that \texttt{z} is exogenous. Note, though,
that if you have some additional variable \texttt{z2} which you want
to add as a regular instrument, it then becomes necessary to include
\texttt{z} in the instrument list if it is to be treated as exogenous:
\begin{code}
dpanel 1 ; y z ; z2      # z is now implicitly endogenous
dpanel 1 ; y z ; z z2    # z is treated as exogenous
\end{code}

The specification of ``GMM-style'' instruments is handled by the
special constructs \texttt{GMM()} and \texttt{GMMlevel()}. The first
of these relates to instruments for the equations in differences, and
the second to the equations in levels. The syntax for \texttt{GMM()}
is

\begin{altcode}
\texttt{GMM(}\textsl{name}\texttt{,} \textsl{minlag}\texttt{,}
\textsl{maxlag}\texttt{)}
\end{altcode}

\noindent
where \textsl{name} is replaced by the name of a series (or the name
of a list of series), and \textsl{minlag} and \textsl{maxlag} are
replaced by the minimum and maximum lags to be used as
instruments. The same goes for \texttt{GMMlevel()}.

One common use of \texttt{GMM()} is to limit the number of lagged
levels of the dependent variable used as instruments for the equations
in differences. It's well known that although exploiting all possible
orthogonality conditions yields maximal asymptotic efficiency, in
finite samples it may be preferable to use a smaller subset (but see
also \cite{OkuiJoE2009}).
For example, the specification
\begin{code}
dpanel 1 ; y ; GMM(y, 2, 4)
\end{code}
ensures that no lags of $y_t$ earlier than $t-4$ will be used as
instruments.

A second use of \texttt{GMM()} is to exploit more fully the potential
block-diagonal orthogonality conditions offered by an exogenous
regressor, or a related variable that does not appear as a regressor.
For example, in
\begin{code}
dpanel 1 ; y x ; GMM(z, 2, 6)
\end{code}
the variable \texttt{x} is considered an endogenous regressor, and up
to 5 lags of \texttt{z} are used as instruments.

Note that in the following script fragment
\begin{code}
dpanel 1 ; y z
dpanel 1 ; y z ; GMM(z,0,0)
\end{code}
the two estimation commands should not be expected to give the same
result, as the sets of orthogonality relationships are subtly
different. In the latter case, you have $T-2$ separate orthogonality
relationships pertaining to $z_{it}$, none of which has any
implication for the other ones; in the former case, you only have one.
In terms of the $\bZ_i$ matrix, the first form adds a single row to
the bottom of the instruments matrix, while the second form adds a
diagonal block with $T-2$ columns; that is,
\[
  \left[ \begin{array}{cccc}
      z_{i3} & z_{i4} & \cdots & z_{iT}
    \end{array} \right]
\]
versus
\[
  \left[ \begin{array}{cccc}
      z_{i3} & 0 & \cdots & 0 \\
      0 & z_{i4} & \cdots & 0 \\
      & \ddots & \ddots & \\
      0 & 0 & \cdots & z_{iT}
    \end{array} \right]
\]

\section{Replication of DPD results}
\label{sec:DPD-replic}

In this section we show how to replicate the results of some of the
pioneering work with dynamic panel-data estimators by Arellano, Bond
and Blundell.
As the DPD manual \citep*{DPDmanual} explains, it is difficult to
replicate the original published results exactly, for two main
reasons: not all of the data used in those studies are publicly
available; and some of the choices made in the original software
implementation of the estimators have been superseded. Here,
therefore, our focus is on replicating the results obtained using the
current DPD package and reported in the DPD manual.

The examples are based on the program files \texttt{abest1.ox},
\texttt{abest3.ox} and \texttt{bbest1.ox}. These are included in the
DPD package, along with the Arellano--Bond database files
\texttt{abdata.bn7} and \texttt{abdata.in7}.\footnote{See
\url{http://www.doornik.com/download.html}.} The Arellano--Bond data
are also provided with gretl, in the file \texttt{abdata.gdt}. In the
following we do not show the output from DPD or gretl; it is somewhat
voluminous, and is easily generated by the user. As of this writing
the results from Ox/DPD and gretl are identical in all relevant
respects for all of the examples shown.\footnote{To be specific, this
is using Ox Console version 5.10, version 1.24 of the DPD package, and
gretl built from CVS as of 2010-10-23, all on Linux.}

A complete Ox/DPD program to generate the results of interest takes
this general form:

\begin{code}
#include <oxstd.h>
#import <packages/dpd/dpd>

main()
{
    decl dpd = new DPD();

    dpd.Load("abdata.in7");
    dpd.SetYear("YEAR");

    // model-specific code here

    delete dpd;
}
\end{code}
%
In the examples below we take this template for granted and show just
the model-specific code.

\subsection{Example 1}

The following Ox/DPD code---drawn from \texttt{abest1.ox}---replicates
column (b) of Table 4 in \cite{arellano-bond91}, an instance of the
differences-only or GMM-DIF estimator.
The dependent variable is the log of employment, \texttt{n}; the
regressors include two lags of the dependent variable, current and
lagged values of the log real-product wage, \texttt{w}, the current
value of the log of gross capital, \texttt{k}, and current and lagged
values of the log of industry output, \texttt{ys}. In addition the
specification includes a constant and five year dummies; unlike the
stochastic regressors, these deterministic terms are not
differenced. In this specification the regressors \texttt{w},
\texttt{k} and \texttt{ys} are treated as exogenous and serve as their
own instruments. In DPD syntax this requires entering these variables
twice, on the \verb|X_VAR| and \verb|I_VAR| lines. The GMM-type
(block-diagonal) instruments in this example are the second and
subsequent lags of the level of \texttt{n}. Both 1-step and 2-step
estimates are computed.

\begin{code}
dpd.SetOptions(FALSE); // don't use robust standard errors
dpd.Select(Y_VAR, {"n", 0, 2});
dpd.Select(X_VAR, {"w", 0, 1, "k", 0, 0, "ys", 0, 1});
dpd.Select(I_VAR, {"w", 0, 1, "k", 0, 0, "ys", 0, 1});

dpd.Gmm("n", 2, 99);
dpd.SetDummies(D_CONSTANT + D_TIME);

print("\n\n***** Arellano & Bond (1991), Table 4 (b)");
dpd.SetMethod(M_1STEP);
dpd.Estimate();
dpd.SetMethod(M_2STEP);
dpd.Estimate();
\end{code}

Here is gretl code to do the same job:

\begin{code}
open abdata.gdt
list X = w w(-1) k ys ys(-1)
dpanel 2 ; n X const --time-dummies --asy --dpdstyle
dpanel 2 ; n X const --time-dummies --asy --two-step --dpdstyle
\end{code}

Note that in gretl the switch to suppress robust standard errors is
\verb|--asymptotic|, here abbreviated to \verb|--asy|.\footnote{Option
flags in gretl can always be truncated, down to the minimal unique
abbreviation.} The \verb|--dpdstyle| flag specifies that the constant
and dummies should
not be differenced, in the context of a GMM-DIF model. With gretl's
\texttt{dpanel} command it is not necessary to specify the exogenous
regressors as their own instruments since this is the default;
similarly, the use of the second and all longer lags of the dependent
variable as GMM-type instruments is the default and need not be stated
explicitly.

\subsection{Example 2}

The DPD file \texttt{abest3.ox} contains a variant of the above that
differs with regard to the choice of instruments: the variables
\texttt{w} and \texttt{k} are now treated as predetermined, and are
instrumented GMM-style using the second and third lags of their
levels. This approximates column (c) of Table 4 in
\cite{arellano-bond91}. We have modified the code in
\texttt{abest3.ox} slightly to allow the use of robust
(Windmeijer-corrected) standard errors, which are the default in both
DPD and gretl with 2-step estimation:

\begin{code}
dpd.Select(Y_VAR, {"n", 0, 2});
dpd.Select(X_VAR, {"w", 0, 1, "k", 0, 0, "ys", 0, 1});
dpd.Select(I_VAR, {"ys", 0, 1});
dpd.SetDummies(D_CONSTANT + D_TIME);

dpd.Gmm("n", 2, 99);
dpd.Gmm("w", 2, 3);
dpd.Gmm("k", 2, 3);

print("\n***** Arellano & Bond (1991), Table 4 (c)\n");
print("      (but using different instruments!!)\n");
dpd.SetMethod(M_2STEP);
dpd.Estimate();
\end{code}

The gretl code is as follows:

\begin{code}
open abdata.gdt
list X = w w(-1) k ys ys(-1)
list Ivars = ys ys(-1)
dpanel 2 ; n X const ; GMM(w,2,3) GMM(k,2,3) Ivars --time --two-step --dpd
\end{code}
%
Note that since we are now calling for an instrument set other than
the default (following the second semicolon), it is necessary to
include the \texttt{Ivars} specification for the variable \texttt{ys}.

However, it is not necessary to specify \texttt{GMM(n,2,99)} since
this remains the default treatment of the dependent variable.

\subsection{Example 3}

Our third example replicates the DPD output from \texttt{bbest1.ox}:
this uses the same dataset as the previous examples but the model
specifications are based on \cite{blundell-bond98}, and involve
comparison of the GMM-DIF and GMM-SYS (``system'') estimators. The
basic specification is slightly simplified in that the variable
\texttt{ys} is not used and only one lag of the dependent variable
appears as a regressor. The Ox/DPD code is:

\begin{code}
dpd.Select(Y_VAR, {"n", 0, 1});
dpd.Select(X_VAR, {"w", 0, 1, "k", 0, 1});
dpd.SetDummies(D_CONSTANT + D_TIME);

print("\n\n***** Blundell & Bond (1998), Table 4: 1976-86 GMM-DIF");
dpd.Gmm("n", 2, 99);
dpd.Gmm("w", 2, 99);
dpd.Gmm("k", 2, 99);
dpd.SetMethod(M_2STEP);
dpd.Estimate();

print("\n\n***** Blundell & Bond (1998), Table 4: 1976-86 GMM-SYS");
dpd.GmmLevel("n", 1, 1);
dpd.GmmLevel("w", 1, 1);
dpd.GmmLevel("k", 1, 1);
dpd.SetMethod(M_2STEP);
dpd.Estimate();
\end{code}

Here is the corresponding gretl code:

\begin{code}
open abdata.gdt
list X = w w(-1) k k(-1)
list Z = w k

# Blundell & Bond (1998), Table 4: 1976-86 GMM-DIF
dpanel 1 ; n X const ; GMM(Z,2,99) --time --two-step --dpd

# Blundell & Bond (1998), Table 4: 1976-86 GMM-SYS
dpanel 1 ; n X const ; GMM(Z,2,99) GMMlevel(Z,1,1) \
  --time --two-step --dpd --system
\end{code}

Note the use of the \verb|--system| option flag to specify GMM-SYS,
including the default treatment of the dependent variable, which
corresponds to \texttt{GMMlevel(n,1,1)}.
In this case we also want to use lagged differences of the regressors
\texttt{w} and \texttt{k} as instruments for the levels equations, so
we need explicit \texttt{GMMlevel} entries for those variables. If you
want something other than the default treatment for the dependent
variable as an instrument for the levels equations, you should give an
explicit \texttt{GMMlevel} specification for that variable---and in
that case the \verb|--system| flag is redundant (but harmless).

For the sake of completeness, note that if you specify at least one
\texttt{GMMlevel} term, \texttt{dpanel} will then include equations in
levels, but it will not automatically add a default \texttt{GMMlevel}
specification for the dependent variable unless the \verb|--system|
option is given.

\section{Cross-country growth example}
\label{sec:dpanel-growth}

The previous examples all used the Arellano--Bond dataset; for this
example we use the dataset \texttt{CEL.gdt}, which is also included in
the gretl distribution. As with the Arellano--Bond data, there are
numerous missing values. Details of the provenance of the data can be
found by opening the dataset information window in the gretl GUI
(\textsf{Data} menu, \textsf{Dataset info} item). This is a subset of
the Barro--Lee 138-country panel dataset, an approximation to which is
used in \citet*{CEL96} and \citet*{Bond2001}.\footnote{We say an
``approximation'' because we have not been able to replicate exactly
the OLS results reported in the papers cited, though it seems from the
description of the data in \cite{CEL96} that we ought to be able to do
so.
We note that \cite{Bond2001} used data provided by Professor Caselli
yet did not manage to reproduce the latter's results.} Both of these
papers explore the dynamic panel-data approach in relation to the
issues of growth and convergence of per capita income across
countries.

The dependent variable is growth in real GDP per capita over
successive five-year periods; the regressors are the log of the
initial (five years prior) value of GDP per capita, the log-ratio of
investment to GDP, $s$, in the prior five years, and the log of annual
average population growth, $n$, over the prior five years plus 0.05 as
stand-in for the rate of technical progress, $g$, plus the rate of
depreciation, $\delta$ (with the last two terms assumed to be constant
across both countries and periods). The original model is
\begin{equation}
  \label{eq:CEL96}
  \Delta_5 y_{it} = \beta y_{i,t-5} + \alpha s_{it} + \gamma (n_{it} +
  g + \delta) + \nu_t + \eta_i + \epsilon_{it}
\end{equation}
which allows for a time-specific disturbance $\nu_t$. The Solow model
with Cobb--Douglas production function implies that $\gamma =
-\alpha$, but this assumption is not imposed in estimation. The
time-specific disturbance is eliminated by subtracting the period mean
from each of the series.

Equation (\ref{eq:CEL96}) can be transformed to an AR(1) dynamic
panel-data model by adding $y_{i,t-5}$ to both sides, which gives
\begin{equation}
  \label{eq:CEL96a}
  y_{it} = (1 + \beta) y_{i,t-5} + \alpha s_{it} + \gamma (n_{it} +
  g + \delta) + \eta_i + \epsilon_{it}
\end{equation}
where all variables are now assumed to be time-demeaned.
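
To spell out the step: with $\Delta_5 y_{it} = y_{it} - y_{i,t-5}$ and
the period means already removed (so that $\nu_t$ drops out), equation
(\ref{eq:CEL96}) reads
\[
  y_{it} - y_{i,t-5} = \beta y_{i,t-5} + \alpha s_{it} +
  \gamma (n_{it} + g + \delta) + \eta_i + \epsilon_{it} ,
\]
and moving $y_{i,t-5}$ to the right-hand side collects the two terms
into the coefficient $1 + \beta$ in (\ref{eq:CEL96a}).
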

In (rough) replication of \cite{Bond2001} we now proceed to estimate
the following two models: (a) equation (\ref{eq:CEL96a}) via GMM-DIF,
using as instruments the second and all longer lags of $y_{it}$,
$s_{it}$ and $n_{it} + g + \delta$; and (b) equation (\ref{eq:CEL96a})
via GMM-SYS, using $\Delta y_{i,t-1}$, $\Delta s_{i,t-1}$ and $\Delta
(n_{i,t-1} + g + \delta)$ as additional instruments in the levels
equations. We report robust standard errors throughout. (As a purely
notational matter, we now use ``$t-1$'' to refer to values five years
prior to $t$, as in \cite{Bond2001}.)

The gretl script to do this job is shown below. Note that the final
transformed versions of the variables (logs, with time-means
subtracted) are named \texttt{ly} ($y_{it}$), \texttt{linv}
($s_{it}$) and \texttt{lngd} ($n_{it} + g + \delta$).
%
\begin{code}
open CEL.gdt

ngd = n + 0.05
ly = log(y)
linv = log(s)
lngd = log(ngd)

# take out time means
loop i=1..8 --quiet
    smpl (time == i) --restrict --replace
    ly -= mean(ly)
    linv -= mean(linv)
    lngd -= mean(lngd)
endloop

smpl --full
list X = linv lngd
# 1-step GMM-DIF
dpanel 1 ; ly X ; GMM(X,2,99)
# 2-step GMM-DIF
dpanel 1 ; ly X ; GMM(X,2,99) --two-step
# GMM-SYS
dpanel 1 ; ly X ; GMM(X,2,99) GMMlevel(X,1,1) --two-step --sys
\end{code}

For comparison we estimated the same two models using Ox/DPD and the
Stata command \texttt{xtabond2}. (In each case we constructed a
comma-separated values dataset containing the data as transformed in
the gretl script shown above, using a missing-value code appropriate
to the target program.)
For reference, the commands used with Stata are reproduced below:
%
\begin{code}
insheet using CEL.csv
tsset unit time
xtabond2 ly L.ly linv lngd, gmm(L.ly, lag(1 99)) gmm(linv, lag(2 99))
  gmm(lngd, lag(2 99)) rob nolev
xtabond2 ly L.ly linv lngd, gmm(L.ly, lag(1 99)) gmm(linv, lag(2 99))
  gmm(lngd, lag(2 99)) rob nolev twostep
xtabond2 ly L.ly linv lngd, gmm(L.ly, lag(1 99)) gmm(linv, lag(2 99))
  gmm(lngd, lag(2 99)) rob nocons twostep
\end{code}

For the GMM-DIF model all three programs find 382 usable observations
and 30 instruments, and yield identical parameter estimates and
robust standard errors (up to the number of digits printed, or more);
see Table~\ref{tab:growth-DIF}.\footnote{The coefficient shown for
\texttt{ly(-1)} in the Tables is that reported directly by the
software; for comparability with the original model (eq.\
\ref{eq:CEL96}) it is necessary to subtract 1, which produces the
expected negative value indicating conditional convergence in per
capita income.}

\begin{table}[htbp]
\begin{center}
\begin{tabular}{lrrrr}
 & \multicolumn{2}{c}{1-step} & \multicolumn{2}{c}{2-step} \\
 & \multicolumn{1}{c}{coeff} & \multicolumn{1}{c}{std.\ error} &
   \multicolumn{1}{c}{coeff} & \multicolumn{1}{c}{std.\ error} \\
\texttt{ly(-1)} & 0.577564 & 0.1292 & 0.610056 & 0.1562 \\
\texttt{linv}   & 0.0565469 & 0.07082 & 0.100952 & 0.07772 \\
\texttt{lngd}   & $-$0.143950 & 0.2753 & $-$0.310041 & 0.2980 \\
\end{tabular}
\caption{GMM-DIF: Barro--Lee data}
\label{tab:growth-DIF}
\end{center}
\end{table}

Results for GMM-SYS estimation are shown in
Table~\ref{tab:growth-SYS}.
In this case we show two sets of gretl
results: those labeled ``gretl(1)'' were obtained using gretl's
\verb|--dpdstyle| option, while those labeled ``gretl(2)'' did not use
that option---the intent being to reproduce the $H$ matrices used by
Ox/DPD and \texttt{xtabond2} respectively.

\begin{table}[htbp]
\begin{center}
\begin{tabular}{lrrrr}
 & \multicolumn{1}{c}{gretl(1)} &
   \multicolumn{1}{c}{Ox/DPD} &
   \multicolumn{1}{c}{gretl(2)} &
   \multicolumn{1}{c}{xtabond2} \\
\texttt{ly(-1)} & 0.9237 (0.0385) &
                  0.9167 (0.0373) &
                  0.9073 (0.0370) &
                  0.9073 (0.0370) \\
\texttt{linv}   & 0.1592 (0.0449) &
                  0.1636 (0.0441) &
                  0.1856 (0.0411) &
                  0.1856 (0.0411) \\
\texttt{lngd}   & $-$0.2370 (0.1485) &
                  $-$0.2178 (0.1433) &
                  $-$0.2355 (0.1501) &
                  $-$0.2355 (0.1501)
\end{tabular}
\caption{2-step GMM-SYS: Barro--Lee data (standard errors in parentheses)}
\label{tab:growth-SYS}
\end{center}
\end{table}

Here all three programs use 479 observations; gretl and
\texttt{xtabond2} use 41 instruments and produce the same estimates
(when using the same $H$ matrix) while Ox/DPD nominally uses
66.\footnote{This is a case of the issue described in
section~\ref{sec:rankdef}: the full $\bA_N$ matrix turns out to be
singular and special measures must be taken to produce estimates.}
It is noteworthy that with GMM-SYS plus ``messy'' missing
observations, the results depend on the precise array of instruments
used, which in turn depends on the details of the implementation of
the estimator.
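The two sets of gretl results in Table~\ref{tab:growth-SYS} differ
only in the presence or absence of the \verb|--dpdstyle| flag. For
clarity, here is a sketch of the two invocations, reusing the
transformed variables \texttt{ly}, \texttt{linv} and \texttt{lngd}
constructed in the script above:
%
\begin{code}
list X = linv lngd
# "gretl(1)": DPD-style H matrix, for comparison with Ox/DPD
dpanel 1 ; ly X ; GMM(X,2,99) GMMlevel(X,1,1) --two-step --sys --dpdstyle
# "gretl(2)": default H matrix, for comparison with xtabond2
dpanel 1 ; ly X ; GMM(X,2,99) GMMlevel(X,1,1) --two-step --sys
\end{code}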
These include the Sargan test for overidentification, one
or more Wald tests for the joint significance of the regressors (and
time dummies, if applicable) and tests for first- and second-order
autocorrelation of the residuals from the equations in differences.

As in Ox/DPD, the Sargan test statistic reported by gretl is
\[
S = \left(\sum_{i=1}^N \hat{\bv}^{*\prime}_i \bZ_i\right)
\bA_N \left(\sum_{i=1}^N \bZ_i' \hat{\bv}^*_i\right)
\]
where the $\hat{\bv}^*_i$ are the transformed (e.g.\ differenced)
residuals for unit $i$. Under the null hypothesis that the
instruments are valid, $S$ is asymptotically distributed as chi-square
with degrees of freedom equal to the number of overidentifying
restrictions.

In general we see a good level of agreement between gretl, DPD and
\texttt{xtabond2} with regard to these statistics, with a few
relatively minor exceptions. Specifically, \texttt{xtabond2} computes
both a ``Sargan test'' and a ``Hansen test'' for overidentification,
but what it calls the Hansen test is, apparently, what DPD calls the
Sargan test. (We have had difficulty determining from the
\texttt{xtabond2} documentation \citep{Roodman2006} exactly how its
Sargan test is computed.) In addition there are cases where the
degrees of freedom for the Sargan test differ between DPD and gretl;
this occurs when the $\bA_N$ matrix is singular
(section~\ref{sec:rankdef}). In concept the df equals the number of
instruments minus the number of parameters estimated; for the first of
these terms gretl uses the rank of $\bA_N$, while DPD appears to use
the full dimension of this matrix.
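If the instrument and weight matrices have been saved via the
\option{keep-extra} option (see section~\ref{sec:dpanel-post}), the
quadratic form above can in principle be reproduced by hand. The
following is a rough sketch only: it assumes that \dollar{uhat} holds
the transformed residuals $\hat{\bv}^*$ and that, with the stacked
(transposed) instrument matrix, the summation over units reduces to a
single matrix product; the exact scaling may differ between 1-step and
2-step estimation.
%
\begin{code}
# after: dpanel ... --two-step --keep-extra
matrix A = $model.wgtmat
matrix mZt = $model.GMMinst   # instruments, transposed
series vstar = $uhat          # transformed residuals
smpl ok(vstar) --restrict --replace
matrix Zv = mZt * {vstar}     # ~ sum over i of Z_i' vhat*_i
scalar S = qform(Zv', A)      # the Sargan statistic, up to scaling
\end{code}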
\section{Post-estimation statistics}
\label{sec:dpanel-post}

After estimation, the \dollar{model} accessor will return a bundle
containing several items that may be of interest. Most should be
self-explanatory; here is a partial list:

\begin{center}
\begin{tabular}{rp{0.6\textwidth}}
\hline
\textbf{Key} & \textbf{Content} \\
\hline
\texttt{AR1}, \texttt{AR2} & 1st and 2nd order autocorrelation test
  statistics \\
\texttt{sargan}, \texttt{sargan\_df} & Sargan test for
  overidentifying restrictions and corresponding degrees of freedom \\
\texttt{wald}, \texttt{wald\_df} & Wald test for overall significance
  and corresponding degrees of freedom \\
\texttt{GMMinst} & The matrix $\bZ$ of instruments (see equations
  (\ref{eq:dpd-dif}) and (\ref{eq:sys-gmm})) \\
\texttt{wgtmat} & The matrix $\bA$ of GMM weights (see equations
  (\ref{eq:dpd-dif}) and (\ref{eq:sys-gmm})) \\
\hline
\end{tabular}
\end{center}

Note, however, that \texttt{GMMinst} and \texttt{wgtmat} (which may be
quite large matrices) are not saved in the \dollar{model} bundle by
default; that requires use of the \option{keep-extra} option with the
\cmd{dpanel} command. The script in Table \ref{tab:dpanel-rep}
illustrates the use of these matrices to replicate via hansl commands
the calculation of the GMM estimator.
\begin{table}[htbp]
\begin{scode}
set verbose off
open abdata.gdt

# compose list of regressors
list X = w w(-1) k k(-1)
list Z = w k

dpanel 1 ; n X const ; GMM(Z,2,99) --two-step --dpd --keep-extra

### --- re-do by hand ----------------------------

# fetch Z and A from model
A = $model.wgtmat
mZt = $model.GMMinst  # note: transposed

# create data matrices
series valid = ok($uhat)
series ddep = diff(n)
series dldep = ddep(-1)
list dreg = diff(X)

smpl valid --dummy

matrix m_reg = {dldep} ~ {dreg} ~ 1
matrix m_dep = {ddep}

matrix uno = mZt * m_reg
matrix due = qform(uno', A)
matrix tre = (uno'A) * (mZt * m_dep)
matrix coef = due\tre

print coef
\end{scode}
\caption{Replication of the built-in \texttt{dpanel} command via hansl commands}
\label{tab:dpanel-rep}
\end{table}

\section{Memo: \texttt{dpanel} options}
\label{sec:options}

\begin{center}
\begin{tabular}{lp{.7\textwidth}}
\textit{flag} & \textit{effect} \\ [6pt]
\verb|--asymptotic| & Suppresses the use of robust standard errors \\
\verb|--two-step| & Calls for 2-step estimation (the default being 1-step) \\
\verb|--system| & Calls for GMM-SYS, with default treatment of the
  dependent variable, as in \texttt{GMMlevel(y,1,1)} \\
\verb|--time-dummies| & Includes period-specific dummy variables \\
\verb|--dpdstyle| & Computes the $H$ matrix as in DPD; also suppresses
  differencing of automatic time dummies and omission of the intercept
  in the GMM-DIF case \\
\verb|--verbose| & When \verb|--two-step| is selected, prints
  the 1-step estimates first \\
\verb|--vcv| & Calls for printing of the covariance matrix \\
\verb|--quiet| & Suppresses the printing of results \\
\verb|--keep-extra| & Saves additional matrices in the \dollar{model}
  bundle (see above) \\
\end{tabular}
\end{center}

The time dummies
option supports the qualifier \texttt{noprint}, as in

\verb| --time-dummies=noprint|

This means that although the dummies are included in the
specification, their coefficients, standard errors and so on are not
printed.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "gretl-guide"
%%% End: