"Fossies" - the Fresh Open Source Software Archive

Member "gretl-2019c/doc/tex/dpanel.tex" (19 Feb 2019, 39173 Bytes) of package /linux/misc/gretl-2019c.tar.xz:


As a special service "Fossies" has tried to format the requested source page into HTML format using (guessed) TeX and LaTeX source code syntax highlighting (style: standard) with prefixed line numbers. Alternatively you can here view or download the uninterpreted source code file.

\chapter{Dynamic panel models}
\label{chap:dpanel}

\newcommand{\by}{\boldsymbol{y}}
\newcommand{\bx}{\boldsymbol{x}}
\newcommand{\bv}{\boldsymbol{v}}
\newcommand{\bX}{\boldsymbol{X}}
\newcommand{\bW}{\boldsymbol{W}}
\newcommand{\bZ}{\boldsymbol{Z}}
\newcommand{\bA}{\boldsymbol{A}}
\newcommand{\biota}{\bm{\iota}}

\DefineVerbatimEnvironment%
{code}{Verbatim}
{fontsize=\small, xleftmargin=1em}

\newenvironment%
{altcode}%
{\vspace{1ex}\small\leftmargin 1em}{\vspace{1ex}}

The primary command for estimating dynamic panel models in gretl is
\texttt{dpanel}. The closely related \texttt{arbond} command predated
\texttt{dpanel}, and is still present, but whereas \texttt{arbond}
only supports the so-called ``difference'' estimator
\citep{arellano-bond91}, \texttt{dpanel} in addition offers the
``system'' estimator \citep{blundell-bond98}, which has become the
method of choice in the applied literature.

\section{Introduction}
\subsection{Notation}
\label{sec:notation}

A dynamic linear panel data model can be represented as follows
(in notation based on \cite{arellano03}):
\begin{equation}
  \label{eq:dpd-def}
  y_{it} = \alpha y_{i,t-1} + \beta'x_{it} + \eta_{i} + v_{it}
\end{equation}

The main idea behind the difference estimator is to sweep out the
individual effect via differencing.  First-differencing eq.\
(\ref{eq:dpd-def}) yields
\begin{equation}
  \label{eq:dpd-dif}
  \Delta y_{it} = \alpha \Delta y_{i,t-1} + \beta'\Delta x_{it} +
  \Delta v_{it} = \gamma' W_{it} + \Delta v_{it} ,
\end{equation}
in obvious notation. The error term of (\ref{eq:dpd-dif}) is, by
construction, autocorrelated and also correlated with the lagged
dependent variable, so an estimator that takes both issues into
account is needed. The endogeneity issue is solved by noting that all
values of $y_{i,t-k}$ with $k>1$ can be used as instruments for
$\Delta y_{i,t-1}$: unobserved values of $y_{i,t-k}$ (whether missing
or pre-sample) can safely be substituted with 0. In the language of
GMM, this amounts to using the relation
\begin{equation}
  \label{eq:OC-dif}
  E(\Delta v_{it} \cdot y_{i,t-k}) = 0, \quad k>1
\end{equation}
as an orthogonality condition.

Autocorrelation is dealt with by noting that if $v_{it}$ is white
noise, the covariance matrix of the vector whose typical element is
$\Delta v_{it}$ is proportional to a matrix $H$ that has 2 on the main
diagonal, $-1$ on the first subdiagonals and 0 elsewhere.  One-step
GMM estimation of equation (\ref{eq:dpd-dif}) amounts to computing
\begin{equation}
\label{eq:dif-gmm}
  \hat{\gamma} = \left[ 
    \left( \sum_{i=1}^N \bW_i'\bZ_i \right) \bA_N
    \left( \sum_{i=1}^N \bZ_i'\bW_i \right) \right]^{-1} 
    \left( \sum_{i=1}^N \bW_i'\bZ_i \right) \bA_N
    \left( \sum_{i=1}^N \bZ_i'\Delta \by_i \right)
\end{equation}
where
\begin{align*}
  \Delta \by_i  & =
     \left[ \begin{array}{ccc}
         \Delta y_{i,3} & \cdots & \Delta y_{i,T}
       \end{array} \right]' \\
  \bW_i  & = 
     \left[ \begin{array}{ccc}
         \Delta y_{i,2} & \cdots & \Delta y_{i,T-1} \\
         \Delta x_{i,3} & \cdots & \Delta x_{i,T} \\
       \end{array} \right]' \\
  \bZ_i  & = 
     \left[ \begin{array}{ccccccc}
         y_{i1} & 0 & 0 & \cdots & 0 & \Delta x_{i3}\\
         0 & y_{i1} & y_{i2} & \cdots & 0 & \Delta x_{i4}\\
         & & \vdots \\
         0 & 0 & 0 & \cdots & y_{i, T-2} & \Delta x_{iT} \\
       \end{array} \right]' \\
  \intertext{and}
  \bA_N & = \left( \sum_{i=1}^N \bZ_i' H \bZ_i \right)^{-1}
\end{align*}
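For concreteness, the matrix $H$ appearing in the definition of
$\bA_N$ has the band form described above; for instance, with four
usable differenced observations per unit it is
\[
H = \left[ \begin{array}{rrrr}
      2 & -1 & 0 & 0 \\
     -1 & 2 & -1 & 0 \\
      0 & -1 & 2 & -1 \\
      0 & 0 & -1 & 2
    \end{array} \right] .
\]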

Once the 1-step estimator is computed, the sample covariance matrix of
the estimated residuals can be used instead of $H$ to obtain 2-step
estimates, which are not only consistent but asymptotically
efficient. (In principle the process may be iterated, but nobody seems
to be interested.) Standard GMM theory applies, except for one thing:
\cite{Windmeijer05} has computed finite-sample corrections to the
asymptotic covariance matrix of the parameters, which are nowadays
almost universally used.

The difference estimator is consistent, but has been shown to have
poor properties in finite samples when $\alpha$ is near one. People
these days prefer the so-called ``system'' estimator, which
complements the differenced data (with lagged levels used as
instruments) with data in levels (using lagged differences as
instruments). The system estimator relies on an extra orthogonality
condition which has to do with the earliest value of the dependent
variable $y_{i,1}$. The interested reader is referred to \citet[pp.\
124--125]{blundell-bond98} for details, but here it suffices to say
that this condition is satisfied in mean-stationary models and brings
an improvement in efficiency that may be substantial in many cases.

The set of orthogonality conditions exploited in the system approach
is not very much larger than with the difference estimator since most
of the possible orthogonality conditions associated with the equations
in levels are redundant, given those already used for the equations in
differences.

The key equations of the system estimator can be written as

\begin{equation}
\label{eq:sys-gmm}
  \tilde{\gamma} = \left[ 
    \left( \sum_{i=1}^N \tilde{\bW}_i'\tilde{\bZ}_i \right) \bA_N
    \left( \sum_{i=1}^N \tilde{\bZ}_i'\tilde{\bW}_i \right) \right]^{-1} 
    \left( \sum_{i=1}^N \tilde{\bW}_i'\tilde{\bZ}_i \right) \bA_N
    \left( \sum_{i=1}^N \tilde{\bZ}_i'\Delta \tilde{\by}_i \right)
\end{equation}
where
\begin{align*}
  \Delta \tilde{\by}_i  & =
     \left[ \begin{array}{ccccccc}
         \Delta y_{i3} & \cdots & \Delta y_{iT} & y_{i3} & \cdots & y_{iT}
       \end{array} \right]' \\
  \tilde{\bW}_i  & =
     \left[ \begin{array}{cccccc}
         \Delta y_{i2} & \cdots & \Delta y_{i,T-1} & y_{i2} & \cdots & y_{i,T-1} \\
         \Delta x_{i3} & \cdots & \Delta x_{iT}  & x_{i3} & \cdots & x_{iT} \\
       \end{array} \right]' \\
  \tilde{\bZ}_i  & =
     \left[ \begin{array}{ccccccccc}
         y_{i1} & 0 & 0       & \cdots & 0  & 0  & \cdots & 0 & \Delta x_{i,3}\\
         0 & y_{i1} & y_{i2} & \cdots & 0  & 0  & \cdots & 0 & \Delta x_{i,4}\\
         & & \vdots \\
         0 & 0 & 0 & \cdots & y_{i, T-2} & 0  & \cdots & 0  & \Delta x_{iT}\\
         & & \vdots \\
         0 & 0 & 0 & \cdots & 0 & \Delta y_{i2} & \cdots & 0  & x_{i3}\\
         & & \vdots \\
         0 & 0 & 0 & \cdots & 0 & 0 & \cdots & \Delta y_{i,T-1}  & x_{iT}\\
       \end{array} \right]' \\
  \intertext{and}
  \bA_N & = \left( \sum_{i=1}^N \tilde{\bZ}_i' H^* \tilde{\bZ}_i \right)^{-1}
\end{align*}

In this case choosing a precise form for the matrix $H^*$ for the
first step is no trivial matter. Its north-west block should be as
similar as possible to the covariance matrix of the vector $\Delta
v_{it}$, so the same choice as for the ``difference'' estimator is
appropriate. Ideally, the south-east block should be proportional to
the covariance matrix of the vector $\biota \eta_i + \bv$, that is
$\sigma^2_{v} I + \sigma^2_{\eta} \biota \biota'$; but since
$\sigma^2_{\eta}$ is unknown and any positive definite matrix renders
the estimator consistent, people just use $I$. The off-diagonal blocks
should, in principle, contain the covariances between $\Delta v_{is}$
and $v_{it}$, which would be an identity matrix if $v_{it}$ is white
noise. However, since the south-east block is typically given a
conventional value anyway, the benefit of making this choice is not
obvious. Some packages use $I$; others use a zero matrix.
Asymptotically, it should not matter, but on real datasets the
difference between the resulting estimates can be noticeable.
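Schematically, the choices just described amount to a first-step
matrix of the partitioned form
\[
H^* = \left[ \begin{array}{cc}
        H & C \\
        C' & I
      \end{array} \right] ,
\]
where $H$ is the band matrix used by the difference estimator and the
off-diagonal block $C$ is set to $I$ by some packages and to a zero
matrix by others.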

\subsection{Rank deficiency}
\label{sec:rankdef}

Both the difference estimator (\ref{eq:dif-gmm}) and the system
estimator (\ref{eq:sys-gmm}) depend for their existence on the
invertibility of $\bA_N$. This matrix may turn out to be singular for
several reasons. However, this does not mean that the estimator is not
computable: in some cases, adjustments are possible such that the
estimator does exist, but the user should be aware that in these cases
not all software packages use the same strategy and replication of
results may prove difficult or even impossible.

A first reason why $\bA_N$ may be singular is the unavailability of
instruments, chiefly because of missing observations. This case is
easy to handle. If a particular row of $\tilde{\bZ}_i$ is zero for all
units, the corresponding orthogonality condition (or the corresponding
instrument, if you prefer) is automatically dropped; of course, the
number of overidentifying restrictions is adjusted for testing
purposes.

Even if no instruments are zero, however, $\bA_N$ could be rank
deficient. A trivial case occurs if there are collinear instruments,
but a less trivial case may arise when $T$ (the total number of time
periods available) is not much smaller than $N$ (the number of units),
as, for example, in some macro datasets where the units are
countries. The total number of potentially usable orthogonality
conditions is $O(T^2)$, which may well exceed $N$ in some cases. Of
course, $\bA_N$ is the sum of $N$ matrices, each of which has rank at
most $2T-3$, so it may well happen that the sum is singular.

In all these cases, what we consider the ``proper'' way to go is to
substitute the pseudo-inverse of $\bA_N$ (Moore--Penrose) for its
regular inverse. Again, our choice is shared by some software
packages, but not all, so replication may be hard.
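
As an aside, the distinction is easy to illustrate in hansl's matrix
language: gretl's \texttt{ginv()} function returns the Moore--Penrose
pseudo-inverse, which coincides with the ordinary inverse when its
argument has full rank but remains well defined otherwise. The
fragment below is purely illustrative and is not a transcript of what
\texttt{dpanel} does internally.
\begin{code}
  # illustrative only: a singular cross-product of the kind that can
  # arise when accumulating A_N from collinear instruments
  matrix Z = mnormal(8, 3)
  Z[,3] = Z[,1] + Z[,2]    # force an exact collinearity
  matrix M = Z'Z           # singular by construction
  matrix Mp = ginv(M)      # Moore-Penrose pseudo-inverse, always defined
  print Mp                 # inv(M) would fail here
\end{code}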


\subsection{Treatment of missing values}

Textbooks seldom bother with missing values, but in some cases their
treatment may be far from obvious. This is especially true if missing
values are interspersed between valid observations. For example,
consider the plain difference estimator with one lag, so
\[
y_t = \alpha y_{t-1} + \eta + \epsilon_t
\]
where the $i$ index is omitted for clarity. Suppose you have an
individual with $t=1\ldots5$, for which $y_3$ is missing. It may seem
that the data for this individual are unusable, because
differencing $y_t$ would produce something like
\[
\begin{array}{c|ccccc}
  t & 1 & 2 & 3 & 4 & 5 \\
  \hline
  y_t & * & * & \circ & * & * \\
  \Delta y_t & \circ & * & \circ & \circ & *
\end{array}
\]
where $*$ = nonmissing and $\circ$ = missing. Estimation seems to be
infeasible, since there are no periods in which $\Delta y_t$ and
$\Delta y_{t-1}$ are both observable.

However, we can use a $k$-difference operator and get
\[
\Delta_k y_t = \alpha \Delta_k y_{t-1} + \Delta_k \epsilon_t
\]
where $\Delta_k = 1 - L^k$ and past levels of $y_t$ are perfectly
valid instruments. In this example, we can choose $k=3$ and use $y_1$
as an instrument, so this unit is in fact perfectly usable.
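Spelling this out, with $k=3$ the equation for $t=5$ involves only
observable quantities,
\[
\Delta_3 y_5 = \alpha \, \Delta_3 y_4 + \Delta_3 \epsilon_5 ,
\]
that is, $(y_5 - y_2) = \alpha (y_4 - y_1) + (\epsilon_5 -
\epsilon_2)$, and $y_1$ satisfies the orthogonality condition
$E\left[(\epsilon_5 - \epsilon_2)\, y_1\right] = 0$ under the usual
assumptions.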

Not all software packages seem to be aware of this possibility, so
replicating published results may prove tricky if your dataset
contains individuals with gaps between valid observations.

\section{Usage}

One of the concepts underlying the syntax of \texttt{dpanel} is that
you get default values for several choices you may want to make, so
that in a ``standard'' situation the command is very concise.  The
simplest case of the model (\ref{eq:dpd-def}) is a plain AR(1)
process:
\begin{equation}
\label{eq:dp1}
  y_{i,t} = \alpha y_{i,t-1} + \eta_{i} + v_{it} .
\end{equation}
If you give the command
\begin{code}
  dpanel 1 ; y
\end{code}
gretl assumes that you want to estimate (\ref{eq:dp1}) via the
difference estimator (\ref{eq:dif-gmm}), using as many orthogonality
conditions as possible.  The scalar \texttt{1} between \texttt{dpanel}
and the semicolon indicates that only one lag of \texttt{y} is
included as an explanatory variable; using \texttt{2} would give an
AR(2) model. The syntax that gretl uses for the non-seasonal AR and MA
lags in an ARMA model is also supported in this context.\footnote{This
  represents an enhancement over the \texttt{arbond} command.} For
example, if you want the first and third lags of \texttt{y} (but not
the second) included as explanatory variables you can say
\begin{code}
  dpanel {1 3} ; y
\end{code}
or you can use a pre-defined matrix for this purpose:
\begin{code}
  matrix ylags = {1, 3}
  dpanel ylags ; y
\end{code}
To use a single lag of \texttt{y} other than the first you need to
employ this mechanism:
\begin{code}
  dpanel {3} ; y # only lag 3 is included
  dpanel 3 ; y   # compare: lags 1, 2 and 3 are used
\end{code}

To use the system estimator instead, you add the \verb|--system|
option, as in
\begin{code}
  dpanel 1 ; y --system
\end{code}
The level orthogonality conditions and the corresponding instruments
are appended automatically (see eq.\ \ref{eq:sys-gmm}).

\subsection{Regressors}

If we want to introduce additional regressors, we list them after the
dependent variable in the same way as other gretl commands, such as
\texttt{ols}.

For the difference orthogonality relations, \texttt{dpanel} takes care
of transforming the regressors in parallel with the dependent
variable. Note that this differs from gretl's \texttt{arbond} command,
where only the dependent variable is differenced automatically; it
brings us more in line with other software.

One case of potential ambiguity is when an intercept is specified but
the difference-only estimator is selected, as in
\begin{code}
  dpanel 1 ; y const
\end{code}
In this case the default \texttt{dpanel} behavior, which agrees with
Stata's \texttt{xtabond2}, is to drop the constant (since differencing
reduces it to nothing but zeros). However, for compatibility with the
DPD package for Ox, you can give the option \verb|--dpdstyle|, in
which case the constant is retained (equivalent to including a linear
trend in equation~\ref{eq:dpd-def}).  A similar point applies to the
period-specific dummy variables which can be added in \texttt{dpanel}
via the \verb|--time-dummies| option: in the differences-only case
these dummies are entered in differenced form by default, but when the
\verb|--dpdstyle| switch is applied they are entered in levels.
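To summarize, the two behaviors just described for the GMM-DIF case
can be compared directly (a schematic illustration, assuming a
suitable panel dataset is open):
\begin{code}
  # constant dropped, time dummies entered as differences
  dpanel 1 ; y const --time-dummies
  # constant retained, time dummies entered in levels (DPD-compatible)
  dpanel 1 ; y const --time-dummies --dpdstyle
\end{code}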

The standard gretl syntax applies if you want to use lagged
explanatory variables, so for example the command
\begin{code}
  dpanel 1 ; y const x(0 to -1) --system
\end{code}
would result in estimation of the model
\[
  y_{it} = \alpha y_{i,t-1} + 
  \beta_0 + \beta_1 x_{it} + \beta_2 x_{i,t-1} +
  \eta_{i} + v_{it} .
\]


\subsection{Instruments}

The default rules for instruments are: 
\begin{itemize}
\item lags of the dependent variable are instrumented using all
  available orthogonality conditions; and
\item additional regressors are considered exogenous, so they are used
  as their own instruments.
\end{itemize}

If a different policy is wanted, the instruments should be specified
in an additional list, separated from the regressors list by a
semicolon. The syntax closely mirrors that for the \texttt{tsls}
command, but in this context it is necessary to distinguish between
``regular'' instruments and what are often called ``GMM-style''
instruments (that is, instruments that are handled in the same
block-diagonal manner as lags of the dependent variable, as described
above).

``Regular'' instruments are transformed in the same way as
regressors, and the contemporaneous value of the transformed variable
is used to form an orthogonality condition. Since regressors are
treated as exogenous by default, it follows that these two commands
estimate the same model:

\begin{code}
  dpanel 1 ; y z
  dpanel 1 ; y z ; z
\end{code}
The instrument specification in the second case simply confirms what
is implicit in the first: that \texttt{z} is exogenous. Note, though,
that if you have some additional variable \texttt{z2} which you want
to add as a regular instrument, it then becomes necessary to
include \texttt{z} in the instrument list if it is to be treated
as exogenous:
\begin{code}
  dpanel 1 ; y z ; z2   # z is now implicitly endogenous
  dpanel 1 ; y z ; z z2 # z is treated as exogenous
\end{code}

The specification of ``GMM-style'' instruments is handled by the
special constructs \texttt{GMM()} and \texttt{GMMlevel()}.  The first
of these relates to instruments for the equations in differences, and
the second to the equations in levels. The syntax for \texttt{GMM()}
is

\begin{altcode}
\texttt{GMM(}\textsl{name}\texttt{,} \textsl{minlag}\texttt{,} 
\textsl{maxlag}\texttt{)}
\end{altcode}

\noindent
where \textsl{name} is replaced by the name of a series (or the name
of a list of series), and \textsl{minlag} and \textsl{maxlag} are
replaced by the minimum and maximum lags to be used as
instruments. The same goes for \texttt{GMMlevel()}.

One common use of \texttt{GMM()} is to limit the number of lagged
levels of the dependent variable used as instruments for the equations
in differences. It's well known that although exploiting all possible
orthogonality conditions yields maximal asymptotic efficiency, in
finite samples it may be preferable to use a smaller subset (but see
also \cite{OkuiJoE2009}).  For example, the specification

\begin{code}
  dpanel 1 ; y ; GMM(y, 2, 4)
\end{code}
ensures that no lags of $y_t$ earlier than $t-4$ will be used as
instruments.
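In terms of the orthogonality conditions (\ref{eq:OC-dif}), this
specification retains only
\[
  E(\Delta v_{it} \cdot y_{i,t-k}) = 0, \quad k = 2, 3, 4
\]
rather than the full set with $k > 1$.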

A second use of \texttt{GMM()} is to exploit more fully the potential
block-diagonal orthogonality conditions offered by an exogenous
regressor, or a related variable that does not appear as a regressor.
For example, in

\begin{code}
  dpanel 1 ; y x ; GMM(z, 2, 6)
\end{code}
the variable \texttt{x} is considered an endogenous regressor, and up to
5 lags of \texttt{z} are used as instruments.

Note that in the following script fragment
\begin{code}
  dpanel 1 ; y z
  dpanel 1 ; y z ; GMM(z,0,0)
\end{code}
the two estimation commands should not be expected to give the same
result, as the sets of orthogonality relationships are subtly
different.  In the latter case, you have $T-2$ separate orthogonality
relationships pertaining to $z_{it}$, none of which has any
implication for the other ones; in the former case, you only have one.
In terms of the $\bZ_i$ matrix, the first form adds a single row to
the bottom of the instruments matrix, while the second form adds a
diagonal block with $T-2$ columns; that is,
\[
  \left[ \begin{array}{cccc}
         z_{i3} & z_{i4} & \cdots & z_{iT}
       \end{array} \right]
\]
versus
\[
  \left[ \begin{array}{cccc}
         z_{i3} & 0 & \cdots & 0 \\
         0 & z_{i4} & \cdots & 0 \\
          & \ddots & \ddots &  \\
         0 & 0 & \cdots & z_{iT} 
       \end{array} \right]
\]

\section{Replication of DPD results}
\label{sec:DPD-replic}

In this section we show how to replicate the results of some of the
pioneering work with dynamic panel-data estimators by Arellano, Bond
and Blundell.  As the DPD manual \citep*{DPDmanual} explains, it is
difficult to replicate the original published results exactly, for two
main reasons: not all of the data used in those studies are publicly
available; and some of the choices made in the original software
implementation of the estimators have been superseded.  Here,
therefore, our focus is on replicating the results obtained using the
current DPD package and reported in the DPD manual.

The examples are based on the program files \texttt{abest1.ox},
\texttt{abest3.ox} and \texttt{bbest1.ox}. These are included in the
DPD package, along with the Arellano--Bond database files
\texttt{abdata.bn7} and \texttt{abdata.in7}.\footnote{See
  \url{http://www.doornik.com/download.html}.} The
Arellano--Bond data are also provided with gretl, in the file
\texttt{abdata.gdt}. In the following we do not show the output from
DPD or gretl; it is somewhat voluminous, and is easily generated by
the user. As of this writing the results from Ox/DPD and gretl are
identical in all relevant respects for all of the examples
shown.\footnote{To be specific, this is using Ox Console version 5.10,
  version 1.24 of the DPD package, and gretl built from CVS as of
  2010-10-23, all on Linux.}

A complete Ox/DPD program to generate the results of interest takes
this general form:

\begin{code}
#include <oxstd.h>
#import <packages/dpd/dpd>

main()
{
    decl dpd = new DPD();

    dpd.Load("abdata.in7");
    dpd.SetYear("YEAR");

    // model-specific code here

    delete dpd;
}
\end{code}
%
In the examples below we take this template for granted and show just
the model-specific code.

\subsection{Example 1}

The following Ox/DPD code---drawn from \texttt{abest1.ox}---replicates
column (b) of Table 4 in \cite{arellano-bond91}, an instance of the
differences-only or GMM-DIF estimator. The dependent variable is the
log of employment, \texttt{n}; the regressors include two lags of the
dependent variable, current and lagged values of the log real-product
wage, \texttt{w}, the current value of the log of gross capital,
\texttt{k}, and current and lagged values of the log of industry
output, \texttt{ys}. In addition the specification includes a constant
and five year dummies; unlike the stochastic regressors, these
deterministic terms are not differenced. In this specification the
regressors \texttt{w}, \texttt{k} and \texttt{ys} are treated as
exogenous and serve as their own instruments. In DPD syntax this
requires entering these variables twice, on the \verb|X_VAR| and
\verb|I_VAR| lines. The GMM-type (block-diagonal) instruments in this
example are the second and subsequent lags of the level of \texttt{n}.
Both 1-step and 2-step estimates are computed.

\begin{code}
dpd.SetOptions(FALSE); // don't use robust standard errors
dpd.Select(Y_VAR, {"n", 0, 2});
dpd.Select(X_VAR, {"w", 0, 1, "k", 0, 0, "ys", 0, 1});
dpd.Select(I_VAR, {"w", 0, 1, "k", 0, 0, "ys", 0, 1});

dpd.Gmm("n", 2, 99);
dpd.SetDummies(D_CONSTANT + D_TIME);

print("\n\n***** Arellano & Bond (1991), Table 4 (b)");
dpd.SetMethod(M_1STEP);
dpd.Estimate();
dpd.SetMethod(M_2STEP);
dpd.Estimate();
\end{code}

Here is gretl code to do the same job:

\begin{code}
open abdata.gdt
list X = w w(-1) k ys ys(-1)
dpanel 2 ; n X const --time-dummies --asy --dpdstyle
dpanel 2 ; n X const --time-dummies --asy --two-step --dpdstyle
\end{code}

Note that in gretl the switch to suppress robust standard errors is
\verb|--asymptotic|, here abbreviated to \verb|--asy|.\footnote{Option
  flags in gretl can always be truncated, down to the minimal unique
  abbreviation.} The \verb|--dpdstyle| flag specifies that the
constant and dummies should not be differenced, in the context of a
GMM-DIF model. With gretl's \texttt{dpanel} command it is not
necessary to specify the exogenous regressors as their own instruments
since this is the default; similarly, the use of the second and all
longer lags of the dependent variable as GMM-type instruments is the
default and need not be stated explicitly.

\subsection{Example 2}

The DPD file \texttt{abest3.ox} contains a variant of the above that
differs with regard to the choice of instruments: the variables
\texttt{w} and \texttt{k} are now treated as predetermined, and are
instrumented GMM-style using the second and third lags of their
levels. This approximates column (c) of Table 4 in
\cite{arellano-bond91}.  We have modified the code in
\texttt{abest3.ox} slightly to allow the use of robust
(Windmeijer-corrected) standard errors, which are the default in both
DPD and gretl with 2-step estimation:

\begin{code}
dpd.Select(Y_VAR, {"n", 0, 2});
dpd.Select(X_VAR, {"w", 0, 1, "k", 0, 0, "ys", 0, 1});
dpd.Select(I_VAR, {"ys", 0, 1});
dpd.SetDummies(D_CONSTANT + D_TIME);

dpd.Gmm("n", 2, 99);
dpd.Gmm("w", 2, 3);
dpd.Gmm("k", 2, 3);

print("\n***** Arellano & Bond (1991), Table 4 (c)\n");
print("        (but using different instruments!!)\n");
dpd.SetMethod(M_2STEP);
dpd.Estimate();
\end{code}

The gretl code is as follows:

\begin{code}
open abdata.gdt
list X = w w(-1) k ys ys(-1)
list Ivars = ys ys(-1)
dpanel 2 ; n X const ; GMM(w,2,3) GMM(k,2,3) Ivars --time --two-step --dpd
\end{code}
%
Note that since we are now calling for an instrument set other than
the default (following the second semicolon), it is necessary to
include the \texttt{Ivars} specification for the variable \texttt{ys}.
However, it is not necessary to specify \texttt{GMM(n,2,99)} since
this remains the default treatment of the dependent variable.

\subsection{Example 3}

Our third example replicates the DPD output from \texttt{bbest1.ox}:
this uses the same dataset as the previous examples but the model
specifications are based on \cite{blundell-bond98}, and involve
comparison of the GMM-DIF and GMM-SYS (``system'') estimators. The
basic specification is slightly simplified in that the variable
\texttt{ys} is not used and only one lag of the dependent variable
appears as a regressor. The Ox/DPD code is:

\begin{code}
dpd.Select(Y_VAR, {"n", 0, 1});
dpd.Select(X_VAR, {"w", 0, 1, "k", 0, 1});
dpd.SetDummies(D_CONSTANT + D_TIME);

print("\n\n***** Blundell & Bond (1998), Table 4: 1976-86 GMM-DIF");
dpd.Gmm("n", 2, 99);
dpd.Gmm("w", 2, 99);
dpd.Gmm("k", 2, 99);
dpd.SetMethod(M_2STEP);
dpd.Estimate();

print("\n\n***** Blundell & Bond (1998), Table 4: 1976-86 GMM-SYS");
dpd.GmmLevel("n", 1, 1);
dpd.GmmLevel("w", 1, 1);
dpd.GmmLevel("k", 1, 1);
dpd.SetMethod(M_2STEP);
dpd.Estimate();
\end{code}

Here is the corresponding gretl code:

\begin{code}
open abdata.gdt
list X = w w(-1) k k(-1)
list Z = w k

# Blundell & Bond (1998), Table 4: 1976-86 GMM-DIF
dpanel 1 ; n X const ; GMM(Z,2,99) --time --two-step --dpd

# Blundell & Bond (1998), Table 4: 1976-86 GMM-SYS
dpanel 1 ; n X const ; GMM(Z,2,99) GMMlevel(Z,1,1) \
 --time --two-step --dpd --system
\end{code}

Note the use of the \verb|--system| option flag to specify GMM-SYS,
including the default treatment of the dependent variable, which
corresponds to \texttt{GMMlevel(n,1,1)}. In this case we also want to
use lagged differences of the regressors \texttt{w} and \texttt{k} as
instruments for the levels equations so we need explicit
\texttt{GMMlevel} entries for those variables. If you want something
other than the default treatment for the dependent variable as an
instrument for the levels equations, you should give an explicit
\texttt{GMMlevel} specification for that variable---and in that case
the \verb|--system| flag is redundant (but harmless).

For the sake of completeness, note that if you specify at least one
\texttt{GMMlevel} term, \texttt{dpanel} will then include equations in
levels, but it will not automatically add a default \texttt{GMMlevel}
specification for the dependent variable unless the \verb|--system|
option is given.
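To make the last point concrete, the following two specifications
(reusing the lists \texttt{X} and \texttt{Z} defined above) differ
only in whether the default \texttt{GMMlevel(n,1,1)} treatment of the
dependent variable is added:
\begin{code}
  # levels equations included, but no automatic GMMlevel term for n
  dpanel 1 ; n X const ; GMM(Z,2,99) GMMlevel(Z,1,1)
  # with --system, the default GMMlevel(n,1,1) is added as well
  dpanel 1 ; n X const ; GMM(Z,2,99) GMMlevel(Z,1,1) --system
\end{code}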

\section{Cross-country growth example}
\label{sec:dpanel-growth}

The previous examples all used the Arellano--Bond dataset; for this
example we use the dataset \texttt{CEL.gdt}, which is also included in
the gretl distribution. As with the Arellano--Bond data, there are
numerous missing values.  Details of the provenance of the data can be
found by opening the dataset information window in the gretl GUI
(\textsf{Data} menu, \textsf{Dataset info} item). This is a subset of
the Barro--Lee 138-country panel dataset, an approximation to which is
used in \citet*{CEL96} and \citet*{Bond2001}.\footnote{We say an
  ``approximation'' because we have not been able to replicate exactly
  the OLS results reported in the papers cited, though it seems from
  the description of the data in \cite{CEL96} that we ought to be able
  to do so.  We note that \cite{Bond2001} used data provided by
  Professor Caselli yet did not manage to reproduce the latter's
  results.}  Both of these papers explore the dynamic panel-data
approach in relation to the issues of growth and convergence of per
capita income across countries.

The dependent variable is growth in real GDP per capita over
successive five-year periods; the regressors are the log of the
initial (five years prior) value of GDP per capita, the log-ratio of
investment to GDP, $s$, in the prior five years, and the log of annual
average population growth, $n$, over the prior five years plus 0.05 as
a stand-in for the rate of technical progress, $g$, plus the rate of
depreciation, $\delta$ (with the last two terms assumed to be constant
across both countries and periods).  The original model is
\begin{equation}
\label{eq:CEL96}
\Delta_5 y_{it} = \beta y_{i,t-5} + \alpha s_{it} + \gamma (n_{it} +
g + \delta) + \nu_t + \eta_i + \epsilon_{it}
\end{equation}
which allows for a time-specific disturbance $\nu_t$. The Solow model
with a Cobb--Douglas production function implies that $\gamma =
-\alpha$, but this assumption is not imposed in estimation. The
time-specific disturbance is eliminated by subtracting the period mean
from each of the series.

Equation (\ref{eq:CEL96}) can be transformed to an AR(1) dynamic
panel-data model by adding $y_{i,t-5}$ to both sides, which gives
\begin{equation}
\label{eq:CEL96a}
y_{it} = (1 + \beta) y_{i,t-5} + \alpha s_{it} + \gamma (n_{it} +
g + \delta) + \eta_i + \epsilon_{it}
\end{equation}
where all variables are now assumed to be time-demeaned.

In (rough) replication of \cite{Bond2001} we now proceed to estimate
the following two models: (a) equation (\ref{eq:CEL96a}) via GMM-DIF,
using as instruments the second and all longer lags of $y_{it}$,
$s_{it}$ and $n_{it} + g + \delta$; and (b) equation
(\ref{eq:CEL96a}) via GMM-SYS, using $\Delta y_{i,t-1}$, $\Delta
s_{i,t-1}$ and $\Delta (n_{i,t-1} + g + \delta)$ as additional
instruments in the levels equations. We report robust standard errors
throughout. (As a purely notational matter, we now use ``$t-1$'' to
refer to values five years prior to $t$, as in \cite{Bond2001}).

The gretl script to do this job is shown below. Note that the final
transformed versions of the variables (logs, with time-means
subtracted) are named \texttt{ly} ($y_{it}$), \texttt{linv} ($s_{it}$)
and \texttt{lngd} ($n_{it} + g + \delta$).
%
\begin{code}
open CEL.gdt

ngd = n + 0.05
ly = log(y)
linv = log(s)
lngd = log(ngd)

# take out time means
loop i=1..8 --quiet
  smpl (time == i) --restrict --replace
  ly -= mean(ly)
  linv -= mean(linv)
  lngd -= mean(lngd)
endloop

smpl --full
list X = linv lngd
# 1-step GMM-DIF
dpanel 1 ; ly X ; GMM(X,2,99)
# 2-step GMM-DIF
dpanel 1 ; ly X ; GMM(X,2,99) --two-step
# GMM-SYS
dpanel 1 ; ly X ; GMM(X,2,99) GMMlevel(X,1,1) --two-step --sys
\end{code}

For comparison we estimated the same two models using Ox/DPD and the
Stata command \texttt{xtabond2}. (In each case we constructed a
comma-separated values dataset containing the data as transformed in
the gretl script shown above, using a missing-value code appropriate
to the target program.) For reference, the commands used with
Stata are reproduced below:
%
\begin{code}
insheet using CEL.csv
tsset unit time
xtabond2 ly L.ly linv lngd, gmm(L.ly, lag(1 99)) gmm(linv, lag(2 99)) 
  gmm(lngd, lag(2 99)) rob nolev
xtabond2 ly L.ly linv lngd, gmm(L.ly, lag(1 99)) gmm(linv, lag(2 99)) 
  gmm(lngd, lag(2 99)) rob nolev twostep
xtabond2 ly L.ly linv lngd, gmm(L.ly, lag(1 99)) gmm(linv, lag(2 99)) 
  gmm(lngd, lag(2 99)) rob nocons twostep
\end{code}

For the GMM-DIF model all three programs find 382 usable observations
and 30 instruments, and yield identical parameter estimates and
robust standard errors (up to the number of digits printed, or more);
see Table~\ref{tab:growth-DIF}.\footnote{The coefficient shown for
  \texttt{ly(-1)} in the Tables is that reported directly by the
  software; for comparability with the original model (eq.\
  \ref{eq:CEL96}) it is necessary to subtract 1, which produces the
  expected negative value indicating conditional convergence in per
  capita income.}
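In hansl terms that adjustment is a one-liner; the following fragment
is a sketch which assumes that a \texttt{dpanel} regression with
\texttt{ly(-1)} as its first coefficient has just been estimated:
\begin{code}
  # implied convergence coefficient: beta = (1 + beta) - 1
  scalar beta_hat = $coeff[1] - 1
  printf "implied beta = %g\n", beta_hat
\end{code}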

\begin{table}[htbp]
\begin{center}
\begin{tabular}{lrrrr}
& \multicolumn{2}{c}{1-step} & \multicolumn{2}{c}{2-step} \\
& \multicolumn{1}{c}{coeff} & \multicolumn{1}{c}{std.\ error} &
  \multicolumn{1}{c}{coeff} & \multicolumn{1}{c}{std.\ error} \\
\texttt{ly(-1)} & 0.577564 & 0.1292 & 0.610056 & 0.1562 \\
\texttt{linv} & 0.0565469 & 0.07082 & 0.100952 & 0.07772 \\
\texttt{lngd} & $-$0.143950 & 0.2753 & $-$0.310041 & 0.2980 \\
\end{tabular}
\caption{GMM-DIF: Barro--Lee data}
\label{tab:growth-DIF}
\end{center}
\end{table}

Results for GMM-SYS estimation are shown in
Table~\ref{tab:growth-SYS}. In this case we show two sets of gretl
results: those labeled ``gretl(1)'' were obtained using gretl's
\verb|--dpdstyle| option, while those labeled ``gretl(2)'' did not use
that option---the intent being to reproduce the $H$ matrices used by
Ox/DPD and \texttt{xtabond2} respectively.

\begin{table}[htbp]
\begin{center}
\begin{tabular}{lrrrr}
& \multicolumn{1}{c}{gretl(1)} & 
  \multicolumn{1}{c}{Ox/DPD} &
  \multicolumn{1}{c}{gretl(2)} & 
  \multicolumn{1}{c}{xtabond2} \\
\texttt{ly(-1)} & 0.9237 (0.0385) & 
  0.9167 (0.0373) & 
    0.9073 (0.0370) &
      0.9073 (0.0370) \\
\texttt{linv} & 0.1592 (0.0449) & 
  0.1636 (0.0441) & 
    0.1856 (0.0411) &
      0.1856 (0.0411) \\
\texttt{lngd} & $-$0.2370 (0.1485) & 
  $-$0.2178 (0.1433) & 
    $-$0.2355 (0.1501) &
      $-$0.2355 (0.1501) 
\end{tabular}
\caption{2-step GMM-SYS: Barro--Lee data (standard errors in parentheses)}
\label{tab:growth-SYS}
\end{center}
\end{table}

In this case all three programs use 479 observations; gretl and
\texttt{xtabond2} use 41 instruments and produce the same estimates
(when using the same $H$ matrix) while Ox/DPD nominally uses
66.\footnote{This is a case of the issue described in
  section~\ref{sec:rankdef}: the full $\bA_N$ matrix turns out to be
  singular and special measures must be taken to produce estimates.}
It is noteworthy that with GMM-SYS plus ``messy'' missing
observations, the results depend on the precise array of instruments
used, which in turn depends on the details of the implementation of
the estimator.

\section{Auxiliary test statistics}

We have concentrated above on the parameter estimates and standard
errors. It may be worth adding a few words on the additional test
statistics that typically accompany both GMM-DIF and GMM-SYS
estimation. These include the Sargan test for overidentification, one
or more Wald tests for the joint significance of the regressors (and time
dummies, if applicable) and tests for first- and second-order
autocorrelation of the residuals from the equations in differences.

As in Ox/DPD, the Sargan test statistic reported by gretl is
\[
  S = \left(\sum_{i=1}^N \hat{\bv}^{*\prime}_i \bZ_i\right)  
   \bA_N \left(\sum_{i=1}^N \bZ_i' \hat{\bv}^*_i\right)
\]
where the $\hat{\bv}^*_i$ are the transformed (e.g.\ differenced)
residuals for unit $i$.  Under the null hypothesis that the
instruments are valid, $S$ is asymptotically distributed as chi-square
with degrees of freedom equal to the number of overidentifying
restrictions.
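
Given the matrices saved via the \option{keep-extra} option (see
section~\ref{sec:dpanel-post} below), this quantity can be assembled
by hand in the spirit of the replication script shown in
Table~\ref{tab:dpanel-rep}. The sketch below assumes that
\dollar{uhat} holds the transformed residuals and that \texttt{wgtmat}
is the weight matrix used in forming the reported statistic; it is
illustrative only and is not guaranteed to match gretl's internal
computation in every detail.
\begin{code}
  open abdata.gdt
  list X = w w(-1) k k(-1)
  list Z = w k
  dpanel 1 ; n X const ; GMM(Z,2,99) --two-step --dpd --keep-extra

  matrix A = $model.wgtmat     # GMM weight matrix
  matrix mZt = $model.GMMinst  # instruments, transposed
  series u = $uhat             # transformed residuals
  series valid = ok(u)
  smpl valid --dummy

  matrix e = mZt * {u}         # sum over units of Z_i' v*_i
  scalar S = qform(e', A)      # hand-rolled Sargan statistic
  printf "S = %g (built-in: %g)\n", S, $model.sargan
\end{code}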

In general we see a good level of agreement between gretl, DPD and
\texttt{xtabond2} with regard to these statistics, with a few
relatively minor exceptions. Specifically, \texttt{xtabond2} computes
both a ``Sargan test'' and a ``Hansen test'' for overidentification,
but what it calls the Hansen test is, apparently, what DPD calls the
Sargan test. (We have had difficulty determining from the
\texttt{xtabond2} documentation \citep{Roodman2006} exactly how its
Sargan test is computed.) In addition there are cases where the
degrees of freedom for the Sargan test differ between DPD and gretl;
this occurs when the $\bA_N$ matrix is singular
(section~\ref{sec:rankdef}). In concept the df equals the number of
instruments minus the number of parameters estimated; for the first of
these terms gretl uses the rank of $\bA_N$, while DPD appears to use
the full dimension of this matrix.
\section{Post-estimation statistics}
\label{sec:dpanel-post}

After estimation, the \dollar{model} accessor will return a bundle
containing several items that may be of interest: most should be
self-explanatory, but here's a partial list:

\begin{center}
\begin{tabular}{rp{0.6\textwidth}}
  \hline
  \textbf{Key} & \textbf{Content} \\
  \hline
  \texttt{AR1}, \texttt{AR2} & 1st and 2nd order autocorrelation test
                               statistics \\
  \texttt{sargan}, \texttt{sargan\_df} & Sargan test for
                                         overidentifying restrictions
                                         and corresponding degrees of freedom \\
  \texttt{wald}, \texttt{wald\_df} & Wald test for
                                     overall significance
                                     and corresponding degrees of
                                     freedom \\
  \texttt{GMMinst} & The matrix $\bZ$ of instruments (see equations
                     (\ref{eq:dif-gmm}) and (\ref{eq:sys-gmm})) \\
  \texttt{wgtmat} & The matrix $\bA$ of GMM weights (see equations
                    (\ref{eq:dif-gmm}) and (\ref{eq:sys-gmm})) \\
  \hline
\end{tabular}
\end{center}
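
For instance, immediately after a \texttt{dpanel} estimation the
scalar members can be retrieved directly; a minimal sketch (any
specification would do):
\begin{code}
  open abdata.gdt
  dpanel 1 ; n      # any dpanel specification
  eval $model.AR1
  eval $model.AR2
  printf "Sargan: %g (df %g)\n", $model.sargan, $model.sargan_df
\end{code}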

Note, however, that \texttt{GMMinst} and \texttt{wgtmat} (which may be
quite large matrices) are not saved in the \dollar{model} bundle by
default; that requires use of the \option{keep-extra} option with the
\cmd{dpanel} command. The script in Table \ref{tab:dpanel-rep}
illustrates use of these matrices to replicate via hansl commands the
calculation of the GMM estimator.

\begin{table}[htbp]
  \begin{scode}
set verbose off
open abdata.gdt

# compose list of regressors
list X = w w(-1) k k(-1)
list Z = w k

dpanel 1 ; n X const ; GMM(Z,2,99) --two-step --dpd --keep-extra

### --- re-do by hand ----------------------------

# fetch Z and A from model
A = $model.wgtmat
mZt = $model.GMMinst # note: transposed

# create data matrices
series valid = ok($uhat)
series ddep = diff(n)
series dldep = ddep(-1)
list dreg = diff(X)

smpl valid --dummy

matrix m_reg = {dldep} ~ {dreg} ~ 1
matrix m_dep = {ddep}

matrix uno = mZt * m_reg
matrix due = qform(uno', A)
matrix tre = (uno'A) * (mZt * m_dep)
matrix coef = due\tre

print coef
\end{scode}
  \caption{Replication of the built-in command via hansl commands}
  \label{tab:dpanel-rep}
\end{table}

\section{Memo: \texttt{dpanel} options}
\label{sec:options}

\begin{center}
\begin{tabular}{lp{.7\textwidth}}
  \textit{flag} & \textit{effect} \\ [6pt]
  \verb|--asymptotic| & Suppresses the use of robust standard errors \\
  \verb|--two-step| & Calls for 2-step estimation (the default being 1-step) \\
  \verb|--system| & Calls for GMM-SYS, with default treatment of the 
                    dependent variable, as in \texttt{GMMlevel(y,1,1)} \\
  \verb|--time-dummies| & Includes period-specific dummy variables \\
  \verb|--dpdstyle| & Computes the $H$ matrix as in DPD; also suppresses
                      the differencing of automatic time dummies and the
                      omission of the intercept in the GMM-DIF case \\
  \verb|--verbose| & When \verb|--two-step| is selected, prints 
                     the 1-step estimates first \\
  \verb|--vcv| & Calls for printing of the covariance matrix \\
  \verb|--quiet| & Suppresses the printing of results \\
  \verb|--keep-extra| & Saves additional matrices in the \dollar{model}
                        bundle (see above) \\
\end{tabular}
\end{center}

The time dummies option supports the qualifier \texttt{noprint}, as
in

\verb|  --time-dummies=noprint|

This means that although the dummies are included in the specification,
their coefficients, standard errors and so on are not printed.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "gretl-guide"
%%% End: