"Fossies" - the Fresh Open Source Software Archive

\chapter{Series and lists}
\label{chap:series-etc}

Scalars, matrices and strings can be used in a hansl script at any
point; series and lists, on the other hand, are inherently tied to a
dataset and therefore can be used only when a dataset is currently
open.

\section{The \texttt{series} type}
\label{sec:series}

Series are just what any applied economist would call ``variables'',
that is, repeated observations of a given quantity; a dataset is an
ordered array of series, complemented by additional information, such
as the nature of the data (time-series, cross-section or panel),
descriptive labels for the series and/or the observations, source
information and so on. Series are the basic data type on which gretl's
built-in estimation commands depend.

The series belonging to a dataset are named via standard hansl
identifiers (strings of maximum length 31 characters as described
above). In the context of commands that take series as arguments,
series may be referenced either by name or by \emph{ID number}, that
is, the index of the series within the dataset. Position 0 in a
dataset is always taken by the automatic ``variable'' known as
\texttt{const}, which is just a column of 1s. The IDs of the actual
data series can be displayed via the \cmd{varlist} command. (But note
that in \textit{function calls}, as opposed to commands, series must
be referred to by name.)  A detailed description of how a dataset
works can be found in chapter 4 of \GUG.
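
As a minimal sketch of the difference between commands and function
calls, consider the following script. The dataset is artificial and
the names \texttt{x} and \texttt{y} are chosen purely for
illustration.
\begin{code}
nulldata 30
set seed 5
series x = normal()
series y = 1 + 2*x + normal()

# show the ID number of each series; ID 0 is always const
varlist

# in a command, 0 and const are interchangeable
ols y 0 x
ols y const x

# in a function call the series must be given by name
scalar m = mean(y)
printf "mean(y) = %g\n", m
\end{code}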

Some basic rules regarding series follow:
\begin{itemize}
\item If \texttt{lngdp} belongs to a time series or panel dataset,
  then the syntax \texttt{lngdp(-1)} yields its first lag, and
  \texttt{lngdp(+1)} its first lead.
\item To access individual elements of a series, you use square
  brackets enclosing
  \begin{itemize}
  \item the progressive (1-based) number of the observation you want,
    as in \verb|lngdp[15]|, or
  \item the corresponding date code in the case of time-series data,
    as in \verb|lngdp[2008:4]| (for the 4th quarter of 2008), or
  \item the corresponding observation marker string, if the dataset
    contains any, as in \verb|GDP["USA"]|.
  \end{itemize}
\end{itemize}
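
The access rules above can be tried out on an artificial quarterly
dataset. This is only a sketch: the series \texttt{lngdp} is
simulated, and the starting date of the sample is an arbitrary choice.
\begin{code}
nulldata 40
setobs 4 2005:1 --time-series
set seed 42

# a simulated random walk standing in for log GDP
series lngdp = cum(normal())

series lag1 = lngdp(-1)    # first lag; missing at the first observation
series lead1 = lngdp(+1)   # first lead; missing at the last observation

scalar by_number = lngdp[15]     # the 15th observation
scalar by_date = lngdp[2008:4]   # the 4th quarter of 2008
printf "obs 15 = %g, 2008:4 = %g\n", by_number, by_date
\end{code}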

The rules for assigning values to series are just the same as for
other objects, so the following examples should be self-explanatory:
\begin{code}
  series k = 3         # implicit conversion from scalar; a constant series
  series x = normal()  # pseudo-rv via a built-in function
  series s = a/b       # element-by-element operation on existing series

  series movavg = 0.5*(x + x(-1)) # using lags
  series y[2012:4] = x[2011:2]    # using individual data points
  series x2000 = 100*x/x[2000:1]  # constructing an index
\end{code}

\tip{In hansl, you don't have separate commands for \emph{creating}
  series and \emph{modifying} them. Other popular packages make this
  distinction, but we still struggle to understand why this is
  supposed to be useful.}

\subsection{Converting series to or from matrices}

The reason why hansl provides a specific series type, distinct from
the matrix type, is historical. However, it is also a very convenient
feature.  Operations that are typically performed on series in applied
work can be awkward to implement using ``raw'' matrices---for example,
the computation of leads and lags, or regular and seasonal
differences; the treatment of missing values; the addition of
descriptive labels, and so on.

In any case, it is straightforward to convert data in either direction
between the series and matrix types.
\begin{itemize}
\item To turn series into matrices, you use the curly braces syntax,
  as in
  \begin{code}
    matrix MACRO = {outputgap, unemp, infl}
  \end{code}
  where you can also use lists; the number of rows of the resulting
  matrix will depend on your currently selected sample.
\item To turn matrices into series, you can just use matrix columns,
  as in
  \begin{code}
    series y = my_matrix[,4]
  \end{code}
  But note that this will work only if the number of rows in
  \texttt{my\_matrix} matches the length of the dataset (or the
  currently selected sample range).
\end{itemize}
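
The dependence on the sample range can be seen in the following
sketch, which uses simulated data. The series names are borrowed from
the example above and the sample bounds are arbitrary.
\begin{code}
nulldata 10
set seed 7
series outputgap = normal()
series unemp = normal()
series infl = normal()

# full sample: one row per observation
matrix MACRO = {outputgap, unemp, infl}
printf "full sample: %d x %d\n", rows(MACRO), cols(MACRO)

# restricted sample: only observations 3 to 8 are included
smpl 3 8
matrix SUB = {outputgap, unemp, infl}
printf "subsample: %d x %d\n", rows(SUB), cols(SUB)

# back to the full range, where a 10-row column fits the dataset
smpl full
series y = MACRO[,2]
\end{code}
With no missing values in the data, the first matrix should have 10
rows and the second 6.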

Also note that the \cmd{lincomb} and \cmd{filter} functions are quite
useful for creating and manipulating series in complex ways without
having to convert the data to matrix form (which could be
computationally costly with large datasets).
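
A brief sketch of both functions on simulated data follows. The
weights and the filter coefficients are arbitrary illustrative
choices, and the treatment of the first observation by \cmd{filter}
depends on its defaults for pre-sample values.
\begin{code}
nulldata 20
set seed 11
series x1 = normal()
series x2 = normal()

# linear combination of the series in a list: 0.3*x1 + 0.7*x2
list L = x1 x2
matrix b = {0.3; 0.7}
series combo = lincomb(L, b)

# two-term moving average: 0.5*x1 + 0.5*x1(-1)
series ma = filter(x1, {0.5, 0.5})
\end{code}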
  101 \subsection{The ternary operator with series}
  103 Consider this assignment:
  105 \begin{code}
  106   worker_income = employed ? income : 0
  107 \end{code}
  109 Here we assume that \texttt{employed} is a dummy series coding for
  110 employee status. Its value will be tested for each observation in the
  111 current sample range and the value assigned to \texttt{worker\_income}
  112 at that observation will be determined accordingly. It is therefore
  113 equivalent to the following much more verbose formulation (where
  114 \dollar{t1} and \dollar{t2} are accessors for the start and end of the
  115 sample range):
  116 \begin{code}
  117 series worker_income
  118 loop i=$t1..$t2
  119     if employed[i]
  120         worker_income[i] = income[i]
  121     else
  122         worker_income[i] = 0
  123     endif
  124 endloop
  125 \end{code}
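
For a quick check on simulated data (the series here are artificial,
with names borrowed from the example above), note that with a 0/1
dummy the ternary assignment gives the same result as multiplying by
the dummy:
\begin{code}
nulldata 8
set seed 3
series employed = uniform() > 0.5
series income = 1000 + 100*normal()

series worker_income = employed ? income : 0
series check = employed * income

# the two series should coincide at every observation
printf "max abs difference = %g\n", max(abs(worker_income - check))
\end{code}
The ternary form is more general, of course, since the two branches
can be arbitrary expressions.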

\section{The \texttt{list} type}
\label{sec:lists}

In hansl parlance, a \textit{list} is an array of integers,
representing the ID numbers of a set (in a loose sense of the word) of
series.  For this reason, the most common operations you perform on
lists are set operations such as addition or deletion of members,
union, intersection and so on. Unlike sets, however, hansl lists are
ordered, so individual list members can be accessed via the
\texttt{[]} syntax, as in \texttt{X[3]}.

There are several ways to assign values to a list.  The most basic
sort of expression that works in this context is a space-separated
list of series, given either by name or by ID number.  For example,
\begin{code}
list xlist = 1 2 3 4
list reglist = income price
\end{code}
An empty list is obtained by using the keyword \texttt{null}, as in
\begin{code}
list W = null
\end{code}
or simply by bare declaration. Some more special forms (for example,
using wildcards) are described in \GUG.
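
A small sketch of these ideas on artificial data follows. The series
names are hypothetical; \texttt{nelem} is the function that counts the
members of a list, and the assignment of \texttt{X[3]} to a new series
assumes that an indexed list member can be used where a series is
expected.
\begin{code}
nulldata 10
series a = 1
series b = 2
series c = 3

list X = a b c
list W = null
printf "X has %d members, W has %d\n", nelem(X), nelem(W)

# the third member of X is the series c
series third = X[3]
printf "mean of third member = %g\n", mean(third)
\end{code}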

The main idea is to use lists to group, under one identifier, one or
more series that logically belong together somehow (for example, as
explanatory variables in a model). So, for example,
\begin{code}
list xlist = x1 x2 x3 x4
ols y 0 xlist
\end{code}
is an idiomatic way of specifying the OLS regression that could also
be written as
\begin{code}
ols y 0 x1 x2 x3 x4
\end{code}
Note that here we used the convention, mentioned in section
\ref{sec:series}, by which a series can be identified by its ID number
when used as an argument to a command, typing \texttt{0} instead
of \texttt{const}.
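
One benefit of grouping series in this way is that the same list can
be passed to several commands. The following sketch assumes the sample
file \texttt{data4-1} supplied with gretl, which contains the series
\texttt{price}, \texttt{sqft}, \texttt{bedrms} and \texttt{baths}; any
other dataset would do, with the names changed accordingly.
\begin{code}
open data4-1

list xlist = sqft bedrms baths

# descriptive statistics and correlations for the regressors
summary xlist
corr xlist

# regression using the list, then the same model written out in full
ols price 0 xlist
ols price const sqft bedrms baths
\end{code}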

Lists can be concatenated, as in \texttt{list L3 = L1 L2} (where
\texttt{L1} and \texttt{L2} are names of existing lists). This will
not necessarily do what you want, however, since the resulting list
may contain duplicates. It's more common to use the following set
operations:

\begin{center}
  \begin{tabular}{rl}
    \textbf{Operator} & \textbf{Meaning} \\
    \hline
    \verb,||, & Union \\
    \verb|&&| & Intersection \\
    \verb|-|  & Set difference \\
    \hline
  \end{tabular}
\end{center}

So for example, if \texttt{L1} and \texttt{L2} are existing lists,
after running the following code snippet
\begin{code}
  list UL = L1 || L2
  list IL = L1 && L2
  list DL = L1 - L2
\end{code}
the list \texttt{UL} will contain all the members of \texttt{L1}, plus
any members of \texttt{L2} that are not already in \texttt{L1};
\texttt{IL} will contain all the elements that are present in both
\texttt{L1} and \texttt{L2}; and \texttt{DL} will contain all the
elements of \texttt{L1} that are not present in \texttt{L2}.
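
To make this concrete, here is a sketch with two small overlapping
lists built on artificial series (the names are hypothetical). The
comments state the membership to be expected under the rules just
given.
\begin{code}
nulldata 5
series a = 1
series b = 2
series c = 3
series d = 4

list L1 = a b c
list L2 = b c d

list CAT = L1 L2    # plain concatenation: may hold b and c twice
list UL = L1 || L2  # a b c d
list IL = L1 && L2  # b c
list DL = L1 - L2   # a

printf "CAT: %d members, UL: %d members\n", nelem(CAT), nelem(UL)
printf "IL: %d members, DL: %d members\n", nelem(IL), nelem(DL)
\end{code}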

To \textit{append} or \textit{prepend} variables to an existing list,
we can make use of the fact that a named list stands in for a
``longhand'' list.  For example, assuming that a list \texttt{xlist}
is already defined (possibly as \texttt{null}), we can do
\begin{code}
list xlist = xlist 5 6 7
xlist = 9 10 xlist 11 12
\end{code}

Another option for appending terms to, or dropping terms from, an
existing list is to use \texttt{+=} or \texttt{-=}, respectively, as
in
\begin{code}
xlist += cpi
zlist -= cpi
\end{code}
A nice example of the above is provided by a common idiom: you may
see in hansl scripts something like
\begin{code}
  list C -= const
  list C = const C
\end{code}
which ensures that the series \texttt{const} is included (exactly
once) in the list \texttt{C}, and comes first.
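
The effect of this idiom can be checked with a sketch like the
following, on artificial data with hypothetical series names. It
relies on the fact, described in the next subsection, that assigning a
list to a matrix yields the ID numbers of its members.
\begin{code}
nulldata 5
series x1 = 1
series x2 = 2

list C = x1 const x2   # const is present, but not in first place
list C -= const        # drop const if it is there
list C = const C       # put it back at the front

# const has ID 0, so the first ID in C should now be 0
matrix ids = C
printf "first ID in C = %d, members = %d\n", ids[1], nelem(C)
\end{code}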

\subsection{Converting lists to or from matrices}

The idea of converting from a list, as defined above, to a matrix may
be taken in either of two ways. You may want to turn a list into a
matrix (vector) by filling the latter with the ID numbers contained in
the former, or rather to create a matrix whose columns contain the
series to which the ID numbers refer. Both interpretations are
legitimate (and potentially useful in different contexts) so hansl
lets you go either way.

If you assign a list to a matrix, as in
\begin{code}
  list L = moo foo boo zoo
  matrix A = L
\end{code}
the matrix \texttt{A} will contain the ID numbers of the four series
as a row vector. This operation goes both ways, so the statement
\begin{code}
  list C = seq(7,10)
\end{code}
is perfectly valid (provided, of course, that you have at least 10
series in the currently open dataset).

If instead you want to create a data matrix from the series which
belong to a given list, you have to enclose the list name in curly
brackets, as in
\begin{code}
  matrix X = {L}
\end{code}
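
The two kinds of conversion can be compared side by side in a sketch
such as the following, where the four series are artificial constants
and the dataset length of 6 is an arbitrary choice.
\begin{code}
nulldata 6
series moo = 1
series foo = 2
series boo = 3
series zoo = 4

list L = moo foo boo zoo
matrix A = L      # row vector of ID numbers
matrix X = {L}    # data matrix, one column per series
list L2 = A       # and back from ID numbers to a list

printf "A is %d x %d\n", rows(A), cols(A)
printf "X is %d x %d\n", rows(X), cols(X)
printf "L2 has %d members\n", nelem(L2)
\end{code}
Here \texttt{A} should be $1 \times 4$ while \texttt{X} should be
$6 \times 4$, one row per observation in the sample.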

\subsection{The \texttt{foreach} loop variant with lists}

Lists can be used as the ``catalogue'' in the \texttt{foreach} variant
of the \cmd{loop} construct (see section \ref{sec:loop-foreach}). This
is especially handy when you have to perform some operation on
multiple series. For example, the following syntax can be used to
calculate and print the mean of each of several series:
\begin{code}
list X = age income experience
loop foreach i X
    printf "mean($i) = %g\n", mean($i)
endloop
\end{code}
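
Since \dollar{i} is replaced by the name of each list member in turn,
it can also be used to build the names of new series. The following
sketch assumes the sample file \texttt{data4-1} supplied with gretl;
the prefix \texttt{l\_} and the list name \texttt{LOGS} are arbitrary
choices.
\begin{code}
open data4-1

list X = price sqft
list LOGS = null

loop foreach i X
    # creates l_price and l_sqft
    series l_$i = log($i)
    LOGS += l_$i
endloop

summary LOGS
\end{code}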

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "hansl-primer"
%%% End: