Member "ffe-0.3.9/doc/ffe.html" (18 Mar 2018, 74620 Bytes) of package /linux/privat/ffe-0.3.9.tar.gz:
    1 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
    2 <html>
    3 <!-- This file documents version 0.3.8 of ffe, a flat file extractor. 
    4 
    5 Copyright (C) 2014 Timo Savinen
    6 
    7 Permission is granted to make and distribute verbatim copies of
    8 this manual provided the copyright notice and this permission notice
    9 are preserved on all copies.
   10 
   11 Permission is granted to copy and distribute modified versions of this
   12 manual under the conditions for verbatim copying, provided that the entire
   13 resulting derived work is distributed under the terms of a permission
   14 notice identical to this one.
   15 
   16 Permission is granted to copy and distribute translations of this manual
   17 into another language, under the above conditions for modified versions. -->
   18 <!-- Created by GNU Texinfo 6.3, http://www.gnu.org/software/texinfo/ -->
   19 <head>
   20 <title>ffe - flat file extractor</title>
   21 
   22 <meta name="description" content="ffe - flat file extractor">
   23 <meta name="keywords" content="ffe - flat file extractor">
   24 <meta name="resource-type" content="document">
   25 <meta name="distribution" content="global">
   26 <meta name="Generator" content="makeinfo">
   27 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   28 <link href="#Top" rel="start" title="Top">
   29 <link href="#SEC_Contents" rel="contents" title="Table of Contents">
   30 <link href="dir.html#Top" rel="up" title="(dir)">
   31 <style type="text/css">
   32 <!--
   33 a.summary-letter {text-decoration: none}
   34 blockquote.indentedblock {margin-right: 0em}
   35 blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
   36 blockquote.smallquotation {font-size: smaller}
   37 div.display {margin-left: 3.2em}
   38 div.example {margin-left: 3.2em}
   39 div.lisp {margin-left: 3.2em}
   40 div.smalldisplay {margin-left: 3.2em}
   41 div.smallexample {margin-left: 3.2em}
   42 div.smalllisp {margin-left: 3.2em}
   43 kbd {font-style: oblique}
   44 pre.display {font-family: inherit}
   45 pre.format {font-family: inherit}
   46 pre.menu-comment {font-family: serif}
   47 pre.menu-preformatted {font-family: serif}
   48 pre.smalldisplay {font-family: inherit; font-size: smaller}
   49 pre.smallexample {font-size: smaller}
   50 pre.smallformat {font-family: inherit; font-size: smaller}
   51 pre.smalllisp {font-size: smaller}
   52 span.nolinebreak {white-space: nowrap}
   53 span.roman {font-family: initial; font-weight: normal}
   54 span.sansserif {font-family: sans-serif; font-weight: normal}
   55 ul.no-bullet {list-style: none}
   56 body {
   57     margin: 1%;
   58     padding: 0 5%;
   59     background: white;
   60     font-family: serif;
   61     text-align: justify;
   62 }
   63 
   64 h1,h2,h3,h4,h5 {
   65     padding: 0.5em 0 0 0;
   66     font-weight: bold;
   67     font-family: sans-serif;
   68 }
   69 
   70 h1 {
   71     padding: 0.5em 0 0.5em 1em;
   72     color: white;
   73     background: #575;
   74 }
   75 
   76 pre {
   77   margin: 0;
   78   padding: 0.5em 0.5em 0.5em 0;
   79 }
   80 
   81 pre.example {
   82   padding: 0;
   83   margin: 0;
   84   background: #eee;
   85 }
   86 
   87 pre.verbatim, .menu {
   88   border: solid 1px gray;
   89   background: white;
   90   padding-bottom: 1em;
   91 }
   92 
   93 div.node {
   94   background: #ccc;
   95   margin: 0;
   96   padding: 0 1.5em;
   97   font-weight: lighter;
   98   color: #000;
   99   text-align: right;
  100 }
  101 
  102 .node a {
  103   color: #770000;
  104 }
  105 
  106 .node a:visited {
  107   color: #550000;
  108 }
  109 
  110 dd, li {
  111   padding-top: 0.1em;
  112   padding-bottom: 0.1em;
  113 }
  114 
  115 samp {
  116     font: inherit;
  117 }
  118 
  119 code {
  120     font-size: inherit;
  121     font-weight: bold;
  122 }
  123 
  124 pre, code { 
  125     font-family: monospace;
  126 }
  127 
  128 .command, .file {
  129    font-family: monospace;
  130 } 
  131 
  132 div.node hr {
  133     display:none;
  134 }
  135 
  136 -->
  137 </style>
  138 
  139 
  140 </head>
  141 
  142 <body lang="en">
  143 <h1 class="settitle" align="center">ffe - flat file extractor</h1>
  144 
  145 
  146 
  147 
  148 
  149 <a name="Top"></a>
  150 <a name="ffe"></a>
  151 <h1 class="top">ffe</h1>
  152 
  153 <p>This file documents version 0.3.8 of <code>ffe</code>, a flat file extractor. 
  154 </p>
  155 <p>Copyright &copy; 2014 Timo Savinen
  156 </p>
  157 <blockquote>
  158 <p>Permission is granted to make and distribute verbatim copies of
  159 this manual provided the copyright notice and this permission notice
  160 are preserved on all copies.
  161 </p>
  162 <p>Permission is granted to copy and distribute modified versions of this
  163 manual under the conditions for verbatim copying, provided that the entire
  164 resulting derived work is distributed under the terms of a permission
  165 notice identical to this one.
  166 </p>
  167 <p>Permission is granted to copy and distribute translations of this manual
  168 into another language, under the above conditions for modified versions.
  169 </p></blockquote>
  170 
  171 
  172 
  173 
  174 <hr>
  175 <a name="Overview"></a>
  176 <a name="Preliminary-information"></a>
  177 <h2 class="chapter">1 Preliminary information</h2>
  178 <a name="index-greetings"></a>
  179 <a name="index-overview"></a>
  180 
  181 <p><code>ffe</code> is a program that extracts fields from text and binary flat files and prints them in different
  182 formats. The input file structure and printing definitions are specified in a configuration file, which
  183 is always required. The default configuration file is <samp>~/.fferc</samp> (<samp>ffe.rc</samp> on Windows).
  184 </p>
  185 <p><code>ffe</code> is a command line tool developed for GNU/Linux and UNIX systems. <code>ffe</code> can read from
  186 standard input and write to standard output, so it can be used as a part of a pipeline.
  187 </p>
  188 <p>There is also a binary distribution for Windows.
  189 </p>
  190 <hr>
  191 <a name="Samples"></a>
  192 <a name="Samples-using-ffe"></a>
  193 <h2 class="chapter">2 Samples using <code>ffe</code></h2>
  194 <a name="index-sample"></a>
  195 
  196 <p>One example of using <code>ffe</code> to print personnel information in XML format from a fixed length flat file:
  197 </p>
  198 <div class="example">
  199 <pre class="example">$ cat personnel
  200 john     Ripper       23
  201 Scott    Tiger        45
  202 Mary     Moore        41
  203 $
  204 </pre></div>
  205 
  206 <p>The file <samp>personnel</samp> contains three fixed length fields: &lsquo;<samp>FirstName</samp>&rsquo;, &lsquo;<samp>LastName</samp>&rsquo; and &lsquo;<samp>Age</samp>&rsquo;,
  207 with respective lengths of 9, 13 and 2.
  208 </p>
  209 <p>To print the data above in XML, the following configuration file must be available:
  210 </p>
  211 <div class="example">
  212 <pre class="example">$ cat personnel.fferc
  213 structure personel {
  214     type fixed
  215     output xml
  216     record person {
  217         field FirstName 9
  218         field LastName  13
  219         field Age 2
  220     }
  221 }
  222 
  223 output xml {
  224     file_header &quot;&lt;?xml version=\&quot;1.0\&quot; encoding=\&quot;ISO-8859-1\&quot;?&gt;\n&quot;
  225     data &quot;&lt;%n&gt;%t&lt;/%n&gt;\n&quot;
  226     record_header &quot;&lt;%r&gt;\n&quot;
  227     record_trailer &quot;&lt;/%r&gt;\n&quot;
  228     indent &quot; &quot;
  229 }
  230 $
  231 </pre></div>
  232 
  233 <p>Using ffe:
  234 </p>
  235 <div class="example">
  236 <pre class="example">$ ffe -c personnel.fferc personnel
  237 &lt;?xml version=&quot;1.0&quot; encoding=&quot;ISO-8859-1&quot;?&gt;
  238  &lt;person&gt;
  239   &lt;FirstName&gt;john&lt;/FirstName&gt;
  240   &lt;LastName&gt;Ripper&lt;/LastName&gt;
  241   &lt;Age&gt;23&lt;/Age&gt;
  242  &lt;/person&gt;
  243  &lt;person&gt;
  244   &lt;FirstName&gt;Scott&lt;/FirstName&gt;
  245   &lt;LastName&gt;Tiger&lt;/LastName&gt;
  246   &lt;Age&gt;45&lt;/Age&gt;
  247  &lt;/person&gt;
  248  &lt;person&gt;
  249   &lt;FirstName&gt;Mary&lt;/FirstName&gt;
  250   &lt;LastName&gt;Moore&lt;/LastName&gt;
  251   &lt;Age&gt;41&lt;/Age&gt;
  252  &lt;/person&gt;
  253 $
  254 </pre></div>
  255 
  256 <hr>
  257 <a name="Invoking-ffe"></a>
  258 <a name="How-to-run-ffe"></a>
  259 <h2 class="chapter">3 How to run <code>ffe</code></h2>
  260 <a name="index-running-ffe"></a>
  261 <a name="index-using"></a>
  262 
  263 <p><code>ffe</code> is a command line tool. Normally <code>ffe</code> can be invoked as:
  264 </p>
  265 <p><code>ffe -o OUTPUTFILE INPUTFILE&hellip;</code>
  266 </p>
  267 <p><code>ffe</code> uses the definitions from the configuration file and tries to guess the input file
  268 structure.
  269 </p>
  270 <p>If the structure cannot be guessed, the option <samp>-s</samp> must be used.
  271 </p>
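<p>As a rough sketch, assuming the sample configuration from the Samples chapter has been saved as the default
<samp>~/.fferc</samp>, the two forms might look like this. The output file name <samp>personnel.xml</samp> is only an illustration,
and the structure name is spelled as it is in that sample configuration:
</p>
<div class="example">
<pre class="example"># let ffe guess the structure from the configuration file
$ ffe -o personnel.xml personnel
# name the structure explicitly if guessing fails
$ ffe -s personel -o personnel.xml personnel
</pre></div>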
  272 
  273 <hr>
  274 <a name="Invocation"></a>
  275 <a name="Program-invocation"></a>
  276 <h3 class="section">3.1 Program invocation</h3>
  277 <a name="index-options"></a>
  278 
  279 <p>The format for running the <code>ffe</code> program is:
  280 </p>
  281 <div class="example">
  282 <pre class="example">ffe <var>option</var> &hellip;
  283 </pre></div>
  284 
  285 <p><code>ffe</code> supports the following options:
  286 </p>
  287 <dl compact="compact">
  288 <dt><code>-c <var>file</var></code></dt>
  289 <dt><code>--configuration=<var>file</var></code></dt>
  290 <dd><p>Configuration is read from <var>file</var> instead of <samp>~/.fferc</samp> (<samp>ffe.rc</samp> on Windows).
  291 </p>
  292 </dd>
  293 <dt><code>-s <var>structure</var></code></dt>
  294 <dt><code>--structure=<var>structure</var></code></dt>
  295 <dd><p>Use structure <var>structure</var> for the input file. This suppresses guessing.
  296 </p>
  297 </dd>
  298 <dt><code>-p <var>output</var></code></dt>
  299 <dt><code>--print=<var>output</var></code></dt>
  300 <dd><p>Use output format <var>output</var> for printing. If not given, the output format defined for the record or structure
  301 is used. Printing can be suppressed using format <var>no</var>. Original data is printed using format <var>raw</var>.
  302 </p>
  303 </dd>
  304 <dt><code>-o <var>file</var></code></dt>
  305 <dt><code>--output=<var>file</var></code></dt>
  306 <dd><p>Write output to <var>file</var> instead of standard output.
  307 </p>
  308 </dd>
  309 <dt><code>-f <var>list</var></code></dt>
  310 <dt><code>--field-list=<var>list</var></code></dt>
  311 <dd><p>Print only the fields and constants listed in the comma separated list <var>list</var>. The order of names in
  312 <var>list</var> also specifies the printing order.
  313 </p>
  314 </dd>
  315 <dt><code>-e <var>expression</var></code></dt>
  316 <dt><code>--expression=<var>expression</var></code></dt>
  317 <dd><p>Print only those records for which the <var>expression</var> evaluates to true.
  318 </p>
  319 </dd>
  320 <dt><code>-a</code></dt>
  321 <dt><code>--and</code></dt>
  322 <dd><p>Expressions are combined with logical and; the default is logical or.
  323 Note that if the same field and operator appear several times in expressions, they are always combined with logical or.
  324 </p>
  325 </dd>
  326 <dt><code>-X</code></dt>
  327 <dt><code>--casecmp</code></dt>
  328 <dd><p>Expressions are evaluated using case insensitive comparison.
  329 </p>
  330 </dd>
  331 <dt><code>-v</code></dt>
  332 <dt><code>--invert-match</code></dt>
  333 <dd><p>Print only those records which don&rsquo;t match the expression.
  334 </p>
  335 </dd>
  336 <dt><code>-l</code></dt>
  337 <dt><code>--loose</code></dt>
  338 <dd><p>Normally <code>ffe</code> stops when it encounters an input line or binary block which doesn&rsquo;t match any of
  339 the records in the selected structure. Defining this option causes <code>ffe</code> to continue despite the error.
  340 Note that invalid lines are reported only for text input. In case of binary input, the next valid block is silently searched for.
  341 </p>
  342 </dd>
  343 <dt><code>-r <var>field</var>=<var>value</var></code></dt>
  344 <dt><code>--replace=<var>field</var>=<var>value</var></code></dt>
  345 <dd><p>Replace the contents of <var>field</var> with <var>value</var> in output. <var>value</var> can contain the same directives as output option <code>data</code>.
  346 </p>
  347 </dd>
  348 <dt><code>-d</code></dt>
  349 <dt><code>--debug</code></dt>
  350 <dd><p>All invalid input lines are written to <samp>ffe_error_&lt;pid&gt;.log</samp>, where <samp>&lt;pid&gt;</samp> is the process ID.
  351 </p>
  352 </dd>
  353 <dt><code>-I</code></dt>
  354 <dt><code>--info</code></dt>
  355 <dd><p>Show structure information in the configuration file and exit successfully. For every structure the following information is shown:
  356 <br>
  357 Structures: Name, type and maximum record length. 
  358 <br>
  359 Records: Name and length
  360 <br>
  361 Fields: Name, position and length. First position is number one.
  362 </p>
  363 </dd>
  364 <dt><code>-?</code></dt>
  365 <dt><code>--help</code></dt>
  366 <dd><p>Print an informative help message describing the options and then exit
  367 successfully.
  368 </p>
  369 </dd>
  370 <dt><code>-V</code></dt>
  371 <dt><code>--version</code></dt>
  372 <dd><p>Print the version number of <code>ffe</code> and then exit successfully.
  373 </p></dd>
  374 </dl>
  375 
  376 <p>All remaining arguments are names of input files. If no input files are specified or <code>-</code> is given, standard input is read.
  377 </p>
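<p>A few hedged invocation sketches, reusing the <samp>personnel</samp> file and <samp>personnel.fferc</samp> configuration from the
Samples chapter. The output file name <samp>personnel.copy</samp> is hypothetical, and no particular output is claimed here:
</p>
<div class="example">
<pre class="example"># print only two fields, in the order given in the list
$ ffe -c personnel.fferc -f LastName,Age personnel
# write the original data of the recognized records to a file
$ ffe -c personnel.fferc -p raw -o personnel.copy personnel
# read the input from standard input as part of a pipeline
$ cat personnel | ffe -c personnel.fferc -
# show the structures, records and fields defined in the configuration
$ ffe -c personnel.fferc -I
</pre></div>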
  378 <a name="Expressions-_0028option-_002de_002c-_002d_002dexpression_0029"></a>
  379 <h4 class="subheading">Expressions (option <samp>-e</samp>, <samp>--expression</samp>)</h4>
  380 <p>Expressions can be used to select specific records by comparing field values.
  381 An expression has the syntax <var>field</var><strong>x</strong><var>value</var>, where <strong>x</strong> is the comparison operator.
  382 The expression compares the field&rsquo;s contents to <var>value</var>, and if the comparison is successful
  383 the record is printed. Several expressions can be given, and at least one must evaluate to true in
  384 order to print a record. If option <samp>-a</samp> is given, all expressions must evaluate to true.
  385 </p>
  386 <p>If <var>value</var> starts with the string <code>file:</code>, the rest of <var>value</var> is treated as a file name.
  387 Every line in the file is used as a <var>value</var> in the comparison. The comparison evaluates to true if one or more values match, which makes it possible to use several different values in one comparison. <strong>Note</strong>: The file size is limited by available memory because the file contents are loaded into memory.
  388 </p>
  389 <p>When comparing binary fields the <var>value</var> must have the representation which can be shown using the <code>%d</code> output directive. Note that the printing option <var>hex-caps</var> takes effect in comparison.
  390 </p>
  391 <p>Expression notation:
  392 </p>
  393 <dl compact="compact">
  394 <dt><var>field<strong>=</strong>value</var></dt>
  395 <dd><p>Field <var>field</var> is equal to <var>value</var>.
  396 </p>
  397 </dd>
  398 <dt><var>field<strong>^</strong>value</var></dt>
  399 <dd><p>Field <var>field</var> starts with <var>value</var>.
  400 </p>
  401 </dd>
  402 <dt><var>field<strong>~</strong>value</var></dt>
  403 <dd><p>Field <var>field</var> contains <var>value</var>.
  404 </p>
  405 </dd>
  406 <dt><var>field<strong>!</strong>value</var></dt>
  407 <dd><p>Field <var>field</var> is not equal to <var>value</var>.
  408 </p>
  409 </dd>
  410 <dt><var>field<strong>?</strong>value</var></dt>
  411 <dd><p>Field <var>field</var> matches the regular expression <var>value</var>. 
  412 <code>ffe</code> supports POSIX extended regular expressions. 
  413 </p></dd>
  414 </dl>
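<p>The operators above might be used as follows with the <samp>personnel</samp> sample. This is only a sketch: the file
<samp>names.txt</samp> is hypothetical, and the expressions are single-quoted so that the shell does not interpret characters
such as <code>?</code>, <code>~</code> or <code>$</code>:
</p>
<div class="example">
<pre class="example"># records where Age is equal to 45
$ ffe -c personnel.fferc -e 'Age=45' personnel
# LastName starts with Ti or with Mo (expressions are or-ed by default)
$ ffe -c personnel.fferc -e 'LastName^Ti' -e 'LastName^Mo' personnel
# both expressions must be true (-a)
$ ffe -c personnel.fferc -a -e 'FirstName~a' -e 'Age^4' personnel
# records that do not match the regular expression (-v)
$ ffe -c personnel.fferc -v -e 'Age?^4[0-9]$' personnel
# compare LastName against every line of a file
$ ffe -c personnel.fferc -e 'LastName=file:names.txt' personnel
</pre></div>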
  415 
  416 <hr>
  417 <a name="Configuration"></a>
  418 <a name="Configuration-1"></a>
  419 <h3 class="section">3.2 Configuration</h3>
  420 <a name="index-configuration"></a>
  421 
  422 <p><code>ffe</code> uses a configuration file in order to read the input file and print the output.
  423 </p>
  424 <p>The configuration file for <code>ffe</code> is a text file. The file may contain empty lines.
  425 Commands are case sensitive. Comments begin with the <code>#</code>-character and end at the end of the line.
  426 The <code>string</code> definitions can be enclosed in double quotation <code>&quot;</code> characters.
  427 <code>char</code> is a single character. <code>string</code> and <code>char</code> can contain the following escape codes:
  428 <code>\a</code>, <code>\b</code>, <code>\t</code>, <code>\n</code>, <code>\v</code>, <code>\f</code>, <code>\r</code>, <code>\&quot;</code> and <code>\#</code>. 
  429 A backslash can be escaped as <code>\\</code>.
  430 </p>
  431 <p>The configuration has two main parts: the structure, which specifies the input file structure, and
  432 the output, which specifies how the input data is formatted for output.
  433 </p>
  434 <a name="Common-syntax"></a>
  435 <h4 class="subheading">Common syntax</h4>
  436 <p>The common syntax for the configuration file is:
  437 </p>
  438 <div class="example">
  439 <pre class="example">#comment
  440 `command`
  441 const <var>name</var> <var>value</var>
  442 filter <var>name</var> <var>value</var>
  443 &hellip;
  444 structure <var>name</var> {
  445     <i>option value</i> &hellip;
  446     &hellip;
  447     record <var>name</var> {
  448         <i>option value</i> &hellip;
  449         &hellip;
  450     }
  451     record <var>name</var> {
  452         <i>option value</i> &hellip;
  453         &hellip;
  454     }
  455     &hellip;
  456 }
  457 structure <var>name</var> {
  458     &hellip;
  459 }
  460 &hellip;
  461 output <var>name</var> {
  462     <i>option value</i> &hellip;
  463     &hellip;
  464 }
  465 output <var>name</var> {
  466     &hellip;
  467 }
  468 &hellip;
  469 lookup <var>name</var> {
  470     <i>option value</i> &hellip;
  471     &hellip;
  472 }
  473 lookup <var>name</var> {
  474     &hellip;
  475 }
  476 
  477 &hellip;
  478 </pre></div>
  479 
  480 <a name="Structure"></a>
  481 <h4 class="subheading">Structure</h4>
  482 <p>Keyword <code>structure</code> is used to specify the input file content. An input file can contain several
  483 types of records (lines or binary blocks). E.g. a file can have header, data and trailer record types. Records
  484 must be distinguishable from each other. This can be achieved by defining different &lsquo;keys&rsquo;
  485 (<code>id</code> in record definition), or by having different line lengths (for fixed length) or different counts
  486 of fields (for separated structure) for different records.
  487 </p>
  488 <p>If a binary structure has several records, then all records must have at least one key (<code>id</code>), because binary blocks can
  489 be distinguished only by using keys.
  490 </p>
  491 <p>The structure notation:
  492 <br>
  493 </p>
  494 <div class="example">
  495 <pre class="example">structure <var>name</var> {
  496     <i>option value</i> &hellip;
  497     &hellip;
  498 }
  499 </pre></div>
  500 
  501 <p>A structure can contain the following options:
  502 </p>
  503 <dl compact="compact">
  504 <dt><code>type fixed|binary|separated [<var>char</var>] [*]</code></dt>
  505 <dd><p>The fields in the input are fixed length fields (text or binary) or text fields separated by <var>char</var>. If * is given,
  506 multiple sequential separators are considered as one. The default separator is a comma.
  507 </p>
  508 </dd>
  509 <dt><code>quoted [<var>char</var>]</code></dt>
  510 <dd><p>Fields may be quoted with <var>char</var>; the default quotation mark is the double quotation mark &lsquo;&quot;&rsquo;.
  511 A quotation mark is assumed to be escaped as \<var>char</var> or by doubling the mark as <var>charchar</var> in input.
  512 Unescaped quotation marks are not preserved in output.
  513 </p>
  514 </dd>
  515 <dt><code>header first|all|no</code></dt>
  516 <dd><p>Controls the occurrence of the header line. Default is no. If set as <em>first</em> or <em>all</em>, the first line
  517 of the first input file is considered a header line containing the names of the fields. <em>first</em>
  518 means that only the first file has a header; <em>all</em> means that all files have a header,
  519 although the names are still taken from the header of the first file. The header line is handled
  520 according to the record definition, meaning that the name positions, separators etc. are the same as
  521 for the fields. Binary files cannot have a header.
  522 </p>
  523 </dd>
  524 <dt><code>output <var>name</var>|no|raw</code></dt>
  525 <dd><p>All records belonging to this structure are printed according to output format <var>name</var>.
  526 The default is to use the output named &lsquo;<samp>default</samp>&rsquo;. &lsquo;<samp>no</samp>&rsquo; prints nothing and &lsquo;<samp>raw</samp>&rsquo; prints only the original data.
  527 </p>
  528 </dd>
  529 <dt><code>record <var>name</var> {<i>options</i> &hellip;}</code></dt>
  530 <dd><p>Specifies one record for a structure. A structure can contain several record types.
  531 </p></dd>
  532 </dl>
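<p>As an illustration of how these options combine, the following sketch describes a colon separated text file whose first
line holds the field names. The structure and record names are made up for this example, and it assumes that the
<code>output xml</code> definition from the Samples chapter is present in the same configuration file:
</p>
<div class="example">
<pre class="example">structure people {
    # fields are separated by a colon
    type separated :
    # fields may be enclosed in double quotation marks
    quoted
    # field names are taken from the first line of the first file
    header first
    output xml
    record person {
        # same as writing "field *" three times
        field-count 3
    }
}
</pre></div>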
  533 
  534 <a name="Record"></a>
  535 <h4 class="subheading">Record</h4>
  536 <p>A record specifies one type of input line or binary block in a file. Different records can be distinguished using
  537 the <code>id</code> option or different line lengths or field counts. In a multi-record binary structure every record must have at least one <code>id</code> because binary records do not have a special end of record marker as text lines have.
  538 </p>
  539 <p>The record notation:
  540 <br>
  541 </p>
  542 <div class="example">
  543 <pre class="example">record <var>name</var> {
  544     <i>option value</i> &hellip;
  545     &hellip;
  546 }
  547 </pre></div>
  548 <p>A record can contain the following options:
  549 </p>
  550 <dl compact="compact">
  551 <dt><code>id <var>position</var> <var>string</var></code></dt>
  552 <dt><code>rid <var>position</var> <var>regexp</var></code></dt>
  553 <dd><p>Identifies a record in the input file. Records are identified by the <var>string</var> or by the regular expression <var>regexp</var> at input record position
  554 <var>position</var>. For fixed length and binary input the position is the byte position in the input record, and for
  555 separated input the <var>position</var> is the <var>position</var>&rsquo;th field of the input record. Positions always start from one.
  556 </p>
  557 <p>A record definition can contain several <code>id</code>s, in which case all of them must match the input line
  558 (<code>id</code>s are <em>and-ed</em>).
  559 </p>
  560 <p>Non printable characters can be escaped as &lsquo;<samp>\xnn</samp>&rsquo;, where &lsquo;<samp>nn</samp>&rsquo; is the character&rsquo;s hexadecimal value.
  561 </p>
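<p>A minimal sketch of <code>id</code> in use, for a hypothetical fixed length file in which the first byte of every line
tells whether the line is a header (H), a data line (D) or a trailer (T). All names and lengths are invented for the example,
and the <code>output xml</code> definition from the Samples chapter is assumed to exist in the same configuration file:
</p>
<div class="example">
<pre class="example">structure transfer {
    type fixed
    output xml
    record header {
        # byte position 1 must contain H
        id 1 H
        field RecType 1
        field FileDate 8
    }
    record data {
        id 1 D
        field RecType 1
        field Account 10
        field Amount 8
    }
    record trailer {
        id 1 T
        field RecType 1
        field RowCount 6
    }
}
</pre></div>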
  562 </dd>
  563 <dt><code>field <var>name</var>|FILLER|* [<var>length</var>]|* [<var>lookup</var>]|* [<var>output</var>]|* [<var>filter</var>]|* [<var>conversion</var>]</code></dt>
  564 <dd><p>Defines a field in a text input structure. <var>length</var> is mandatory for a fixed length input structure.
  565 </p>
  566 <p>The last field of a fixed length input structure can have a <em>*</em> in place of <var>length</var>. This means that the last field
  567 has no exact length specified and gets the remainder of the input line after all other fields. This allows a
  568 fixed record to have an arbitrarily long last field.
  569 </p>
  570 <p>Length is also used for printing the fields in fixed length format (directive <code>%D</code> in output definitions).
  571 </p>
  572 <p>If <em>*</em> is given instead of the name, then the <var>name</var> will be the ordinal number of the field,
  573 or if the <code>header</code> option has value <em>first</em> or <em>all</em>, then the name of the field will be taken from
  574 the header line (first line of the input).
  575 </p>
  576 <p>If <var>lookup</var> is given, the field&rsquo;s contents are used to make a lookup in lookup table <var>lookup</var>.
  577 If <var>length</var> is not needed (separated format) but lookup is needed, use an asterisk (*) in place of the length definition.
  578 </p>
  579 <p>If <var>output</var> is given, the field will be printed using output definition <var>output</var>. If <var>length</var> and/or <var>lookup</var> are not needed, use an asterisk (*) in place of them.
  580 </p>
  581 <p>If <var>filter</var> is given, the raw contents of the field are filtered through a program defined by <var>filter</var>, and the output of the program is printed as the field contents.
  582 </p>
  583 <p>If <var>conversion</var> is given, it should contain a single printf style conversion specification, which will be used in printing. The conversion specification must start with <code>%</code> and the last character must be from the set <code>diuoxXfeEgGcs</code>.
  584 </p>
  585 <p>If a field is named <code>FILLER</code>, the field will not appear in output.
  586 </p>
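<p>The following sketch shows some of these forms together. The record, field and lookup names are hypothetical, and
the lookup table <code>countries</code> is assumed to be defined elsewhere in the configuration file:
</p>
<div class="example">
<pre class="example"># fixed length record: two hidden filler columns and an open-ended last field
record logline {
    field Date    10
    field FILLER  1
    field Level   5
    field FILLER  1
    field Message *
}

# separated record: no length is needed, so * stands in for it before the lookup name
record customer {
    field Name
    field Country * countries
}
</pre></div>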
  587 <p>The order of fields in the configuration file is essential: it specifies the field order in a record.
  588 </p></dd>
  589 <dt><code>field <var>name</var>|FILLER|* <var>length</var>|<var>type</var> [<var>lookup</var>]|* [<var>output</var>]|* [<var>filter</var>]|* [<var>conversion</var>]</code></dt>
  590 <dd><p>Defines a field in a binary structure. All other features are the same as for text structure fields except the <var>type</var> parameter.
  591 </p>
  592 <p><var>type</var> specifies the field length and type and can have the following values:
  593 </p>
  594 <dl compact="compact">
  595 <dt><code>char</code></dt>
  596 <dd><p>Printable character.
  597 </p></dd>
  598 <dt><code>short</code></dt>
  599 <dd><p>Short integer having current system length and byte order.
  600 </p></dd>
  601 <dt><code>int</code></dt>
  602 <dd><p>Integer having current system length and byte order.
  603 </p></dd>
  604 <dt><code>long</code></dt>
  605 <dd><p>Long integer having current system length and byte order.
  606 </p></dd>
  607 <dt><code>llong</code></dt>
  608 <dd><p>Long long integer having current system length and byte order.
  609 </p></dd>
  610 <dt><code>ushort</code></dt>
  611 <dd><p>Unsigned short integer having current system length and byte order.
  612 </p></dd>
  613 <dt><code>uint</code></dt>
  614 <dd><p>Unsigned integer having current system length and byte order.
  615 </p></dd>
  616 <dt><code>ulong</code></dt>
  617 <dd><p>Unsigned long integer having current system length and byte order.
  618 </p></dd>
  619 <dt><code>ullong</code></dt>
  620 <dd><p>Unsigned long long integer having current system length and byte order.
  621 </p></dd>
  622 <dt><code>int8</code></dt>
  623 <dd><p>8 bit integer.
  624 </p></dd>
  625 <dt><code>int16_be</code></dt>
  626 <dd><p>Big endian 16 bit integer.
  627 </p></dd>
  628 <dt><code>int32_be</code></dt>
  629 <dd><p>Big endian 32 bit integer.
  630 </p></dd>
  631 <dt><code>int64_be</code></dt>
  632 <dd><p>Big endian 64 bit integer.
  633 </p></dd>
  634 <dt><code>int16_le</code></dt>
  635 <dd><p>Little endian 16 bit integer.
  636 </p></dd>
  637 <dt><code>int32_le</code></dt>
  638 <dd><p>Little endian 32 bit integer.
  639 </p></dd>
  640 <dt><code>int64_le</code></dt>
  641 <dd><p>Little endian 64 bit integer.
  642 </p></dd>
  643 <dt><code>uint8</code></dt>
  644 <dd><p>Unsigned 8 bit integer.
  645 </p></dd>
  646 <dt><code>uint16_be</code></dt>
  647 <dd><p>Unsigned big endian 16 bit integer.
  648 </p></dd>
  649 <dt><code>uint32_be</code></dt>
  650 <dd><p>Unsigned big endian 32 bit integer.
  651 </p></dd>
  652 <dt><code>uint64_be</code></dt>
  653 <dd><p>Unsigned big endian 64 bit integer.
  654 </p></dd>
  655 <dt><code>uint16_le</code></dt>
  656 <dd><p>Unsigned little endian 16 bit integer.
  657 </p></dd>
  658 <dt><code>uint32_le</code></dt>
  659 <dd><p>Unsigned little endian 32 bit integer.
  660 </p></dd>
  661 <dt><code>uint64_le</code></dt>
  662 <dd><p>Unsigned little endian 64 bit integer.
  663 </p></dd>
  664 <dt><code>float</code></dt>
  665 <dd><p>Float having current system length and byte order.
  666 </p></dd>
  667 <dt><code>float_be</code></dt>
  668 <dd><p>Float having current system length and big endian byte order.
  669 </p></dd>
  670 <dt><code>float_le</code></dt>
  671 <dd><p>Float having current system length and little endian byte order.
  672 </p></dd>
  673 <dt><code>double</code></dt>
  674 <dd><p>Double having current system length and byte order.
  675 </p></dd>
  676 <dt><code>double_be</code></dt>
  677 <dd><p>Double having current system length and big endian byte order.
  678 </p></dd>
  679 <dt><code>double_le</code></dt>
  680 <dd><p>Double having current system length and little endian byte order.
  681 </p></dd>
  682 <dt><code>bcd_be_<var>len</var></code></dt>
  683 <dd><p>Bcd number having length <var>len</var> and nybbles in big endian order.
  684 </p></dd>
  685 <dt><code>bcd_le_<var>len</var></code></dt>
  686 <dd><p>Bcd number having length <var>len</var> and nybbles in little endian order.
  687 </p></dd>
  688 <dt><code>hex_be_<var>len</var></code></dt>
  689 <dd><p>Hexadecimal data in big endian order having length <var>len</var>.
  690 </p></dd>
  691 <dt><code>hex_le_<var>len</var></code></dt>
  692 <dd><p>Hexadecimal data in little endian order having length <var>len</var>.
  693 </p></dd>
  694 </dl>
  695 
  696 <p>If <var>length</var> is given instead of the <var>type</var>, then the field is assumed to be a printable string having length <var>length</var>. The string is printed until <var>length</var> characters are printed or a NUL character is found.
  697 </p>
  698 <p>A BCD number (<code>bcd_be_<var>len</var></code> and <code>bcd_le_<var>len</var></code>) is printed until <var>len</var> bytes are read or a nybble having hexadecimal value <code>f</code> is found.
  699 A BCD number in big endian order is printed with the most significant nybble first and the least significant nybble second; a BCD number in little endian order is printed with the least significant nybble first and the most significant nybble second. Bytes are always read in big endian order.
  700 </p>
  701 <p>Hexadecimal data (<code>hex_be_<var>len</var></code> and <code>hex_le_<var>len</var></code>) is printed as hexadecimal values. Big endian data is printed starting from lower address and little endian data starting from upper address.
  702 </p>
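<p>A sketch of a single-record binary structure using some of the types above. The structure, record and field names are
invented for the example, and the <code>output xml</code> definition from the Samples chapter is assumed to be available. No
<code>id</code> is needed because the structure has only one record:
</p>
<div class="example">
<pre class="example">structure sensor {
    type binary
    output xml
    record reading {
        # unsigned 32 bit little endian integer
        field Id      uint32_le
        # unsigned 8 bit integer
        field Flags   uint8
        # little endian double
        field Value   double_le
        # a length instead of a type: printable string of at most 16 characters
        field Label   16
        # BCD number occupying 4 bytes, most significant nybble first
        field Serial  bcd_be_4
    }
}
</pre></div>
<p>Going by the description above, a <code>bcd_be_4</code> field holding the bytes <code>12 34 5f ff</code> would be expected to print
as <code>12345</code>, because printing stops at the first nybble having value <code>f</code>.
</p>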
  703 </dd>
  704 <dt><code>field-count <var>number</var></code></dt>
  705 <dd><p>Same effect as having &quot;<code>field *</code>&quot; <var>number</var> times. This can be used in separated structure instead of
  706 writing sequential &quot;<code>field *</code>&quot; definitions. Several <code>field-count</code>s can be used in the same record and
  707 they can be mixed with <code>field</code>.
  708 </p></dd>
  709 <dt><code>fields-from <var>record</var></code></dt>
  710 <dd><p>Fields in this record are the same as in record <var>record</var>. <code>field</code> and <code>fields-from</code> are mutually
  711 exclusive.
  712 </p></dd>
  713 <dt><code>output <var>name</var>|no|raw</code></dt>
  714 <dd><p>This record is printed according to output format <var>name</var>. Default is to use output format specified in structure.
  715 </p></dd>
  716 <dt><code>level <var>number</var> [<var>element_name</var>|*] [<var>group_name</var>]</code></dt>
<dd><p>Levels can be used to print the file as a hierarchical, multi-level nested document.
<var>number</var> is the level of the record, starting from one (the highest level),
<var>element_name</var> is the name for the record, and <var>group_name</var>
is used to group records in the same and lower levels. Only <var>number</var> is mandatory.
Use <code>*</code> instead of the element name if only a group name is needed.
  722 </p></dd>
  723 <dt><code>record-length strict|minimum</code></dt>
  724 <dd><dl compact="compact">
  725 <dt><code>strict</code></dt>
<dd><p>The input record length (fixed format) or field count (separated format) must match the record definition for the record to be processed. This is the default value.
  727 </p></dd>
  728 <dt><code>minimum</code></dt>
<dd><p>The input record length or field count can be the same as or greater than the one defined for the record. The rest of the input line is ignored.
  730 </p></dd>
  731 </dl>
  732 </dd>
  733 <dt><code>variable-length <var>record_length</var> <var>variable_length_field</var> <var>adjust</var></code></dt>
<dd><p><var>record_length</var> and <var>variable_length_field</var> are the names of two fields in the record and <var>adjust</var> is a signed integer.
The record length is read from field <var>record_length</var>, which is assumed to be an integer type in binary structures or to contain only decimal digits in fixed length structures.
<var>record_length</var> is assumed to contain the total length of the record.
<var>variable_length_field</var> is the field having variable length. Its length is calculated by subtracting the total length of all the other fields from the length read from <var>record_length</var>.
<br>
The length given by keyword <code>field</code> for <var>variable_length_field</var> is ignored. After the length has been calculated, it is adjusted by <var>adjust</var>. <var>adjust</var> can be used in cases where the length read from <var>record_length</var> does not contain the total length of the record. <code>variable-length</code> can be used with binary or fixed length structures only.
  740 </p></dd>
  741 </dl>
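<p>As a minimal sketch of how these record options can be combined, the two hypothetical structures below use <code>field-count</code>, <code>record-length</code>, <code>level</code> and <code>variable-length</code>. All structure, record and field names are invented for illustration, and the comment lines assume that <code>#</code> starts a comment in the configuration file.
</p>
<div class="example">
<pre class="example"># Separated input: an order line (level 1) followed by its item lines (level 2).
structure orders_sep {
    type separated ,
    record order {
        id 1 O
        field type
        field OrderId
        # Same effect as writing &quot;field *&quot; twice.
        field-count 2
        # Additional trailing fields on the input line are ignored.
        record-length minimum
        level 1 order
    }
    record item {
        id 1 I
        field type
        field ItemId
        field Quantity
        # Element name &quot;item&quot;, grouped under the group name &quot;items&quot;.
        level 2 item items
    }
}

# Fixed length input where MsgLen holds the total record length.
structure messages_fix {
    type fixed
    record message {
        field MsgLen 4
        field MsgType 2
        # The length 1 given here is ignored because of variable-length below.
        field Body 1
        variable-length MsgLen Body 0
    }
}
</pre></div>

<p>With the second structure, a record whose <code>MsgLen</code> field contains 0020 would give <code>Body</code> a length of 20 - (4 + 2) = 14 bytes before <var>adjust</var> is applied.
</p>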
  742 
  743 <a name="Output"></a>
  744 <h4 class="subheading">Output</h4>
<p>Keyword <code>output</code> specifies an output format, which controls how the input data is printed. Formatting
is controlled using options and printf style directives. An output definition is independent
of the structure, so one output format can be used with different input file formats.
  748 </p>
  749 <p>The output notation:
  750 <br>
  751 </p>
  752 <div class="example">
  753 <pre class="example">output <var>name</var> {
  754     <i>option value</i> &hellip;
  755     &hellip;
  756 }
  757 </pre></div>
  758 
<p>Actual formatting and printing is controlled using <em>pictures</em> in output options. Pictures can contain
the following printf style directives:
  761 </p>
  762 <dl compact="compact">
  763 <dt><code>%f</code></dt>
  764 <dd><p>Name of the input file.
  765 </p></dd>
  766 <dt><code>%s</code></dt>
  767 <dd><p>Name of the current structure.
  768 </p></dd>
  769 <dt><code>%r</code></dt>
  770 <dd><p>Name of the current record.
  771 </p></dd>
  772 <dt><code>%o</code></dt>
  773 <dd><p>Input record number in current file.
  774 </p></dd>
  775 <dt><code>%O</code></dt>
  776 <dd><p>Input record number starting from the first file.
  777 </p></dd>
  778 <dt><code>%i</code></dt>
  779 <dd><p>Byte offset of the current record in the current file. Starts from zero.
  780 </p></dd>
  781 <dt><code>%I</code></dt>
  782 <dd><p>Byte offset of the current record starting from the first file. Starts from zero.
  783 </p></dd>
  784 <dt><code>%n</code></dt>
  785 <dd><p>Field name.
  786 </p></dd>
  787 <dt><code>%t</code></dt>
<dd><p>Field contents, without leading and trailing whitespace.
  789 </p></dd>
  790 <dt><code>%d</code></dt>
  791 <dd><p>Field contents. Binary integer is printed as a decimal value. Floating point number is printed in the style <code>[-]ddd.ddd</code>, where the number of digits after the decimal-point character is 6. Bcd number is printed as a decimal number and hexadecimal data as consecutive hexadecimal values.
  792 </p></dd>
  793 <dt><code>%D</code></dt>
  794 <dd><p>Field contents, right padded to the field length (requires length definition for the field).
  795 </p></dd>
  796 <dt><code>%C</code></dt>
<dd><p>Field contents, right padded to the field length (requires length definition for the field). The contents are cut if the input field
is longer than the output length.
  799 </p></dd>
  800 <dt><code>%x</code></dt>
<dd><p>Unsigned hexadecimal value of a binary integer. Other fields are printed as if directive <code>%d</code> were used.
  802 </p></dd>
  803 <dt><code>%l</code></dt>
  804 <dd><p>Lookup value which has been found using current field as a search key.
  805 </p></dd>
  806 <dt><code>%L</code></dt>
  807 <dd><p>Lookup value, right padded to the field length.
  808 </p></dd>
  809 <dt><code>%p</code></dt>
<dd><p>Field&rsquo;s start position in a record. For fixed and binary structures this is the field&rsquo;s byte position in the input line,
and for separated structures it is the ordinal number of the field. Starts from one.
  812 </p></dd>
  813 <dt><code>%h</code></dt>
  814 <dd><p>Hexadecimal dump of a field. Byte values are printed as consecutive <code>xnn</code> values, where the <code>nn</code> is the hexadecimal value of a byte. Data is printed before any endian conversion.
  815 </p></dd>
  816 <dt><code>%e</code></dt>
<dd><p>Prints nothing, but still causes the &quot;field empty&quot; check to be performed.
Can be used when only the names of non-empty fields should be printed.
  819 </p></dd>
  820 <dt><code>%g</code></dt>
<dd><p>Group name given by <var>group_name</var> of keyword <code>level</code> in the record definition.
</p></dd>
<dt><code>%m</code></dt>
<dd><p>Element name given by <var>element_name</var> of keyword <code>level</code> in the record definition.
  825 </p></dd>
  826 <dt><code>%%</code></dt>
  827 <dd><p>Percent sign.
  828 </p></dd>
  829 </dl>
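<p>To illustrate how the directives combine in pictures, here is a sketch of an output that prints every field with its position, name, contents and a hexadecimal dump. The output name <code>debug</code> is hypothetical; the options <code>record_header</code>, <code>data</code> and <code>indent</code> are described in the list of output options.
</p>
<div class="example">
<pre class="example">output debug {
    record_header &quot;%r: record %o of %f, offset %i\n&quot;
    data &quot;%p %n = %d (%h)\n&quot;
    indent &quot; &quot;
}
</pre></div>

<p>Record-related directives (<code>%r</code>, <code>%o</code>, <code>%f</code>, <code>%i</code>) are used in the record header and field-related directives (<code>%p</code>, <code>%n</code>, <code>%d</code>, <code>%h</code>) in the data picture.
</p>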
  830 
  831 <p>Output options:
  832 </p><dl compact="compact">
  833 <dt><code>file_header <var>picture</var></code></dt>
  834 <dd><p><var>picture</var> is printed once before file contents.
  835 </p></dd>
  836 <dt><code>file_trailer <var>picture</var></code></dt>
  837 <dd><p><var>picture</var> is printed once after file contents.
  838 </p></dd>
  839 <dt><code>header <var>picture</var></code></dt>
<dd><p>If given, a header line containing the field names is printed before the records.
Every field name is printed according to <var>picture</var>, using the same separator and field length as
given for the fields. The picture can contain only the <code>%n</code> directive.
  843 </p></dd>
  844 <dt><code>data <var>picture</var></code></dt>
<dd><p>Field contents are printed according to <var>picture</var>.
  846 </p></dd>
  847 <dt><code>lookup <var>picture</var></code></dt>
<dd><p>If the current field is related to a lookup table, this <var>picture</var> is used instead of the picture from <code>data</code>.
This makes it possible to use a different picture for fields that are related to a lookup table. Default is to use the picture from <code>data</code>.
  850 </p></dd>
  851 <dt><code>separator <var>string</var></code></dt>
  852 <dd><p>All fields are terminated by <var>string</var>, except the last field of the record.
  853 Default is not to print separator.
  854 </p></dd>
  855 <dt><code>record_header <var>picture</var></code></dt>
  856 <dd><p><var>picture</var> is printed before the record content. Default is not to print the record header.
  857 </p></dd>
  858 <dt><code>record_trailer <var>picture</var></code></dt>
  859 <dd><p><var>picture</var> is printed after the record content. Default is newline.
  860 </p></dd>
  861 <dt><code>justify left|right|<var>char</var></code></dt>
<dd><p>The output from the <code>data</code> option is left or right justified.
<var>char</var> justifies the output according to the first occurrence of <var>char</var>
in the data picture. Default is left.
  865 </p></dd>
  866 <dt><code>indent <var>string</var></code></dt>
<dd><p>Record contents are indented by <var>string</var>.
Field contents are indented by two times the string. Default is not to indent.
If file contents are printed in hierarchical form (keyword <code>level</code> in record definition), then
the contents are indented according to the level of a record.
  871 </p></dd>
  872 <dt><code>field-list <var>name1</var>,<var>name2</var>,&hellip;</code></dt>
<dd><p>Only fields and constants named <var>name1</var>,<var>name2</var>,&hellip; are printed; this has the same effect as option <samp>-f</samp>.
Default is to print all fields and no constants. Fields and constants are printed in the order in which they are listed.
  875 </p></dd>
  876 <dt><code>no-data-print yes|no</code></dt>
<dd><p>If <code>field-list</code> is given, this option is set to no, and none of the fields in <code>field-list</code>
belongs to the current record, then the <code>record_header</code> and <code>record_trailer</code> are not printed.
Default is yes.
  880 </p></dd>
  881 <dt><code>field-empty-print yes|no</code></dt>
  882 <dd><p>When set as no, nothing is printed for the fields which consist entirely of characters from <code>empty-chars</code>. 
  883 If none of the fields of a record are printed, then the printing of <code>record_trailer</code> is also suppressed. 
  884 Default is yes.
  885 </p></dd>
  886 <dt><code>empty-chars <var>string</var></code></dt>
<dd><p><var>string</var> specifies the set of characters which constitute an &quot;empty&quot; field. Default is
  888 &quot;&nbsp;\f\n\r\t\v&quot;<!-- /@w --> (space, form-feed, newline, carriage return, horizontal tab and vertical tab).
  889 </p></dd>
  890 <dt><code>output-file <var>file</var></code></dt>
  891 <dd><p>Output is written to <var>file</var> instead of the default output (standard output or given by <samp>-o, --output</samp>). 
  892 If - is given the output is written to standard output.
  893 </p></dd>
  894 <dt><code>group_header <var>picture</var></code></dt>
  895 <dd><p>If a record has a level and a group name defined, 
  896 <var>picture</var> is printed before the first record in a group or if the group name has changed in the same level.
  897 <strong>Note</strong>: Level related pictures can contain printing directives <code>%g</code> and <code>%n</code> only.
  898 </p></dd>
  899 <dt><code>group_trailer <var>picture</var></code></dt>
  900 <dd><p>If a record has a level and a group name defined, 
  901 <var>picture</var> is printed after the records in lower levels are printed or if the group name has changed in the 
  902 same level or if a higher level record is found.
  903 </p></dd>
  904 <dt><code>element_header <var>picture</var></code></dt>
<dd><p>If a record has a level and an element name defined, <var>picture</var> is printed before the record&rsquo;s contents.
  906 </p></dd>
  907 <dt><code>element_trailer <var>picture</var></code></dt>
<dd><p>If a record has a level and an element name defined, <var>picture</var> is printed after the record&rsquo;s contents or after
the following lower level records.
  910 </p></dd>
  911 <dt><code>hex-caps yes|no</code></dt>
  912 <dd><p>Print hexadecimal numbers in capital letters. Default is no.
  913 </p></dd>
  914 </dl>
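<p>As a sketch of how several output options work together, the following hypothetical output <code>csv</code> prints selected fields separated by semicolons, preceded by a line of field names. The field names are those of the personnel examples in chapter 4; <code>record_trailer</code> is left at its default (newline).
</p>
<div class="example">
<pre class="example">output csv {
    header &quot;%n&quot;
    data &quot;%t&quot;
    separator &quot;;&quot;
    field-list FirstName,LastName,Age
    no-data-print no
}
</pre></div>

<p>Setting <code>no-data-print</code> to no should suppress the record trailer for records that contain none of the listed fields, so such records do not produce empty lines.
</p>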
  915 
  916 <a name="Lookup"></a>
  917 <h4 class="subheading">Lookup</h4>
  918 <p>Keyword <code>lookup</code> specifies a lookup table which can be searched using field contents. Found values can
  919 be printed using output directives <code>%l</code> and <code>%L</code>.
  920 </p>
  921 <p>The lookup table notation:
  922 <br>
  923 </p>
  924 <div class="example">
  925 <pre class="example">lookup <var>name</var> {
  926     <i>option value</i> &hellip;
  927     &hellip;
  928 }
  929 </pre></div>
  930 
  931 <p>Lookup options:
  932 </p><dl compact="compact">
  933 <dt><code>search exact | longest</code></dt>
  934 <dd><p>Search method for this table. Either exact or longest match is used when searching the table. Default is <code>exact</code>.
  935 </p></dd>
  936 <dt><code>pair <var>key</var> <var>value</var></code></dt>
<dd><p>Defines a key/value pair for the lookup table. For a binary file, <var>key</var> must have the same representation as
is printed by the <code>%d</code> printing directive.
  939 </p></dd>
  940 <dt><code>file <var>name</var> [<var>separator</var>]</code></dt>
  941 <dd><p>Data for the lookup table is read from file <var>name</var>. Each line in file <var>name</var> is considered as a key/value pair
  942 separated by a single character <var>separator</var>. Default separator is semicolon. Lines without separator are silently omitted.
<strong>Note</strong>: The file size is limited by available memory because the file contents are loaded into memory.
  944 </p></dd>
  945 <dt><code>default-value <var>value</var></code></dt>
  946 <dd><p>If searching the lookup table is unsuccessful then <var>value</var> is used in printing. Default is empty string.
  947 </p></dd>
  948 </dl>
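<p>A minimal sketch of a lookup table and an output that prints the looked-up value follows. The names <code>emp_type</code> and <code>with_lookup</code> are hypothetical, and the comment lines assume that <code>#</code> starts a comment. How the table is attached to a field is part of the <code>field</code> syntax; the position shown in the comment (directly after the length) is an assumption.
</p>
<div class="example">
<pre class="example">lookup emp_type {
    search exact
    pair E Employee
    pair B Boss
    default-value Unknown
}

# In a record definition the table would be attached to a field, e.g.
#     field EmpType 1 emp_type
# (assumes the lookup name follows the field length)

output with_lookup {
    data &quot;%n=%t\n&quot;
    lookup &quot;%n=%t (%l)\n&quot;
}
</pre></div>

<p>Fields without a lookup table use the <code>data</code> picture, while fields attached to <code>emp_type</code> use the <code>lookup</code> picture and print the found value in parentheses.
</p>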
  949 
  950 <a name="Constants"></a>
  951 <h4 class="subheading">Constants</h4>
  952 <p>Keyword <code>const</code> specifies one name/value pair which can be used as an additional output field.
  953 Constants can be used only in field lists (option <samp>-f,--field-list</samp>, or output option <code>field-list</code>).
  954 </p>
  955 <p>Constants can be used to add fields to output which do not appear in input. E.g. new fields for 
  956 separated output or adding spaces after a fixed length field (changing the field length).
  957 </p>
  958 <p>Note that <var>value</var> is printed as it is for every record. It cannot be changed record by record.
  959 </p>
  960 <p>If a constant has the same name as one of the input fields, the value <var>value</var> is printed instead of
  961 the input field contents.
  962 </p>
  963 <p>The constant notation:
  964 <br>
  965 </p>
  966 <div class="example">
  967 <pre class="example">const <var>name</var> <var>value</var>
  968 </pre></div>
  969 
  970 <p>When <var>name</var> appears in field list it is treated as one of the input fields having contents <var>value</var>.
  971 </p>
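<p>For example, the following sketch adds a constant column to a comma separated output. The constant name <code>Country</code>, its value and the output name <code>with_country</code> are hypothetical; the other field names come from the personnel examples in chapter 4.
</p>
<div class="example">
<pre class="example">const Country FI

output with_country {
    data &quot;%t&quot;
    separator &quot;,&quot;
    field-list FirstName,LastName,Country
    no-data-print no
}
</pre></div>

<p>Because constants are printed only when they appear in a field list, <code>Country</code> must be named in <code>field-list</code> (or in option <samp>-f</samp>).
</p>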
  972 <a name="Filter"></a>
  973 <h4 class="subheading">Filter</h4>
<p>Keyword <code>filter</code> defines a command that can be used to format a field&rsquo;s raw contents. The command must read standard input
and write to standard output, and it must not block. The field&rsquo;s raw contents are filtered through the command and the output is printed as
the field contents.
  977 </p>
  978 <p>The filter notation:
  979 <br>
  980 </p><div class="example">
  981 <pre class="example">filter <var>name</var> <var>command</var>
  982 </pre></div>
  983 
<p><var>name</var> is referred to in the field definition. <var>command</var> is the shell command to be executed.
  985 </p>
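<p>A minimal sketch of a filter that converts a field to upper case using <code>tr</code> is shown below. The filter name <code>upper</code> is hypothetical. The command is quoted on the assumption that values containing spaces need quotes, as pictures do, and the position of the filter name in the <code>field</code> line shown in the comment is likewise an assumption.
</p>
<div class="example">
<pre class="example">filter upper &quot;tr a-z A-Z&quot;

# In a record definition the filter would be referred to by name, e.g.
#     field LastName 13 * * upper
# (assumes the order: name, length, lookup, output, filter)
</pre></div>

<p><code>tr</code> reads standard input until end of file and writes to standard output, which matches the requirements stated above.
</p>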
  986 <a name="Anonymization"></a>
  987 <h4 class="subheading">Anonymization</h4>
<p>Keyword <code>anonymize</code> defines a set of fields which will be anonymized when command line option <samp>-A,--anonymize</samp>
is given. <code>ffe</code> uses non-reversible anonymization methods and preserves the original field length.
  990 </p>
  991 <p>Notation:
  992 <br>
  993 </p><div class="example">
  994 <pre class="example">anonymize <var>name</var> {
  995     <i>method</i> &hellip;
  996     &hellip;
  997 }
  998 </pre></div>
  999 <p>The anonymization will be done if command line option <samp>-A,--anonymize</samp> is given with <var>name</var>.
 1000 Anonymize options:
 1001 </p><dl compact="compact">
 1002 <dt><code>method <var>field</var> <var>method</var> <var>start</var> <var>length</var> <var>parameter</var></code></dt>
<dd><p>All fields named <var>field</var> in the current structure will be anonymized using method <var>method</var>.
By default the whole field is anonymized. Parts of the field can be left non-anonymized using
<var>start</var> and <var>length</var>. <var>start</var> is the byte position where the anonymization starts; the first byte is number 1.
If <var>start</var> is negative, the anonymization starts from the end of the field.
If <var>length</var> is given, <var>length</var> bytes are anonymized from the start position; the default value 0 means the rest of the field.
Only <var>field</var> and <var>method</var> are mandatory.
 1009 <br>
 1010 <br>
 1011 Values for <var>method</var>:
 1012 </p><dl compact="compact">
 1013 <dt><code>MASK</code></dt>
<dd><p>Field will be masked with character &rsquo;0&rsquo;. A different character can be given with <var>parameter</var>.
 1015 </p></dd>
 1016 <dt><code>RANDOM</code></dt>
 1017 <dt><code>NRANDOM</code></dt>
 1018 <dd><p>Field will be filled with randomly selected bytes.
 1019 </p></dd>
 1020 <dt><code>HASH</code></dt>
 1021 <dt><code>NHASH</code></dt>
<dd><p>Field will be filled with data from a hash calculated from the original field.
This method always yields the same result for the same input. The hash length in bytes can be given with <var>parameter</var>.
The default hash length is 16; valid values for the hash length are 16, 32 and 64.
 1025 </p></dd>
 1026 </dl>
 1027 <p>Methods RANDOM and HASH use characters <code>0-9,A-Z,a-z</code> and space for text fields. Methods NRANDOM and NHASH use only characters <code>0-9</code>. 
 1028 For binary fields all byte values are used. BCD coded fields are always filled with BCD values <code>0-9</code>. 
 1029 </p></dd>
 1030 </dl>
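<p>As a sketch, the following hypothetical definition <code>hide_names</code> anonymizes three fields of the personnel examples in chapter 4. The comment lines assume that <code>#</code> starts a comment in the configuration file.
</p>
<div class="example">
<pre class="example">anonymize hide_names {
    # Keep the first byte, mask the rest of the field with the character x.
    method FirstName MASK 2 0 x
    # Replace the whole field with data from a hash of the original contents.
    method LastName HASH
    # Replace the whole field with random digits 0-9.
    method Age NRANDOM
}
</pre></div>

<p>The definition takes effect only when <code>ffe</code> is run with option <samp>-A,--anonymize</samp> and the name <code>hide_names</code>.
</p>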
 1031 
 1032 <a name="Command-Substitution"></a>
 1033 <h4 class="subheading">Command Substitution</h4>
<p>Command substitution allows the output of a command to replace parts of the configuration file. The syntax for
command substitution is:
<br>
<br>
`<code>command</code>`
<br>
<br>
The <code>command</code> is executed and `<code>command</code>` is replaced by the standard output of
the command, with any trailing newlines deleted. Command substitutions may not be nested.
 1043 </p>
<p>Before executing the <code>command</code>, <code>ffe</code> sets the following environment variables:
 1045 </p><dl compact="compact">
 1046 <dt><code>FFE_STRUCTURE</code></dt>
 1047 <dd><p>The name of the structure from <samp>-s,--structure</samp>.
 1048 </p></dd>
 1049 <dt><code>FFE_OUTPUT</code></dt>
 1050 <dd><p>The name of the output file from <samp>-o,--output</samp>.
 1051 </p></dd>
 1052 <dt><code>FFE_FORMAT</code></dt>
 1053 <dd><p>The name of the output format from <samp>-p,--print</samp>.
 1054 </p></dd>
 1055 <dt><code>FFE_FIRST_FILE</code></dt>
 1056 <dd><p>The name of the first input file.
 1057 </p></dd>
 1058 <dt><code>FFE_FILES</code></dt>
 1059 <dd><p>A space-separated list of all input files.
 1060 </p></dd>
 1061 </dl>
<p>If a variable is already set, it is not replaced.
 1063 </p>
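<p>As a sketch, command substitution could be used to define constants whose values are computed when the configuration file is read. The constant names are hypothetical, the delimiters are assumed to be backquotes, and the second line assumes that the command is run through a shell, so that <code>$FFE_FIRST_FILE</code> is expanded.
</p>
<div class="example">
<pre class="example">const LoadDate `date +%Y-%m-%d`
const SourceFile `basename $FFE_FIRST_FILE`
</pre></div>

<p>Both constants can then be added to the output by naming them in a field list.
</p>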
 1064 <a name="Input-Preprocessor"></a>
 1065 <h4 class="subheading">Input Preprocessor</h4>
<p>It is possible to define an input preprocessor for <code>ffe</code>. An input preprocessor is simply an executable program
which writes the contents of the input file to standard output, which is then read by <code>ffe</code>. If the input preprocessor
does not write any characters to its standard output, <code>ffe</code> uses the original file.
 1069 </p>
<p>To set up an input preprocessor, set the <code>FFEOPEN</code> environment variable to a command line which will invoke your input preprocessor.
This command line should include one occurrence of the string <code>%s</code>,
which will be replaced by the input filename when the input preprocessor command is invoked.
 1073 </p>
 1074 <p>The input preprocessor is not used if <code>ffe</code> is reading standard input.
 1075 </p>
<p>A convenient way is to use <code>lesspipe</code> (or <code>lesspipe.sh</code>), which is available on many UNIX systems, for example:
 1077 <br>
 1078 </p><div class="example">
 1079 <pre class="example">export FFEOPEN=&quot;/usr/bin/lesspipe %s&quot;
 1080 </pre></div>
 1081 
<p>Using the example above, it is possible to give a zipped input file to <code>ffe</code>; the input preprocessor will unzip the
file before it is processed by <code>ffe</code>.
 1084 </p>
 1085 <hr>
 1086 <a name="Guessing"></a>
 1087 <a name="Guessing-1"></a>
 1088 <h3 class="section">3.3 Guessing</h3>
 1089 <a name="index-guess"></a>
 1090 <p>If <samp>-s</samp> is not given, <code>ffe</code> tries to guess the input structure. 
 1091 </p>
<p>When guessing binary data, <code>ffe</code> reads the first block of input data and tries to match the structure definitions
from the configuration file to that block. The input block size is the maximum binary block size found in the configuration file.
</p>
<p>When guessing text data, <code>ffe</code> reads the first 10 000 lines or 1 MB of input data and tries to match the structure definitions
from the configuration file to the input stream. If all lines match one and only one structure, that structure is used
for reading the input file.
 1098 </p>
<p>Guessing uses the following execution cycle:
 1100 </p>
 1101 <ol>
<li> An input line or a binary block is read.
</li><li> All record <code>id</code>s are compared to the input data. If all <code>id</code>s of a record match
the input data and the
record&rsquo;s line length matches the total length (or total count for a separated structure) of the fields,
the record is considered to match the input line. If there are no <code>id</code>s,
only the line length or field count is checked. In case of binary data only <code>id</code>s are used in matching.
 1108 </li><li> In case of text data: If all lines match at least one of the records in a particular structure, the structure is considered as selected. 
 1109 There must be only one structure matching all lines used for guessing.
 1110 
 1111 <p>In case of binary data: If the first block matches at least one record of a structure, the structure is considered as selected. Only one structure must match.
 1112 </p>
 1113 </li></ol>
 1114 
 1115 
 1116 <hr>
 1117 <a name="Limits"></a>
 1118 <a name="Limitations"></a>
 1119 <h3 class="section">3.4 Limitations</h3>
 1120 <a name="index-big-files"></a>
 1121 <a name="index-limits"></a>
 1122 
<p>At least on GNU/Linux, <code>ffe</code> should be able to handle big files (&gt; 4 GB); other
systems have not been tested.
 1125 </p>
<p>Regular expressions can be used with operator <strong>?</strong> in option <samp>-e</samp>, <samp>--expression</samp> and in record keyword <code>rid</code> only on systems where
regular expression functions (regcomp, regexec, &hellip;) are available.
 1128 </p>
 1129 <hr>
 1130 <a name="ffe-configuration"></a>
 1131 <a name="How-ffe-works"></a>
 1132 <h2 class="chapter">4 How <code>ffe</code> works</h2>
<p>The following examples use two different input files:
 1134 </p><a name="Fixed-length-example"></a>
 1135 <h4 class="subheading">Fixed length example</h4>
<p>A fixed length personnel file with a header and a trailer. Each line (record) is identified by its
first byte (H = Header, E = Employee, B = Boss, T = Trailer).
 1138 </p><div class="example">
 1139 <pre class="example">$cat personnel.fix
 1140 H2006-02-25
 1141 EJohn     Ripper       23
 1142 BScott    Tiger        45
 1143 EMary     Moore        41
 1144 ERidge    Forrester    31
 1145 T0004
 1146 $
 1147 </pre></div>
 1148 
<p>Structure for reading the file above. Note that record &lsquo;<samp>boss</samp>&rsquo; reuses the fields from &lsquo;<samp>employee</samp>&rsquo;. Age will be printed three digits long, padded with zeros.
 1150 </p>
 1151 <div class="example">
 1152 <pre class="example">structure personel_fix {
 1153     type fixed
 1154     record header {
 1155         id 1 H
 1156         field type 1
 1157         field date 10
 1158     }
 1159     record employee {
 1160         id 1 E
 1161         field EmpType 1
 1162         field FirstName 9
 1163         field LastName  13
 1164         field Age 2 * * * &quot;%03d&quot;
 1165     }
 1166     record boss {
 1167         id 1 B
 1168         fields-from employee
 1169     }
 1170     record trailer {
 1171         id 1 T
 1172         field type 1
 1173         field count 4
 1174     }
 1175 }
 1176 </pre></div>
 1177 
 1178 <a name="Separated-example"></a>
 1179 <h4 class="subheading">Separated example</h4>
 1180 <p>Same file as above, but now separated by comma.
 1181 </p>
 1182 <div class="example">
 1183 <pre class="example">$cat personnel.sep
 1184 H,2006-02-25
 1185 E,john,Ripper,23
 1186 B,Scott,Tiger,45
 1187 E,Mary,Moore,41
 1188 E,Ridge,Forrester,31
 1189 T,0004
 1190 $
 1191 </pre></div>
 1192 
<p>Structure for reading the file above. Note that the field lengths are not needed in separated format. A length
is needed if the separated data is to be printed in fixed length format.
 1195 </p>
 1196 <div class="example">
 1197 <pre class="example">structure personel_sep {
 1198     type separated ,
 1199     record header {
 1200         id 1 H
 1201         field type 
 1202         field date 
 1203     }
 1204     record employee {
 1205         id 1 E
 1206         field type 
 1207         field FirstName 
 1208         field LastName 
 1209         field Age 
 1210     }
 1211     record boss {
 1212         id 1 B
 1213         fields-from employee
 1214     }
 1215     record trailer {
 1216         id 1 T
 1217         field type 
 1218         field count
 1219     }
 1220 }
 1221 </pre></div>
 1222 
 1223 <a name="Printing-in-XML-format"></a>
 1224 <h4 class="subheading">Printing in XML format</h4>
 1225 <p>Data in examples above can be printed in XML using output definition like:
 1226 </p>
 1227 <div class="example">
 1228 <pre class="example">output xml {
 1229     file_header &quot;&lt;?xml version=\&quot;1.0\&quot; encoding=\&quot;UTF-8\&quot;?&gt;\n&quot;
 1230     data &quot;&lt;%n&gt;%t&lt;/%n&gt;\n&quot;
 1231     record_header &quot;&lt;%r&gt;\n&quot;
 1232     record_trailer &quot;&lt;/%r&gt;\n&quot;
 1233     indent &quot; &quot;
 1234 }
 1235 </pre></div>
 1236 
 1237 <p>Example output using command (assuming definitions above are saved in ~/.fferc)
 1238 </p>
 1239 <p><code>ffe -p xml personnel.sep</code>
 1240 </p>
 1241 <div class="example">
 1242 <pre class="example">&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
 1243  &lt;header&gt;
 1244   &lt;type&gt;H&lt;/type&gt;
 1245   &lt;date&gt;2006-02-25&lt;/date&gt;
 1246  &lt;/header&gt;
 1247  &lt;employee&gt;
 1248   &lt;type&gt;E&lt;/type&gt;
 1249   &lt;FirstName&gt;john&lt;/FirstName&gt;
 1250   &lt;LastName&gt;Ripper&lt;/LastName&gt;
 1251   &lt;Age&gt;23&lt;/Age&gt;
 1252  &lt;/employee&gt;
 1253  &lt;boss&gt;
 1254   &lt;type&gt;B&lt;/type&gt;
 1255   &lt;FirstName&gt;Scott&lt;/FirstName&gt;
 1256   &lt;LastName&gt;Tiger&lt;/LastName&gt;
 1257   &lt;Age&gt;45&lt;/Age&gt;
 1258  &lt;/boss&gt;
 1259  &lt;employee&gt;
 1260   &lt;type&gt;E&lt;/type&gt;
 1261   &lt;FirstName&gt;Mary&lt;/FirstName&gt;
 1262   &lt;LastName&gt;Moore&lt;/LastName&gt;
 1263   &lt;Age&gt;41&lt;/Age&gt;
 1264  &lt;/employee&gt;
 1265  &lt;employee&gt;
 1266   &lt;type&gt;E&lt;/type&gt;
 1267   &lt;FirstName&gt;Ridge&lt;/FirstName&gt;
 1268   &lt;LastName&gt;Forrester&lt;/LastName&gt;
 1269   &lt;Age&gt;31&lt;/Age&gt;
 1270  &lt;/employee&gt;
 1271  &lt;trailer&gt;
 1272   &lt;type&gt;T&lt;/type&gt;
 1273   &lt;count&gt;0004&lt;/count&gt;
 1274  &lt;/trailer&gt;
 1275 </pre></div>
 1276 <a name="Printing-sql-commands"></a>
 1277 <h4 class="subheading">Printing sql commands</h4>
<p>Data in the examples above can be loaded into a database using generated SQL commands. Note that the header and trailer
are not loaded, because only fields &lsquo;<samp>FirstName</samp>&rsquo;, &lsquo;<samp>LastName</samp>&rsquo; and &lsquo;<samp>Age</samp>&rsquo; are printed and &lsquo;<samp>no-data-print</samp>&rsquo;
is set to no. This prevents &lsquo;<samp>record_header</samp>&rsquo; and &lsquo;<samp>record_trailer</samp>&rsquo; from being printed for the file header and trailer.
 1281 </p>
 1282 <div class="example">
 1283 <pre class="example">output sql {
 1284     file_header &quot;delete table boss;\ndelete table employee;\n&quot;
 1285     record_header &quot;insert into %r values(&quot;
 1286     data &quot;'%t'&quot;
 1287     separator &quot;,&quot;
 1288     record_trailer &quot;);\n&quot;
 1289     file_trailer &quot;commit\nquit\n&quot;
 1290     no-data-print no
 1291     field-list FirstName,LastName,Age
 1292 }
 1293 </pre></div>
 1294 
 1295 <p>Output from command
 1296 </p>
 1297 <p><code>ffe -p sql personnel.sep</code>
 1298 </p>
 1299 <div class="example">
 1300 <pre class="example">delete table boss;
 1301 delete table employee;
 1302 insert into employee values('john','Ripper','23');
 1303 insert into boss values('Scott','Tiger','45');
 1304 insert into employee values('Mary','Moore','41');
 1305 insert into employee values('Ridge','Forrester','31');
 1306 commit
 1307 quit
 1308 </pre></div>
 1309 <a name="Human-readable-output"></a>
 1310 <h4 class="subheading">Human readable output</h4>
<p>This output format shows the fields in a format suitable for displaying on screen or printing.
 1312 </p>
 1313 <div class="example">
 1314 <pre class="example">output nice {
 1315     record_header &quot;%s - %r - %f - %o\n&quot;
 1316     data &quot;%n=%t\n&quot;
 1317     justify =
 1318     indent &quot; &quot;
 1319 }
 1320 </pre></div>
 1321 
 1322 <p>Output from command
 1323 </p>
 1324 <p><code>ffe -p nice personnel.fix</code>
 1325 </p><div class="example">
<pre class="example"> personel_fix - header - personnel.fix - 1
  type=H
  date=2006-02-25
 
 personel_fix - employee - personnel.fix - 2
    EmpType=E
  FirstName=John
   LastName=Ripper
        Age=023
 
 personel_fix - boss - personnel.fix - 3
    EmpType=B
  FirstName=Scott
   LastName=Tiger
        Age=045
 
 personel_fix - employee - personnel.fix - 4
    EmpType=E
  FirstName=Mary
   LastName=Moore
        Age=041
 
 personel_fix - employee - personnel.fix - 5
    EmpType=E
  FirstName=Ridge
   LastName=Forrester
        Age=031
 
 personel_fix - trailer - personnel.fix - 6
   type=T
  count=0004
 1357 </pre></div>
 1358 
 1359 <a name="HTML-table"></a>
 1360 <h4 class="subheading">HTML table</h4>
<p>Personnel data can be displayed as an HTML table using an output like:
 1362 </p>
 1363 <div class="example">
 1364 <pre class="example">output html {
 1365     file_header &quot;&lt;html&gt;\n&lt;head&gt;\n&lt;/head&gt;\n&lt;body&gt;\n&lt;table border=\&quot;1\&quot;&gt;\n&lt;tr&gt;\n&quot;
 1366     header &quot;&lt;th&gt;%n&lt;/th&gt;\n&quot;
 1367     record_header &quot;&lt;tr&gt;\n&quot;
 1368     data &quot;&lt;td&gt;%t&lt;/td&gt;\n&quot;
 1369     file_trailer &quot;&lt;/table&gt;\n&lt;/body&gt;\n&lt;/html&gt;\n&quot;
 1370     no-data-print no
 1371 }
 1372 </pre></div>
 1373 
 1374 <p>Output from command
 1375 </p>
 1376 <p><code>ffe -p html -f FirstName,LastName,Age personnel.fix</code>
 1377 </p><div class="example">
 1378 <pre class="example">&lt;html&gt;
 1379 &lt;head&gt;
 1380 &lt;/head&gt;
 1381 &lt;body&gt;
 1382 &lt;table border=&quot;1&quot;&gt;
 1383 &lt;tr&gt;
 1384 &lt;th&gt;FirstName&lt;/th&gt;
 1385 &lt;th&gt;LastName&lt;/th&gt;
 1386 &lt;th&gt;Age&lt;/th&gt;
 1387 
 1388 &lt;tr&gt;
 1389 &lt;td&gt;John&lt;/td&gt;
 1390 &lt;td&gt;Ripper&lt;/td&gt;
 1391 &lt;td&gt;023&lt;/td&gt;
 1392 
 1393 &lt;tr&gt;
 1394 &lt;td&gt;Scott&lt;/td&gt;
 1395 &lt;td&gt;Tiger&lt;/td&gt;
 1396 &lt;td&gt;045&lt;/td&gt;
 1397 
 1398 &lt;tr&gt;
 1399 &lt;td&gt;Mary&lt;/td&gt;
 1400 &lt;td&gt;Moore&lt;/td&gt;
 1401 &lt;td&gt;041&lt;/td&gt;
 1402 
 1403 &lt;tr&gt;
 1404 &lt;td&gt;Ridge&lt;/td&gt;
 1405 &lt;td&gt;Forrester&lt;/td&gt;
 1406 &lt;td&gt;031&lt;/td&gt;
 1407 
 1408 &lt;/table&gt;
 1409 &lt;/body&gt;
 1410 &lt;/html&gt;
 1411 </pre></div>
 1412 
 1413 <a name="Using-expression"></a>
 1414 <h4 class="subheading">Using expression</h4>
 1415 <p>Print only Scott&rsquo;s record by adding an expression to the previous example:
 1416 </p>
 1417 <p><code>ffe -p html -f FirstName,LastName,Age -e FirstName^Scott personnel.fix</code>
 1418 </p><div class="example">
 1419 <pre class="example">&lt;html&gt;
 1420 &lt;head&gt;
 1421 &lt;/head&gt;
 1422 &lt;body&gt;
 1423 &lt;table border=&quot;1&quot;&gt;
 1424 &lt;tr&gt;
 1425 &lt;th&gt;FirstName&lt;/th&gt;
 1426 &lt;th&gt;LastName&lt;/th&gt;
 1427 &lt;th&gt;Age&lt;/th&gt;
 1428 
 1429 &lt;tr&gt;
 1430 &lt;td&gt;Scott&lt;/td&gt;
 1431 &lt;td&gt;Tiger&lt;/td&gt;
 1432 &lt;td&gt;045&lt;/td&gt;
 1433 
 1434 &lt;/table&gt;
 1435 &lt;/body&gt;
 1436 &lt;/html&gt;
 1437 </pre></div>
 1438 
 1439 <a name="Using-replace"></a>
 1440 <h4 class="subheading">Using replace</h4>
 1441 <p>Make everyone a boss and write a new personnel file, printing the fields in fixed-length format
 1442 using the directive <code>%D</code>:
 1443 </p>
 1444 <p>Output definition:
 1445 </p><div class="example">
 1446 <pre class="example">output fixed 
 1447 {
 1448     data &quot;%D&quot;
 1449 }
 1450 </pre></div>
 1451 
 1452 <p>Write a new file:
 1453 </p><div class="example">
 1454 <pre class="example">$ffe -p fixed -r EmpType=B -o personnel.fix.new personnel.fix
 1455 $cat personnel.fix.new
 1456 H2006-02-25
 1457 BJohn     Ripper       023
 1458 BScott    Tiger        045
 1459 BMary     Moore        041
 1460 BRidge    Forrester    031
 1461 T0004
 1462 $
 1463 </pre></div>
 1464 
 1465 <a name="Using-constant"></a>
 1466 <h4 class="subheading">Using constant</h4>
 1467 <p>The fields FirstName and LastName in the fixed-length format will each be made two bytes longer
 1468 by printing a constant after each of them.
 1469 Dots are used instead of spaces to make the change more visible.
 1470 </p>
 1471 <p>Because the header and trailer must not change, a specially crafted configuration file is needed.
 1472 Employee and boss records are printed using the new output <var>fixed2</var>, and all other records are printed using the
 1473 output <var>default</var>.
 1474 </p>
 1475 <p>New definition file <samp>new_fixed.rc</samp>:
 1476 </p><div class="example">
 1477 <pre class="example">const 2dots &quot;..&quot;
 1478 
 1479 structure personel_fix {
 1480     type fixed
 1481     record header {
 1482         id 1 H 
 1483         field type 1 
 1484         field date 10
 1485     }
 1486     record employee {
 1487         id 1 E
 1488         field EmpType 1 
 1489         field FirstName 9
 1490         field LastName  13
 1491         field Age 2
 1492         output fixed2
 1493     }
 1494     record boss {
 1495         id 1 B
 1496         fields-from employee
 1497         output fixed2
 1498     }
 1499     record trailer {
 1500         id 1 T
 1501         field type 1 
 1502         field count 4
 1503     }
 1504 }
 1505 
 1506 output default
 1507 {
 1508     data &quot;%D&quot;
 1509 }
 1510 
 1511 output fixed2
 1512 {
 1513     data &quot;%D&quot;
 1514     field-list EmpType,FirstName,2dots,LastName,2dots,Age
 1515 }
 1516 </pre></div>
 1517 <p>Print new flat file:
 1518 </p><div class="example">
 1519 <pre class="example">$ ffe -c new_fixed.rc personel_fix
 1520 H2006-02-25
 1521 EJohn     ..Ripper       ..023
 1522 BScott    ..Tiger        ..045
 1523 EMary     ..Moore        ..041
 1524 ERidge    ..Forrester    ..031
 1525 T0004
 1526 $
 1527 </pre></div>
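<p>The effect of the <code>field-list</code> above can be mimicked outside <code>ffe</code>. The following Python sketch is only an illustration and not how <code>ffe</code> is implemented: the function name <code>widen</code> is hypothetical, and the widths (1, 9 and 13 bytes) are copied from the structure definition. It slices one employee or boss line into its fields and prints the constant after each name field.
</p>
<div class="example">
<pre class="example">
```python
# Illustrative sketch only; field widths come from the personel_fix structure.
EMP_TYPE, FIRST_NAME, LAST_NAME = 1, 9, 13
TWO_DOTS = ".."  # plays the role of the constant 2dots


def widen(line):
    """Print TWO_DOTS after FirstName and LastName of employee/boss lines."""
    if line[:1] not in ("E", "B"):
        return line  # header and trailer lines are left untouched
    first_end = EMP_TYPE + FIRST_NAME
    last_end = first_end + LAST_NAME
    return (line[:first_end] + TWO_DOTS
            + line[first_end:last_end] + TWO_DOTS
            + line[last_end:])


sample = "E" + "John".ljust(9) + "Ripper".ljust(13) + "023"
print(widen(sample))
```
</pre></div>
<p>Header and trailer lines pass through unchanged, which corresponds to printing them with the output <var>default</var>.
</p>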
 1528 
 1529 <a name="Using-lookup-table"></a>
 1530 <h4 class="subheading">Using lookup table</h4>
 1531 <p>A lookup table is used to explain the contents of the EmpType field in the output format <code>nice</code>:
 1532 </p>
 1533 <p>Lookup definition:
 1534 </p><div class="example">
 1535 <pre class="example">lookup Type
 1536 {
 1537     search exact
 1538     pair H Header
 1539     pair B &quot;He is a Boss!&quot;
 1540     pair E &quot;Not a Boss!&quot;
 1541     pair T Trailer
 1542     default-value &quot;Unknown record type!&quot;
 1543 }   
 1544 </pre></div>
 1545 <p>Mapping the EmpType field to lookup:
 1546 </p><div class="example">
 1547 <pre class="example">structure personel_fix {
 1548     type fixed
 1549     record header {
 1550         id 1 H
 1551         field type 1
 1552         field date 10
 1553     }
 1554     record employee {
 1555         id 1 E
 1556         field EmpType 1 Type
 1557         field FirstName 9
 1558         field LastName  13
 1559         field Age 2
 1560     }
 1561     record boss {
 1562         id 1 B
 1563         fields-from employee
 1564     }
 1565     record trailer {
 1566         id 1 T
 1567         field type 1
 1568         field count 4
 1569     }
 1570 }
 1571 </pre></div>
 1572 <p>Adding the lookup option to the output definition <code>nice</code>:
 1573 </p><div class="example">
 1574 <pre class="example">output nice {
 1575     record_header &quot;%s - %r - %f - %o\n&quot;
 1576     data &quot;%n=%t\n&quot;
 1577     lookup &quot;%n=%t (%l)\n&quot;
 1578     justify =
 1579     indent &quot; &quot;
 1580 }
 1581 </pre></div>
 1582 <p>Running ffe:
 1583 </p><div class="example">
 1584 <pre class="example"> $ffe -p nice personnel.fix
 1585  personel_fix - header - personel_fix - 1
 1586   type=H
 1587   date=2006-02-25
 1588  
 1589  personel_fix - employee - personel_fix - 2
 1590     EmpType=E (Not a Boss!)
 1591   FirstName=John
 1592    LastName=Ripper
 1593         Age=023
 1594  
 1595  personel_fix - boss - personel_fix - 3
 1596     EmpType=B (He is a Boss!)
 1597   FirstName=Scott
 1598    LastName=Tiger
 1599         Age=045
 1600  
 1601  personel_fix - employee - personel_fix - 4
 1602     EmpType=E (Not a Boss!)
 1603   FirstName=Mary
 1604    LastName=Moore
 1605         Age=041
 1606  
 1607  personel_fix - employee - personel_fix - 5
 1608     EmpType=E (Not a Boss!)
 1609   FirstName=Ridge
 1610    LastName=Forrester
 1611         Age=031
 1612  
 1613  personel_fix - trailer - personel_fix - 6
 1614    type=T
 1615   count=0004
 1616 </pre></div>
 1617 
 1618 <a name="External-lookup-file"></a>
 1619 <h4 class="subheading">External lookup file</h4>
 1620 <p>In the previous example, the lookup data could also be read from an external file like:
 1621 </p>
 1622 <div class="example">
 1623 <pre class="example">$cat lookupdata
 1624 H;Header
 1625 B;He is a Boss!
 1626 E;Not a Boss!
 1627 T;Trailer
 1628 $
 1629 </pre></div>
 1630 <p>Lookup definition using file above:
 1631 </p><div class="example">
 1632 <pre class="example">lookup Type
 1633 {
 1634     search exact
 1635     file lookupdata
 1636     default-value &quot;Unknown record type!&quot;
 1637 }
 1638 </pre></div>
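<p>Conceptually, an exact-match lookup with a default value behaves like a dictionary. The Python sketch below is a rough illustration under stated assumptions, not the <code>ffe</code> implementation: the helper names <code>load_pairs</code> and <code>lookup</code> are hypothetical, and the <code>;</code> separator is simply the one used in the sample file above.
</p>
<div class="example">
<pre class="example">
```python
# Illustrative sketch only: mimics an exact-match lookup with a default value.
LOOKUP_DATA = """H;Header
B;He is a Boss!
E;Not a Boss!
T;Trailer
"""
DEFAULT_VALUE = "Unknown record type!"


def load_pairs(text, separator=";"):
    """Parse 'key;value' lines into a dictionary."""
    pairs = {}
    for line in text.splitlines():
        if not line:
            continue  # skip empty lines
        key, value = line.split(separator, 1)
        pairs[key] = value
    return pairs


def lookup(pairs, key):
    """Return the value for an exact key match, else the default value."""
    return pairs.get(key, DEFAULT_VALUE)


pairs = load_pairs(LOOKUP_DATA)
for code in ("B", "E", "X"):
    print(code, "=", lookup(pairs, code))
```
</pre></div>
<p>The key <code>X</code> is not in the data, so it falls back to the default value.
</p>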
 1639 
 1640 <a name="Making-universal-csv-reader-using-command-substitution"></a>
 1641 <h4 class="subheading">Making universal csv reader using command substitution</h4>
 1642 <p>Command substitution can be used to make a configuration that reads any CSV file.
 1643 The number of fields is read from the first line of the first input file using awk.
 1644 The input file names and the current date are printed in the file header:
 1645 </p>
 1646 <div class="example">
 1647 <pre class="example">structure csv {
 1648     type separated ,
 1649     header first
 1650     record csv {
 1651         field-count `awk &quot;-F,&quot; 'FNR == 1 {print NF;exit;}' $FFE_FIRST_FILE`
 1652     }
 1653 }
 1654 
 1655 output default {
 1656     file_header &quot;Files: `echo $FFE_FILES`\n`date`\n&quot;
 1657     data &quot;%n=%d\n&quot;
 1658     justify =
 1659 }
 1660 </pre></div>
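<p>For readers less familiar with awk, the field counting done by the one-liner above can be sketched in Python. This is only an illustration; the function name <code>field_count</code> and the sample data are made up for the example. Like the awk command, it splits on every comma and does not treat quoted commas specially.
</p>
<div class="example">
<pre class="example">
```python
# Illustrative sketch only: a Python equivalent of the awk one-liner above.
import io


def field_count(stream, separator=","):
    """Return the number of separator-delimited fields on the first line."""
    first_line = stream.readline().rstrip("\r\n")
    if first_line == "":
        return 0  # an empty first line has no fields
    return len(first_line.split(separator))


sample = io.StringIO("name,age,city\nJohn,23,Helsinki\n")
print(field_count(sample))
```
</pre></div>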
 1661 
 1662 <a name="Reading-binary-data"></a>
 1663 <h4 class="subheading">Reading binary data</h4>
 1664 <p>A binary block containing a 3-byte text (ABC) in a 5-byte field, a one-byte integer (35), a 32-bit integer (12345678), a double (345.385), a 3-byte BCD number (45112) and 4 bytes of hexadecimal data (f15a9188) can be read using the following configuration:
 1665 </p>
 1666 <div class="example">
 1667 <pre class="example">structure bin_data
 1668 {
 1669     type binary
 1670     record b
 1671     {
 1672         field text 5
 1673         field byte_int int8
 1674         field integer int
 1675         field number double
 1676         field bcd_number bcd_be_3
 1677         field hex hex_be_4
 1678     }
 1679 }
 1680 
 1681 output default
 1682 {
 1683     data &quot;%n = %d (%h)\n&quot;
 1684 }
 1685 </pre></div>
 1686 <p>The <code>%h</code> directive gives a hex dump of the input data.
 1687 </p>
 1688 <p>Hexadecimal dump of the data:
 1689 </p><div class="example">
 1690 <pre class="example">$ od -t x1 example_bin
 1691 0000000 41 42 43 00 08 23 4e 61 bc 00 5c 8f c2 f5 28 96
 1692 0000020 75 40 45 11 2f f1 5a 91 88
 1693 0000031
 1694 </pre></div>
 1695 
 1696 <p>Using ffe:
 1697 </p><div class="example">
 1698 <pre class="example">$ffe -c example_bin.fferc -s bin_data example_bin
 1699 text = ABC (x41x42x43x00x08)
 1700 byte_int = 35 (x23)
 1701 integer = 12345678 (x4ex61xbcx00)
 1702 number = 345.385000 (x5cx8fxc2xf5x28x96x75x40)
 1703 bcd_number = 45112 (x45x11x2f)
 1704 hex = f15a9188 (xf1x5ax91x88)
 1705 </pre></div>
 1706 
 1707 <p>Note that the text has only 3 characters before the NUL byte. Because this example was made on a little-endian
 1708 machine, the same result can be achieved with a different configuration:
 1709 </p><div class="example">
 1710 <pre class="example">structure bin_data
 1711 {
 1712     type binary
 1713     record b
 1714     {
 1715         field text 5
 1716         field byte_int int8
 1717         field integer int32_le
 1718         field number double_le
 1719         field bcd_number bcd_be_3
 1720         field hex hex_be_4
 1721     }
 1722 }
 1723 </pre></div>
 1724 <p>This configuration is more portable if the same data is to be read on a different architecture, because the endianness of the integer and the double is given explicitly.
 1725 </p>
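<p>To see how the 25 bytes of the hex dump map to the six fields, the decoding can be reproduced by hand. The Python sketch below is an illustration under stated assumptions and not <code>ffe</code> code: the helper names <code>double_le</code>, <code>bcd_be</code> and <code>decode</code> are hypothetical, and the byte offsets are derived from the field sizes in the configuration (5, 1, 4, 8, 3 and 4 bytes).
</p>
<div class="example">
<pre class="example">
```python
# Illustrative sketch only: decodes the 25 bytes of the hex dump above.
import array
import sys

DATA = bytes.fromhex(
    "41 42 43 00 08 23 4e 61 bc 00 5c 8f c2 f5 28 96"
    "75 40 45 11 2f f1 5a 91 88"
)


def double_le(raw):
    """Decode an 8-byte little-endian IEEE 754 double."""
    values = array.array("d")
    values.frombytes(raw)
    if sys.byteorder == "big":
        values.byteswap()  # the raw bytes are little endian
    return values[0]


def bcd_be(raw):
    """Decode packed BCD, high nybble first; nybble f ends the number."""
    digits = ""
    for byte in raw:
        for nybble in (byte // 16, byte % 16):
            if nybble == 15:
                return digits
            digits += str(nybble)
    return digits


def decode(data):
    """Split the block into the six fields of structure bin_data."""
    return {
        "text": data[0:5].split(b"\x00")[0].decode("ascii"),
        "byte_int": data[5],
        "integer": int.from_bytes(data[6:10], "little", signed=True),
        "number": double_le(data[10:18]),
        "bcd_number": bcd_be(data[18:21]),
        "hex": data[21:25].hex(),
    }


for name, value in decode(DATA).items():
    print(name, "=", value)
```
</pre></div>
<p>Because the byte order is stated explicitly in the sketch, it gives the same values on little-endian and big-endian hosts, which is the point of using <code>int32_le</code> and <code>double_le</code> in the configuration.
</p>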
 1726 <p>If the BCD number is read with <code>bcd_le_3</code>, it looks like this:
 1727 </p><div class="example">
 1728 <pre class="example">bcd_number = 5411 (x45x11x2f)
 1729 </pre></div>
 1730 <p>Note that the nybbles are swapped and the last byte is handled as <code>f2</code> (<code>f</code> stops the printing), so only the first two bytes are printed.
 1731 </p>
 1732 <p>If the hexadecimal data is read with <code>hex_le_4</code>, it looks like this:
 1733 </p><div class="example">
 1734 <pre class="example">hex = 88915af1 (xf1x5ax91x88)
 1735 </pre></div>
 1736 <p>Bytes are printed starting from the end of the data.
 1737 </p>
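<p>The difference between the two byte orders can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not <code>ffe</code> code; the function names <code>bcd_digits</code> and <code>hex_string</code> and their parameters are hypothetical.
</p>
<div class="example">
<pre class="example">
```python
# Illustrative sketch only: byte-order variants of the BCD and hex fields.
def bcd_digits(raw, swap_nybbles=False):
    """Decode packed BCD; nybble f ends the number.

    swap_nybbles=False reads the high nybble first (like bcd_be_3 above),
    swap_nybbles=True reads the low nybble first (like bcd_le_3 above).
    """
    digits = ""
    for byte in raw:
        high, low = byte // 16, byte % 16
        order = (low, high) if swap_nybbles else (high, low)
        for nybble in order:
            if nybble == 15:
                return digits
            digits += str(nybble)
    return digits


def hex_string(raw, reverse=False):
    """Format bytes as hex, optionally starting from the end of the data."""
    return (raw[::-1] if reverse else raw).hex()


bcd = bytes([0x45, 0x11, 0x2F])
print(bcd_digits(bcd), bcd_digits(bcd, swap_nybbles=True))

data = bytes([0xF1, 0x5A, 0x91, 0x88])
print(hex_string(data), hex_string(data, reverse=True))
```
</pre></div>
<p>With swapped nybbles the third byte is seen as <code>f2</code>, so decoding stops after the first two bytes.
</p>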
 1738 <a name="Printing-nested-XML"></a>
 1739 <h4 class="subheading">Printing nested XML</h4>
 1740 <p>The keyword <code>level</code> in a record definition can be used to print data in a multi-level nested form. In this
 1741 example a parent row is at level one and a child row is at level two. Child rows that follow a parent row belong
 1742 to that parent, so they are enclosed in the parent element.
 1743 </p>
 1744 <p>Example data:
 1745 </p><div class="example">
 1746 <pre class="example">P,John Smith,3 
 1747 C,Kathren,6,Blue 
 1748 C,Jimmy,4,Red
 1749 C,Peter,2,Green
 1750 P,Margaret Eelers,2
 1751 C,Aden,16,White
 1752 C,Amanda,20,Black
 1753 </pre></div>
 1754 
 1755 <p>A parent row consists of an id (P), the parent name and the number of children. A child row consists of an id (C), the child name, age and favorite color.
 1756 </p>
 1757 <p>This can be printed as nested XML using the following rc file:
 1758 </p><div class="example">
 1759 <pre class="example">structure family
 1760 {
 1761     type separated ,
 1762     record parent
 1763     {
 1764         id 1 P
 1765         field FILLER
 1766         field Name
 1767         field Child_count
 1768         level 1 parent
 1769     }
 1770 
 1771     record child
 1772     {
 1773         id 1 C
 1774         field FILLER
 1775         field Name
 1776         field Age
 1777         field FavoriteColor
 1778         level 2 child children
 1779     }
 1780 }
 1781 
 1782 output nested_xml
 1783 {
 1784     file_header &quot;&lt;?xml version=\&quot;1.0\&quot; encoding=\&quot;UTF-8\&quot;?&gt;\n&quot;
 1785     data &quot;&lt;%n&gt;%t&lt;/%n&gt;\n&quot;
 1786     indent &quot; &quot;
 1787     record_trailer &quot;&quot;
 1788     group_header &quot;&lt;%g&gt;\n&quot;
 1789     group_trailer &quot;&lt;/%g&gt;\n&quot;
 1790     element_header &quot;&lt;%m&gt;\n&quot;
 1791     element_trailer &quot;&lt;/%m&gt;\n&quot;
 1792 }
 1793 </pre></div>
 1794 
 1795 <p>Output:
 1796 </p>
 1797 <div class="example">
 1798 <pre class="example">&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
 1799  &lt;parent&gt;
 1800   &lt;Name&gt;John Smith&lt;/Name&gt;
 1801   &lt;Child_count&gt;3&lt;/Child_count&gt;
 1802   &lt;children&gt;
 1803    &lt;child&gt;
 1804     &lt;Name&gt;Kathren&lt;/Name&gt;
 1805     &lt;Age&gt;6&lt;/Age&gt;
 1806     &lt;FavoriteColor&gt;Blue&lt;/FavoriteColor&gt;
 1807    &lt;/child&gt;
 1808    &lt;child&gt;
 1809     &lt;Name&gt;Jimmy&lt;/Name&gt;
 1810     &lt;Age&gt;4&lt;/Age&gt;
 1811     &lt;FavoriteColor&gt;Red&lt;/FavoriteColor&gt;
 1812    &lt;/child&gt;
 1813    &lt;child&gt;
 1814     &lt;Name&gt;Peter&lt;/Name&gt;
 1815     &lt;Age&gt;2&lt;/Age&gt;
 1816     &lt;FavoriteColor&gt;Green&lt;/FavoriteColor&gt;
 1817    &lt;/child&gt;
 1818   &lt;/children&gt;
 1819  &lt;/parent&gt;
 1820  &lt;parent&gt;
 1821   &lt;Name&gt;Margaret Eelers&lt;/Name&gt;
 1822   &lt;Child_count&gt;2&lt;/Child_count&gt;
 1823   &lt;children&gt;
 1824    &lt;child&gt;
 1825     &lt;Name&gt;Aden&lt;/Name&gt;
 1826     &lt;Age&gt;16&lt;/Age&gt;
 1827     &lt;FavoriteColor&gt;White&lt;/FavoriteColor&gt;
 1828    &lt;/child&gt;
 1829    &lt;child&gt;
 1830     &lt;Name&gt;Amanda&lt;/Name&gt;
 1831     &lt;Age&gt;20&lt;/Age&gt;
 1832     &lt;FavoriteColor&gt;Black&lt;/FavoriteColor&gt;
 1833    &lt;/child&gt;
 1834   &lt;/children&gt;
 1835  &lt;/parent&gt;
 1836 </pre></div>
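<p>The grouping rule, where each child row belongs to the most recent parent row, can be sketched in Python with the standard <code>xml.etree.ElementTree</code> module. This is a minimal illustration and not how <code>ffe</code> works internally: the function name <code>build_parents</code> is hypothetical, the element names are copied from the output above, and stripping the trailing spaces of the sample data is a choice made for this sketch.
</p>
<div class="example">
<pre class="example">
```python
# Illustrative sketch only: groups child rows under the preceding parent row.
import xml.etree.ElementTree as ET

ROWS = """P,John Smith,3
C,Kathren,6,Blue
C,Jimmy,4,Red
C,Peter,2,Green
P,Margaret Eelers,2
C,Aden,16,White
C,Amanda,20,Black
"""

CHILD_FIELDS = ("Name", "Age", "FavoriteColor")


def build_parents(text):
    """Return one 'parent' element per P row, with its C rows nested inside."""
    parents = []
    group = None  # the 'children' element of the current parent, if any
    for line in text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if fields[0] == "P":
            parent = ET.Element("parent")
            ET.SubElement(parent, "Name").text = fields[1]
            ET.SubElement(parent, "Child_count").text = fields[2]
            parents.append(parent)
            group = None
        elif fields[0] == "C":
            if not parents:
                raise ValueError("child row found before any parent row")
            if group is None:
                group = ET.SubElement(parents[-1], "children")
            child = ET.SubElement(group, "child")
            for name, value in zip(CHILD_FIELDS, fields[1:]):
                ET.SubElement(child, name).text = value
    return parents


for parent in build_parents(ROWS):
    names = [child.findtext("Name") for child in parent.iter("child")]
    print(parent.findtext("Name"), names)
```
</pre></div>
<p>The <code>children</code> group element is created only when the first child row of a parent is seen, so a parent without child rows gets no empty group.
</p>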
 1837 
 1838 <a name="Some-examples-put-in-a-single-file"></a>
 1839 <h4 class="subheading">Some examples put in a single file</h4>
 1840 <div class="example">
 1841 <pre class="example">structure personel_fix {
 1842     type fixed
 1843     record header {
 1844         id 1 H
 1845         field type 1
 1846         field date 10
 1847     }
 1848     record employee {
 1849         id 1 E
 1850         field EmpType 1 Type
 1851         field FirstName 9
 1852         field LastName  13
 1853         field Age 2
 1854     }
 1855     record boss {
 1856         id 1 B
 1857         fields-from employee
 1858     }
 1859     record trailer {
 1860         id 1 T
 1861         field type 1
 1862         field count 4
 1863     }
 1864 }
 1865 
 1866 structure personel_sep {
 1867     type separated ,
 1868     record header {
 1869         id 1 H
 1870         field type 
 1871         field date 
 1872     }
 1873     record employee {
 1874         id 1 E
 1875         field type 
 1876         field FirstName 
 1877         field LastName  
 1878         field Age 
 1879     }
 1880     record boss {
 1881         id 1 B
 1882         fields-from employee
 1883     }
 1884     record trailer {
 1885         id 1 T
 1886         field type 
 1887         field count
 1888     }
 1889 }
 1890 
 1891 structure bin_data
 1892 {
 1893     type binary
 1894     record b
 1895     {
 1896         field text 5
 1897         field byte_int int8
 1898         field integer int32_le
 1899         field number double_le
 1900         field bcd_number bcd_be_3
 1901         field hex hex_be_4
 1902     }
 1903 }
 1904 
 1905 output xml {
 1906     file_header &quot;&lt;?xml version=\&quot;1.0\&quot; encoding=\&quot;UTF-8\&quot;?&gt;\n&quot;
 1907     data &quot;&lt;%n&gt;%t&lt;/%n&gt;\n&quot;
 1908     record_header &quot;&lt;%r&gt;\n&quot;
 1909     record_trailer &quot;&lt;/%r&gt;\n&quot;
 1910     indent &quot; &quot;
 1911 }
 1912 
 1913 output sql {
 1914     file_header &quot;delete table boss;\ndelete table employee;\n&quot;
 1915     record_header &quot;insert into %r values(&quot;
 1916     data &quot;'%t'&quot;
 1917     separator &quot;,&quot;
 1918     record_trailer &quot;);\n&quot;
 1919     file_trailer &quot;commit\nquit\n&quot;
 1920     no-data-print no
 1921     field-list FirstName,LastName,Age
 1922 }
 1923 
 1924 output nice {
 1925     record_header &quot;%s - %r - %f - %o\n&quot;
 1926     data &quot;%n=%t\n&quot;
 1927     lookup &quot;%n=%t (%l)\n&quot;
 1928     justify =
 1929     indent &quot; &quot;
 1930 }
 1931 
 1932 output html {
 1933     file_header &quot;&lt;html&gt;\n&lt;head&gt;\n&lt;/head&gt;\n&lt;body&gt;\n&lt;table border=\&quot;1\&quot;&gt;\n&lt;tr&gt;\n&quot;
 1934     header &quot;&lt;th&gt;%n&lt;/th&gt;\n&quot;
 1935     record_header &quot;&lt;tr&gt;\n&quot;
 1936     data &quot;&lt;td&gt;%t&lt;/td&gt;\n&quot;
 1937     file_trailer &quot;&lt;/table&gt;\n&lt;/body&gt;\n&lt;/html&gt;\n&quot;
 1938     no-data-print no
 1939 }
 1940 
 1941 output fixed 
 1942 {
 1943     data &quot;%D&quot;
 1944 }
 1945 
 1946 lookup Type
 1947 {
 1948     search exact
 1949     pair H Header
 1950     pair B &quot;He is a Boss!&quot;
 1951     pair E &quot;Not a Boss!&quot;
 1952     pair T Trailer
 1953     default-value &quot;Unknown record type!&quot;
 1954 }   
 1955 </pre></div>
 1956 
 1957 <a name="Anonymization-1"></a>
 1958 <h4 class="subheading">Anonymization</h4>
 1959 <p>Anonymize the fields FirstName, LastName and Age in the personnel data:
 1960 </p><div class="example">
 1961 <pre class="example">anonymize personnel
 1962 {
 1963     method FirstName HASH 2
 1964     method LastName HASH 2
 1965     method Age NRANDOM
 1966 }
 1967 </pre></div>
 1968 
 1969 <p>Data before anonymization:
 1970 </p><div class="example">
 1971 <pre class="example">$cat personnel.fix
 1972 H2006-02-25
 1973 EJohn     Ripper       23
 1974 BScott    Tiger        45
 1975 EMary     Moore        41
 1976 ERidge    Forrester    31
 1977 T0004
 1978 </pre></div>
 1979 
 1980 <p>Anonymize the data to new file <samp>personnel_anon.fix</samp> (using the default configuration file <samp>~/.fferc</samp> and raw output):
 1981 </p>
 1982 <div class="example">
 1983 <pre class="example">ffe -A personnel -praw -o personnel_anon.fix personnel.fix
 1984 </pre></div>
 1985 
 1986 <p>Anonymized data:
 1987 </p><div class="example">
 1988 <pre class="example">$cat personnel_anon.fix
 1989 H2006-02-25
 1990 EJQIQ9C5oBR2rDU0qiSTv7E62
 1991 BSqUcsYzSTTNTuTraspsG4154
 1992 EMTsXkHltVMsV8qmK1tkgq 00
 1993 ER1e90zv1dFjP4 xgflVGQF87
 1994 T0004
 1995 
 1996 $ffe -pnice personnel_anon.fix
 1997  personel - header - personnel_anon.fix - 1
 1998   type=H
 1999   date=2006-02-25
 2000  
 2001  personel - employee - personnel_anon.fix - 2
 2002     EmpType=E
 2003   FirstName=JQIQ9C5oB
 2004    LastName=R2rDU0qiSTv7E
 2005         Age=62
 2006  
 2007  personel - boss - personnel_anon.fix - 3
 2008     EmpType=B
 2009   FirstName=SqUcsYzST
 2010    LastName=TNTuTraspsG41
 2011         Age=54
 2012                                           
 2013  personel - employee - personnel_anon.fix - 4
 2014     EmpType=E
 2015   FirstName=MTsXkHltV
 2016    LastName=MsV8qmK1tkgq 
 2017         Age=00
 2018                                                      
 2019  personel - employee - personnel_anon.fix - 5
 2020     EmpType=E
 2021   FirstName=R1e90zv1d
 2022    LastName=FjP4 xgflVGQF
 2023         Age=87
 2024                                                                              
 2025  personel - trailer - personnel_anon.fix - 6
 2026    type=T
 2027   count=0004
 2028 </pre></div>
 2029 <p>FirstName and LastName keep their first letter because anonymization starts from the second byte. Age is replaced by a two-digit random number.
 2030 The name fields get the same anonymized value on every run, but Age gets a new random value on each run.
 2031 </p>
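<p>The difference between a hash-based and a random method can be illustrated with a short Python sketch. It is explicitly <em>not</em> the algorithm used by <code>ffe</code>: the use of SHA-256, the output alphabet and the function names <code>hash_anonymize</code> and <code>random_number</code> are all assumptions made for this example. It only demonstrates the two properties described above: a hash gives the same result on every run and can leave the leading bytes untouched, while a random method changes from run to run.
</p>
<div class="example">
<pre class="example">
```python
# Illustrative sketch only: NOT the algorithm used by ffe.
import hashlib
import random
import string

ALPHABET = string.ascii_letters + string.digits


def hash_anonymize(value, start=1):
    """Deterministically replace characters from position start (1-based)."""
    keep = value[:start - 1]
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    tail = "".join(
        ALPHABET[digest[i % len(digest)] % len(ALPHABET)]
        for i in range(len(value) - len(keep))
    )
    return keep + tail


def random_number(value):
    """Replace the value with random digits of the same length."""
    return "".join(random.choice(string.digits) for _ in value)


first_name = "John".ljust(9)
print(hash_anonymize(first_name, start=2))  # same result on every run
print(random_number("23"))                  # differs from run to run
```
</pre></div>
<p>In both functions the field length is preserved, which keeps the fixed-length layout of the file intact.
</p>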
 2032 <a name="Using-ffe-to-test-file-integrity"></a>
 2033 <h4 class="subheading">Using <code>ffe</code> to test file integrity</h4>
 2034 <p><code>ffe</code> can be used to check flat file integrity, because for every line <code>ffe</code>
 2035 checks the line length and ids in a fixed-length structure,
 2036 and the field count and ids in a separated structure.
 2037 </p>
 2038 <p>Integrity can be checked using the command:
 2039 </p>
 2040 <p><code>ffe -p no -l inputfiles&hellip;</code>
 2041 </p>
 2042 <p>Because option <samp>-p</samp> has the value <code>no</code>, nothing is printed except error messages.
 2043 Option <samp>-l</samp> causes all erroneous lines to be reported, not just the first one.
 2044 </p>
 2045 <p>Example output:
 2046 </p>
 2047 <div class="example">
 2048 <pre class="example">ffe: Invalid input line in file 'inputfileB', line 14550
 2049 ffe: Invalid input line in file 'inputfileD', line 12
 2050 </pre></div>
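<p>The idea behind the check for a fixed-length structure can be sketched in Python. This is a simplified illustration and not the actual <code>ffe</code> logic: the function name <code>invalid_lines</code> is hypothetical, and the record ids and lengths are taken from the <code>personel_fix</code> structure used in this manual. A line is reported when its first byte matches no record id or when its length differs from the sum of that record&rsquo;s field lengths.
</p>
<div class="example">
<pre class="example">
```python
# Illustrative sketch only: a simplified fixed-length integrity check.
# Record ids and lengths follow the personel_fix structure of this manual.
RECORD_LENGTHS = {
    "H": 1 + 10,          # type, date
    "E": 1 + 9 + 13 + 2,  # EmpType, FirstName, LastName, Age
    "B": 1 + 9 + 13 + 2,  # same fields as employee
    "T": 1 + 4,           # type, count
}


def invalid_lines(lines):
    """Return 1-based numbers of lines with an unknown id or a wrong length."""
    bad = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if RECORD_LENGTHS.get(line[:1]) != len(line):
            bad.append(number)
    return bad


sample = [
    "H2006-02-25",
    "E" + "John".ljust(9) + "Ripper".ljust(13) + "23",
    "XBroken line",  # unknown record id
    "T004",          # one byte too short
]
print(invalid_lines(sample))
```
</pre></div>
<p>Collecting every bad line number, instead of stopping at the first one, corresponds to what option <samp>-l</samp> asks <code>ffe</code> to do.
</p>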
 2051 
 2052 <hr>
 2053 <a name="Problems"></a>
 2054 <a name="Reporting-Bugs"></a>
 2055 <h2 class="chapter">5 Reporting Bugs</h2>
 2056 <a name="index-bugs"></a>
 2057 <a name="index-problems"></a>
 2058 
 2059 <p>If you find a bug in <code>ffe</code>, please send electronic mail to
 2060 <a href="mailto:tjsa@iki.fi">tjsa@iki.fi</a>.  Include the version number, which you can find by
 2061 running &lsquo;<samp>ffe&nbsp;<span class="nolinebreak">--version</span></samp>&rsquo;<!-- /@w -->.  Also include in your message the
 2062 output that the program produced and the output you expected.
 2063 </p>
 2064 <p>If you have other questions, comments or suggestions about
 2065 <code>ffe</code>, contact the author via electronic mail to
 2066 <a href="mailto:tjsa@iki.fi">tjsa@iki.fi</a>.  The author will try to help you out, although he
 2067 may not have time to fix your problems.
 2068 </p>
 2069 <a name="SEC_Contents"></a>
 2070 <h2 class="contents-heading">Table of Contents</h2>
 2071 
 2072 <div class="contents">
 2073 
 2074 <ul class="no-bullet">
 2075   <li><a name="toc-Preliminary-information" href="#Overview">1 Preliminary information</a></li>
 2076   <li><a name="toc-Samples-using-ffe" href="#Samples">2 Samples using <code>ffe</code></a></li>
 2077   <li><a name="toc-How-to-run-ffe" href="#Invoking-ffe">3 How to run <code>ffe</code></a>
 2078   <ul class="no-bullet">
 2079     <li><a name="toc-Program-invocation" href="#Invocation">3.1 Program invocation</a></li>
 2080     <li><a name="toc-Configuration-1" href="#Configuration">3.2 Configuration</a></li>
 2081     <li><a name="toc-Guessing-1" href="#Guessing">3.3 Guessing</a></li>
 2082     <li><a name="toc-Limitations" href="#Limits">3.4 Limitations</a></li>
 2083   </ul></li>
 2084   <li><a name="toc-How-ffe-works" href="#ffe-configuration">4 How <code>ffe</code> works</a></li>
 2085   <li><a name="toc-Reporting-Bugs" href="#Problems">5 Reporting Bugs</a></li>
 2086 </ul>
 2087 </div>
 2088 
 2089 <hr>
 2090 
 2091 
 2092 
 2093 </body>
 2094 </html>