"Fossies" - the Fresh Open Source Software Archive  

Source code changes of the file "doc/recode.info" between
recode-3.7.11.tar.gz and recode-3.7.12.tar.gz

About: recode is a charset converter tool and library (fork of the original and now unmaintained GNU recode).

recode.info  (recode-3.7.11):recode.info  (recode-3.7.12)
skipping to change at page 850, line ? skipping to change at page 850, line ?
This recoding library converts files between various coded character This recoding library converts files between various coded character
sets and surface encodings. When this cannot be achieved exactly, it sets and surface encodings. When this cannot be achieved exactly, it
may get rid of the offending characters or fall back on approximations. may get rid of the offending characters or fall back on approximations.
The library recognises or produces more than 300 different character The library recognises or produces more than 300 different character
sets and is able to convert files between almost any pair. Most sets and is able to convert files between almost any pair. Most
RFC 1345 character sets, and all character sets from a pre-installed RFC 1345 character sets, and all character sets from a pre-installed
'iconv' library, are supported. The 'recode' program is a handy 'iconv' library, are supported. The 'recode' program is a handy
front-end to the library. front-end to the library.
This manual documents Recode 3.7.10. This manual documents Recode 3.7.12.
* Menu: * Menu:
* Tutorial:: Quick Tutorial * Tutorial:: Quick Tutorial
* Introduction:: Terminology and purpose * Introduction:: Terminology and purpose
* Invoking recode:: How to use this program * Invoking recode:: How to use this program
* Library:: A recoding library * Library:: A recoding library
* Universal:: The universal charset * Universal:: The universal charset
* iconv:: The 'iconv' library * iconv:: The 'iconv' library
* Tabular:: Tabular sources (RFC 1345) * Tabular:: Tabular sources (RFC 1345)
skipping to change at line 532 skipping to change at line 532
case only system errors or library mis-usage causes the exit status case only system errors or library mis-usage causes the exit status
to be set. to be set.
* By default, without '-f' nor '-s', Recode sets the exit status as * By default, without '-f' nor '-s', Recode sets the exit status as
above, and also in case of invalid or untranslatable input. It above, and also in case of invalid or untranslatable input. It
also tries (but does not always succeed) to detect if output is going to also tries (but does not always succeed) to detect if output is going to
be ambiguous at some later recode-back time. be ambiguous at some later recode-back time.
* The stricter setting is activated with '-s', Recode then sets the * The stricter setting is activated with '-s', Recode then sets the
exit status as above, or if input is not canonically coded (and it exit status as above, or if input is not canonically coded (and it
also prevents itself from *completing* recoding tables for making also prevents itself from _completing_ recoding tables for making
the recoding reversible). the recoding reversible).
---------- Footnotes ---------- ---------- Footnotes ----------
(1) In previous versions of Recode, a single colon ':' was used (1) In previous versions of Recode, a single colon ':' was used
instead of the two dots '..' for separating charsets, but this created instead of the two dots '..' for separating charsets, but this created
problems, because colons are allowed in official charset names. problems, because colons are allowed in official charset names.
File: recode.info, Node: Requests, Next: Listings, Prev: Synopsis, Up: Invoking recode File: recode.info, Node: Requests, Next: Listings, Prev: Synopsis, Up: Invoking recode
skipping to change at line 1495 skipping to change at line 1495
When this flag is set, the library later issues diagnostics When this flag is set, the library later issues diagnostics
itself, and aborts the calling program on errors. This is itself, and aborts the calling program on errors. This is
merely a convenience, because if this flag was not given, the merely a convenience, because if this flag was not given, the
calling program should always take care of checking the return calling program should always take care of checking the return
value of all other calls to the recoding library functions, value of all other calls to the recoding library functions,
and when any error is detected, issue a diagnostic and abort and when any error is detected, issue a diagnostic and abort
processing itself. processing itself.
'RECODE_NO_ICONV_FLAG' 'RECODE_NO_ICONV_FLAG'
When this flag is set, the library does not initialize nor use When this flag is set, the library does not initialize or use
the external 'iconv' library. This means that the charsets the external 'iconv' library. This means that the charsets
and aliases provided by the 'iconv' external library and not and aliases provided by the 'iconv' external library and not
by Recode itself are not available. by Recode itself are not available.
In previous incatations of the Recode library, FLAGS was a Boolean 'RECODE_STRICT_MAPPING_FLAG'
When this flag is set (corresponding to the '--strict'
command-line option), untranslatable characters are discarded,
but an error is returned on completion unless
'RECODE_FORCE_FLAG' is also set.
'RECODE_FORCE_FLAG'
When this flag is set (corresponding to the '--force'
command-line option), errors caused by untranslatable
characters are ignored.
In previous incarnations of the Recode library, FLAGS was a Boolean
instead of a collection of flags, meant to set instead of a collection of flags, meant to set
'RECODE_AUTO_ABORT_FLAG'. This still works, but is deprecated. 'RECODE_AUTO_ABORT_FLAG'. This still works, but is deprecated.
Regardless of the setting of 'RECODE_AUTO_ABORT', all recoding Regardless of the setting of 'RECODE_AUTO_ABORT', all recoding
library functions return a success status. Most functions are library functions return a success status. Most functions are
geared for returning 'false' for an error, and 'true' if everything geared for returning 'false' for an error, and 'true' if everything
went fine. Functions returning structures or strings return 'NULL' went fine. Functions returning structures or strings return 'NULL'
instead of the result, when the result cannot be produced. If instead of the result, when the result cannot be produced. If
RECODE_AUTO_ABORT is selected, functions either return 'true', or RECODE_AUTO_ABORT is selected, functions either return 'true', or
do not return at all. do not return at all.
skipping to change at line 1565 skipping to change at line 1578
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
#include <recode.h> #include <recode.h>
const char *program_name; const char *program_name;
int int
main (int argc, char *const *argv) main (int argc, char *const *argv)
{ {
program_name = argv[0]; program_name = argv[0];
RECODE_OUTER outer = recode_new_outer (true); RECODE_OUTER outer = recode_new_outer (RECODE_AUTO_ABORT_FLAG);
RECODE_REQUEST request = recode_new_request (outer); RECODE_REQUEST request = recode_new_request (outer);
bool success; bool success;
recode_scan_request (request, "ibmpc..latin1"); recode_scan_request (request, "ibmpc..latin1");
success = recode_file_to_file (request, stdin, stdout); success = recode_file_to_file (request, stdin, stdout);
recode_delete_request (request); recode_delete_request (request);
recode_delete_outer (outer); recode_delete_outer (outer);
skipping to change at line 1810 skipping to change at line 1823
#include <stdbool.h> #include <stdbool.h>
#include <stdlib.h> #include <stdlib.h>
#include <recodext.h> #include <recodext.h>
const char *program_name; const char *program_name;
int int
main (int argc, char *const *argv) main (int argc, char *const *argv)
{ {
program_name = argv[0]; program_name = argv[0];
RECODE_OUTER outer = recode_new_outer (false); RECODE_OUTER outer = recode_new_outer (0);
RECODE_REQUEST request = recode_new_request (outer); RECODE_REQUEST request = recode_new_request (outer);
RECODE_TASK task; RECODE_TASK task;
bool success; bool success;
recode_scan_request (request, "ibmpc..latin1"); recode_scan_request (request, "ibmpc..latin1");
task = recode_new_task (request); task = recode_new_task (request);
task->input.name = ""; task->input.name = "";
task->output.name = ""; task->output.name = "";
success = recode_perform_task (task); success = recode_perform_task (task);
skipping to change at line 2500 skipping to change at line 2513
********************* *********************
The Recode library is able to use the capabilities of an external, The Recode library is able to use the capabilities of an external,
pre-installed 'iconv' library, usually as provided by GNU 'libc' or the pre-installed 'iconv' library, usually as provided by GNU 'libc' or the
portable 'libiconv' written by Bruno Haible. In fact, many capabilities portable 'libiconv' written by Bruno Haible. In fact, many capabilities
of the Recode library are duplicated in an external 'iconv' library, as of the Recode library are duplicated in an external 'iconv' library, as
they likely share many charsets. We discuss, here, the issues related they likely share many charsets. We discuss, here, the issues related
to this duplication, and other peculiarities specific to the 'iconv' to this duplication, and other peculiarities specific to the 'iconv'
library. library.
The 'iconv' library provides transliteration between character sets, The 'RECODE_STRICT_MAPPING_FLAG' option, corresponding to the
using encodings with the suffix '-translit'. This corresponds to the '--strict' flag, is implemented by adding 'iconv' option '//IGNORE' to
'iconv' option '//TRANSLIT'. the 'after' encoding. This has the side effect that untranslatable
input is only signalled at the end of the conversion, whereas with
Similarly, the suffix '-ignore' tells 'iconv' to ignore invalid Recode's built-in conversion routines the error will be signalled
character sequences. This corresponds to the 'iconv' option '//IGNORE'. immediately.
The two suffixes can be combined using the suffix '-translit-ignore'; If the string '-translit' is appended to the AFTER encoding,
for example, 'iso-8859-1-translit-ignore'. characters being converted are transliterated when needed and possible.
This means that when a character cannot be represented in the target
character set, it can be approximated through one or several similar
looking characters. Characters that are outside of the target character
set and cannot be transliterated are replaced with a question mark (?)
in the output. This corresponds to the 'iconv' option '//TRANSLIT'.
To check whether 'iconv' is used for a particular conversion, just To check whether 'iconv' is used for a particular conversion, just
use the '-v' or '--verbose' option, *note Recoding::, and check whether use the '-v' or '--verbose' option, *note Recoding::, and check whether
':iconv:' appears as an intermediate charset. ':iconv:' appears as an intermediate charset.
The ':iconv:' charset represents a conceptual pivot charset within The ':iconv:' charset represents a conceptual pivot charset within
the external 'iconv' library (in fact, this pivot exists, but is not the external 'iconv' library (in fact, this pivot exists, but is not
directly reachable). This charset has a ':' (a mere colon) and directly reachable). This charset has a ':' (a mere colon) and
':libiconv:' for aliases. It is not allowed to recode from or to this ':libiconv:' for aliases. It is not allowed to recode from or to this
charset directly. But when this charset is selected as an intermediate, charset directly. But when this charset is selected as an intermediate,
skipping to change at line 4640 skipping to change at line 4658
quality (see 'recodext.h') and two functions you also provide. quality (see 'recodext.h') and two functions you also provide.
The first such function has the purpose of allocating structures, The first such function has the purpose of allocating structures,
pre-conditioning conversion tables, etc. It is also the way of further pre-conditioning conversion tables, etc. It is also the way of further
modifying the 'STEP' structure. This function is executed if and only modifying the 'STEP' structure. This function is executed if and only
if the single step is retained in an actual recoding sequence. If you if the single step is retained in an actual recoding sequence. If you
do not need such delayed initialisation, merely use 'NULL' for the do not need such delayed initialisation, merely use 'NULL' for the
function argument. function argument.
The second function executes the elementary recoding on a whole file. The second function executes the elementary recoding on a whole file.
There are a few cases when you can spare writing this function:
* Some single steps do nothing else than a pure copy of the input
onto the output, in this case, you can use the predefined function
'file_one_to_one', while having a delayed initialisation for
presetting the 'STEP' field 'one_to_one' to the predefined value
'one_to_same'.
* Some single steps are driven by a table which recodes one character
into another; if the recoding does nothing else, you can use the
predefined function 'file_one_to_one', while having a delayed
initialisation for presetting the 'STEP' field 'one_to_one' with
your table.
* Some single steps are driven by a table which recodes one character
into a string; if the recoding does nothing else, you can use the
predefined function 'file_one_to_many', while having a delayed
initialisation for presetting the 'STEP' field 'one_to_many' with
your table.
If you have a recoding table handy in a suitable format but do not If you have a recoding table handy in a suitable format but do not
use one of the predefined recoding functions, it is still a good idea to use one of the predefined recoding functions, it is still a good idea to
use a delayed initialisation to save it anyway, because 'recode' option use a delayed initialisation to save it anyway, because 'recode' option
'-h' will take advantage of this information when available. '-h' will take advantage of this information when available.
Finally, edit 'Makefile.am' to add the source file name of your Finally, edit 'Makefile.am' to add the source file name of your
routines to the 'C_STEPS' or 'L_STEPS' macro definition, depending on routines to the 'C_STEPS' or 'L_STEPS' macro definition, depending on
the fact your routines is written in C or in Flex. whether your routines are written in C or Flex.
File: recode.info, Node: New surfaces, Next: Design, Prev: New charsets, Up: Internals File: recode.info, Node: New surfaces, Next: Design, Prev: New charsets, Up: Internals
14.3 Adding new surfaces 14.3 Adding new surfaces
======================== ========================
Adding a new surface is technically quite similar to adding a new Adding a new surface is technically quite similar to adding a new
charset. *Note New charsets::. A surface is provided as a set of two charset. *Note New charsets::. A surface is provided as a set of two
transformations: one from the predefined special charset 'data' to the transformations: one from the predefined special charset 'data' to the
new surface, meant to apply the surface, the other from the new surface new surface, meant to apply the surface, the other from the new surface
skipping to change at line 4987 skipping to change at line 4986
* non canonical input, error message: Errors. (line 20) * non canonical input, error message: Errors. (line 20)
* normilise an HTML file: HTML. (line 110) * normilise an HTML file: HTML. (line 110)
* NOS 6/12 code: CDC-NOS. (line 9) * NOS 6/12 code: CDC-NOS. (line 9)
* numeric character references: HTML. (line 6) * numeric character references: HTML. (line 6)
* outer level functions: Outer level. (line 6) * outer level functions: Outer level. (line 6)
* partial conversion: Mixed. (line 20) * partial conversion: Mixed. (line 20)
* permutations of groups of bytes: Permutations. (line 6) * permutations of groups of bytes: Permutations. (line 6)
* pipe sequencing: Sequencing. (line 28) * pipe sequencing: Sequencing. (line 28)
* programming language support: Listings. (line 26) * programming language support: Listings. (line 26)
* program_name variable: Library. (line 11) * program_name variable: Library. (line 11)
* program_name variable <1>: Outer level. (line 99) * program_name variable <1>: Outer level. (line 112)
* pseudo-charsets: Charset overview. (line 34) * pseudo-charsets: Charset overview. (line 34)
* pure charset: Surface overview. (line 17) * pure charset: Surface overview. (line 17)
* quality of recoding: Recoding. (line 35) * quality of recoding: Recoding. (line 35)
* Recode internals: Internals. (line 6) * Recode internals: Internals. (line 6)
* Recode request syntax: Requests. (line 15) * Recode request syntax: Requests. (line 15)
* Recode use, a tutorial: Tutorial. (line 6) * Recode use, a tutorial: Tutorial. (line 6)
* Recode version, printing: Listings. (line 10) * Recode version, printing: Listings. (line 10)
* Recode, and RFC 1345: Tabular. (line 43) * Recode, and RFC 1345: Tabular. (line 43)
* Recode, main flow of operation: Main flow. (line 6) * Recode, main flow of operation: Main flow. (line 6)
* recode, operation as filter: Synopsis. (line 27) * recode, operation as filter: Synopsis. (line 27)
skipping to change at line 5124 skipping to change at line 5123
* abort_level: Task level. (line 187) * abort_level: Task level. (line 187)
* ascii_graphics: Request level. (line 111) * ascii_graphics: Request level. (line 111)
* byte_order_mark: Task level. (line 171) * byte_order_mark: Task level. (line 171)
* declare_step: New surfaces. (line 12) * declare_step: New surfaces. (line 12)
* DEFAULT_CHARSET: Requests. (line 103) * DEFAULT_CHARSET: Requests. (line 103)
* diacritics_only: Request level. (line 102) * diacritics_only: Request level. (line 102)
* diaeresis_char: Request level. (line 86) * diaeresis_char: Request level. (line 86)
* error_so_far: Task level. (line 199) * error_so_far: Task level. (line 199)
* fail_level: Task level. (line 177) * fail_level: Task level. (line 177)
* file_one_to_many: New charsets. (line 70)
* file_one_to_one: New charsets. (line 58)
* find_charset: Charset level. (line 15) * find_charset: Charset level. (line 15)
* LC_MESSAGES, when listing charsets: Listings. (line 211) * LC_MESSAGES, when listing charsets: Listings. (line 211)
* list_all_charsets: Charset level. (line 15) * list_all_charsets: Charset level. (line 15)
* list_concise_charset: Charset level. (line 15) * list_concise_charset: Charset level. (line 15)
* list_full_charset: Charset level. (line 15) * list_full_charset: Charset level. (line 15)
* make_header_flag: Request level. (line 93) * make_header_flag: Request level. (line 93)
* RECODE_AMBIGUOUS_OUTPUT: Errors. (line 33) * RECODE_AMBIGUOUS_OUTPUT: Errors. (line 33)
* recode_buffer_to_buffer: Request level. (line 157) * recode_buffer_to_buffer: Request level. (line 157)
* recode_buffer_to_file: Request level. (line 157) * recode_buffer_to_file: Request level. (line 157)
* recode_delete_outer: Outer level. (line 50) * recode_delete_outer: Outer level. (line 50)
skipping to change at line 5520 skipping to change at line 5517
* IBM871, aliases and source: Tabular. (line 371) * IBM871, aliases and source: Tabular. (line 371)
* IBM875, aliases and source: Tabular. (line 375) * IBM875, aliases and source: Tabular. (line 375)
* IBM880, aliases and source: Tabular. (line 379) * IBM880, aliases and source: Tabular. (line 379)
* IBM891, aliases and source: Tabular. (line 383) * IBM891, aliases and source: Tabular. (line 383)
* IBM903, aliases and source: Tabular. (line 387) * IBM903, aliases and source: Tabular. (line 387)
* IBM904, aliases and source: Tabular. (line 391) * IBM904, aliases and source: Tabular. (line 391)
* IBM905, aliases and source: Tabular. (line 395) * IBM905, aliases and source: Tabular. (line 395)
* IBM912: Tabular. (line 447) * IBM912: Tabular. (line 447)
* IBM918, aliases and source: Tabular. (line 399) * IBM918, aliases and source: Tabular. (line 399)
* Icon-QNX, and aliases: Icon-QNX. (line 6) * Icon-QNX, and aliases: Icon-QNX. (line 6)
* iconv: iconv. (line 28) * iconv: iconv. (line 33)
* iconv, not in requests: Charset overview. (line 34) * iconv, not in requests: Charset overview. (line 34)
* IEC_P27-1, aliases and source: Tabular. (line 403) * IEC_P27-1, aliases and source: Tabular. (line 403)
* INIS, aliases and source: Tabular. (line 407) * INIS, aliases and source: Tabular. (line 407)
* INIS-8, aliases and source: Tabular. (line 411) * INIS-8, aliases and source: Tabular. (line 411)
* INIS-cyrillic, aliases and source: Tabular. (line 415) * INIS-cyrillic, aliases and source: Tabular. (line 415)
* INVARIANT, aliases and source: Tabular. (line 419) * INVARIANT, aliases and source: Tabular. (line 419)
* irv: Tabular. (line 511) * irv: Tabular. (line 511)
* ISO 5426, a charset: ISO 5426 and ANSEL. (line 6) * ISO 5426, a charset: ISO 5426 and ANSEL. (line 6)
* ISO-10646-UCS-2, and aliases: UCS-2. (line 31) * ISO-10646-UCS-2, and aliases: UCS-2. (line 31)
* ISO-10646-UCS-4, and aliases: UCS-4. (line 10) * ISO-10646-UCS-4, and aliases: UCS-4. (line 10)
 End of changes. 12 change blocks. 
39 lines changed or deleted 36 lines changed or added
