lzlib.info (lzlib-1.12.tar.lz) | : | lzlib.info (lzlib-1.13.tar.lz) | ||
---|---|---|---|---|
File: lzlib.info, Node: Top, Next: Introduction, Up: (dir) | File: lzlib.info, Node: Top, Next: Introduction, Up: (dir) | |||
Lzlib Manual | Lzlib Manual | |||
************ | ************ | |||
This manual is for Lzlib (version 1.12, 2 January 2021). | This manual is for Lzlib (version 1.13, 23 January 2022). | |||
* Menu: | * Menu: | |||
* Introduction:: Purpose and features of lzlib | * Introduction:: Purpose and features of lzlib | |||
* Library version:: Checking library version | * Library version:: Checking library version | |||
* Buffering:: Sizes of lzlib's buffers | * Buffering:: Sizes of lzlib's buffers | |||
* Parameter limits:: Min / max values for some parameters | * Parameter limits:: Min / max values for some parameters | |||
* Compression functions:: Descriptions of the compression functions | * Compression functions:: Descriptions of the compression functions | |||
* Decompression functions:: Descriptions of the decompression functions | * Decompression functions:: Descriptions of the decompression functions | |||
* Error codes:: Meaning of codes returned by functions | * Error codes:: Meaning of codes returned by functions | |||
* Error messages:: Error messages corresponding to error codes | * Error messages:: Error messages corresponding to error codes | |||
* Invoking minilzip:: Command line interface of the test program | * Invoking minilzip:: Command line interface of the test program | |||
* Data format:: Detailed format of the compressed data | * Data format:: Detailed format of the compressed data | |||
* Examples:: A small tutorial with examples | * Examples:: A small tutorial with examples | |||
* Problems:: Reporting bugs | * Problems:: Reporting bugs | |||
* Concept index:: Index of concepts | * Concept index:: Index of concepts | |||
Copyright (C) 2009-2021 Antonio Diaz Diaz. | Copyright (C) 2009-2022 Antonio Diaz Diaz. | |||
This manual is free documentation: you have unlimited permission to copy, | This manual is free documentation: you have unlimited permission to copy, | |||
distribute, and modify it. | distribute, and modify it. | |||
File: lzlib.info, Node: Introduction, Next: Library version, Prev: Top, Up: Top | File: lzlib.info, Node: Introduction, Next: Library version, Prev: Top, Up: Top | |||
1 Introduction | 1 Introduction | |||
************** | ************** | |||
Lzlib is a data compression library providing in-memory LZMA compression and | Lzlib is a data compression library providing in-memory LZMA compression and | |||
skipping to change at line 66 | skipping to change at line 66 | |||
* Additionally the lzip reference implementation is copylefted, which | * Additionally the lzip reference implementation is copylefted, which | |||
guarantees that it will remain free forever. | guarantees that it will remain free forever. | |||
A nice feature of the lzip format is that a corrupt byte is easier to | A nice feature of the lzip format is that a corrupt byte is easier to | |||
repair the nearer it is from the beginning of the file. Therefore, with the | repair the nearer it is from the beginning of the file. Therefore, with the | |||
help of lziprecover, losing an entire archive just because of a corrupt | help of lziprecover, losing an entire archive just because of a corrupt | |||
byte near the beginning is a thing of the past. | byte near the beginning is a thing of the past. | |||
The functions and variables forming the interface of the compression | The functions and variables forming the interface of the compression | |||
library are declared in the file 'lzlib.h'. Usage examples of the library | library are declared in the file 'lzlib.h'. Usage examples of the library | |||
are given in the files 'bbexample.c', 'ffexample.c', and 'main.c' from the | are given in the files 'bbexample.c', 'ffexample.c', and 'minilzip.c' from | |||
source distribution. | the source distribution. | |||
All the library functions are thread safe. The library does not install | ||||
any signal handler. The decoder checks the consistency of the compressed | ||||
data, so the library should never crash even in case of corrupted input. | ||||
Compression/decompression is done by repeatedly calling a couple of | Compression/decompression is done by repeatedly calling a couple of | |||
read/write functions until all the data have been processed by the library. | read/write functions until all the data have been processed by the library. | |||
This interface is safer and less error prone than the traditional zlib | This interface is safer and less error prone than the traditional zlib | |||
interface. | interface. | |||
Compression/decompression is done when the read function is called. This | Compression/decompression is done when the read function is called. This | |||
means the value returned by the position functions will not be updated until | means the value returned by the position functions will not be updated until | |||
a read call, even if a lot of data are written. If you want the data to be | a read call, even if a lot of data are written. If you want the data to be | |||
compressed in advance, just call the read function with a SIZE equal to 0. | compressed in advance, just call the read function with a SIZE equal to 0. | |||
skipping to change at line 95 | skipping to change at line 99 | |||
Lzlib will correctly decompress a data stream which is the concatenation | Lzlib will correctly decompress a data stream which is the concatenation | |||
of two or more compressed data streams. The result is the concatenation of | of two or more compressed data streams. The result is the concatenation of | |||
the corresponding decompressed data streams. Integrity testing of | the corresponding decompressed data streams. Integrity testing of | |||
concatenated compressed data streams is also supported. | concatenated compressed data streams is also supported. | |||
Lzlib is able to compress and decompress streams of unlimited size by | Lzlib is able to compress and decompress streams of unlimited size by | |||
automatically creating multimember output. The members so created are large, | automatically creating multimember output. The members so created are large, | |||
about 2 PiB each. | about 2 PiB each. | |||
All the library functions are thread safe. The library does not install | ||||
any signal handler. The decoder checks the consistency of the compressed | ||||
data, so the library should never crash even in case of corrupted input. | ||||
In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a | In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a | |||
concrete algorithm; it is more like "any algorithm using the LZMA coding | concrete algorithm; it is more like "any algorithm using the LZMA coding | |||
scheme". For example, the option '-0' of lzip uses the scheme in almost the | scheme". For example, the option '-0' of lzip uses the scheme in almost the | |||
simplest way possible; issuing the longest match it can find, or a literal | simplest way possible; issuing the longest match it can find, or a literal | |||
byte if it can't find a match. Inversely, a much more elaborated way of | byte if it can't find a match. Inversely, a much more elaborated way of | |||
finding coding sequences of minimum size than the one currently used by | finding coding sequences of minimum size than the one currently used by lzip | |||
lzip could be developed, and the resulting sequence could also be coded | could be developed, and the resulting sequence could also be coded using the | |||
using the LZMA coding scheme. | LZMA coding scheme. | |||
Lzlib currently implements two variants of the LZMA algorithm; fast | Lzlib currently implements two variants of the LZMA algorithm: fast | |||
(used by option '-0' of minilzip) and normal (used by all other compression | (used by option '-0' of minilzip) and normal (used by all other compression | |||
levels). | levels). | |||
The high compression of LZMA comes from combining two basic, well-proven | The high compression of LZMA comes from combining two basic, well-proven | |||
compression ideas: sliding dictionaries (LZ77/78) and markov models (the | compression ideas: sliding dictionaries (LZ77/78) and markov models (the | |||
thing used by every compression algorithm that uses a range encoder or | thing used by every compression algorithm that uses a range encoder or | |||
similar order-0 entropy coder as its last stage) with segregation of | similar order-0 entropy coder as its last stage) with segregation of | |||
contexts according to what the bits are used for. | contexts according to what the bits are used for. | |||
The ideas embodied in lzlib are due to (at least) the following people: | The ideas embodied in lzlib are due to (at least) the following people: | |||
skipping to change at line 137 | skipping to change at line 137 | |||
File: lzlib.info, Node: Library version, Next: Buffering, Prev: Introduction, Up: Top | File: lzlib.info, Node: Library version, Next: Buffering, Prev: Introduction, Up: Top | |||
2 Library version | 2 Library version | |||
***************** | ***************** | |||
One goal of lzlib is to keep perfect backward compatibility with older | One goal of lzlib is to keep perfect backward compatibility with older | |||
versions of itself down to 1.0. Any application working with an older lzlib | versions of itself down to 1.0. Any application working with an older lzlib | |||
should work with a newer lzlib. Installing a newer lzlib should not break | should work with a newer lzlib. Installing a newer lzlib should not break | |||
anything. This chapter describes the constants and functions that the | anything. This chapter describes the constants and functions that the | |||
application can use to discover the version of the library being used. | application can use to discover the version of the library being used. All | |||
of them are declared in 'lzlib.h'. | ||||
-- Constant: LZ_API_VERSION | -- Constant: LZ_API_VERSION | |||
This constant is defined in 'lzlib.h' and works as a version test | This constant is defined in 'lzlib.h' and works as a version test | |||
macro. The application should verify at compile time that | macro. The application should verify at compile time that | |||
LZ_API_VERSION is greater than or equal to the version required by the | LZ_API_VERSION is greater than or equal to the version required by the | |||
application: | application: | |||
#if !defined LZ_API_VERSION || LZ_API_VERSION < 1012 | #if !defined LZ_API_VERSION || LZ_API_VERSION < 1012 | |||
#error "lzlib 1.12 or newer needed." | #error "lzlib 1.12 or newer needed." | |||
#endif | #endif | |||
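The version test above compares LZ_API_VERSION against 1012 to require lzlib 1.12. As a minimal sketch, assuming the value is encoded as (major * 1000) + minor (which matches 1012 for version 1.12), the helper below splits such a value into its two parts. The names `api_version` and `split_api_version` are hypothetical and are not part of lzlib.

```c
#include <assert.h>

/* Hypothetical helper, not part of lzlib.
   Assumption: the value is encoded as (major * 1000) + minor,
   so that 1012 stands for version 1.12. */
struct api_version { int major; int minor; };

static struct api_version split_api_version(const int value)
{
  struct api_version v;
  assert(value >= 0);          /* a negative value has no meaning here */
  v.major = value / 1000;      /* thousands carry the major number */
  v.minor = value % 1000;      /* remainder carries the minor number */
  return v;
}
```

An application could pass the macro LZ_API_VERSION to such a helper to print the version it was compiled against. Values that do not follow the assumed encoding will not split into a meaningful major/minor pair.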
skipping to change at line 314 | skipping to change at line 315 | |||
'LZ_compress_read'). *Note member_size::, for a description of | 'LZ_compress_read'). *Note member_size::, for a description of | |||
MEMBER_SIZE. | MEMBER_SIZE. | |||
-- Function: int LZ_compress_sync_flush ( struct LZ_Encoder * const | -- Function: int LZ_compress_sync_flush ( struct LZ_Encoder * const | |||
ENCODER ) | ENCODER ) | |||
Use this function to make available to 'LZ_compress_read' all the data | Use this function to make available to 'LZ_compress_read' all the data | |||
already written with the function 'LZ_compress_write'. First call | already written with the function 'LZ_compress_write'. First call | |||
'LZ_compress_sync_flush'. Then call 'LZ_compress_read' until it | 'LZ_compress_sync_flush'. Then call 'LZ_compress_read' until it | |||
returns 0. | returns 0. | |||
This function writes a LZMA marker '3' ("Sync Flush" marker) to the | This function writes at least one LZMA marker '3' ("Sync Flush" marker) | |||
compressed output. Note that the sync flush marker is not allowed in | to the compressed output. Note that the sync flush marker is not | |||
lzip files; it is a device for interactive communication between | allowed in lzip files; it is a device for interactive communication | |||
applications using lzlib, but is useless and wasteful in a file, and | between applications using lzlib, but is useless and wasteful in a | |||
is excluded from the media type 'application/lzip'. The LZMA marker | file, and is excluded from the media type 'application/lzip'. The LZMA | |||
'2' ("End Of Stream" marker) is the only marker allowed in lzip files. | marker '2' ("End Of Stream" marker) is the only marker allowed in lzip | |||
*Note Data format::. | files. *Note Data format::. | |||
Repeated use of 'LZ_compress_sync_flush' may degrade compression | Repeated use of 'LZ_compress_sync_flush' may degrade compression | |||
ratio, so use it only when needed. If the interval between calls to | ratio, so use it only when needed. If the interval between calls to | |||
'LZ_compress_sync_flush' is large (comparable to dictionary size), | 'LZ_compress_sync_flush' is large (comparable to dictionary size), | |||
creating a multimember data stream with 'LZ_compress_restart_member' | creating a multimember data stream with 'LZ_compress_restart_member' | |||
may be an alternative. | may be an alternative. | |||
Combining multimember stream creation with flushing may be tricky. If | Combining multimember stream creation with flushing may be tricky. If | |||
there are more bytes available than those needed to complete | there are more bytes available than those needed to complete | |||
MEMBER_SIZE, 'LZ_compress_restart_member' needs to be called when | MEMBER_SIZE, 'LZ_compress_restart_member' needs to be called when | |||
'LZ_compress_member_finished' returns 1, followed by a new call to | 'LZ_compress_member_finished' returns 1, followed by a new call to | |||
'LZ_compress_sync_flush'. | 'LZ_compress_sync_flush'. | |||
-- Function: int LZ_compress_read ( struct LZ_Encoder * const ENCODER, | -- Function: int LZ_compress_read ( struct LZ_Encoder * const ENCODER, | |||
uint8_t * const BUFFER, const int SIZE ) | uint8_t * const BUFFER, const int SIZE ) | |||
The function 'LZ_compress_read' reads up to SIZE bytes from the stream | Reads up to SIZE bytes from the stream pointed to by ENCODER, storing | |||
pointed to by ENCODER, storing the results in BUFFER. If | the results in BUFFER. If LZ_API_VERSION >= 1012, BUFFER may be a null | |||
LZ_API_VERSION >= 1012, BUFFER may be a null pointer, in which case | pointer, in which case the bytes read are discarded. | |||
the bytes read are discarded. | ||||
Returns the number of bytes actually read. This might be less than | ||||
The return value is the number of bytes actually read. This might be | SIZE; for example, if there aren't that many bytes left in the stream | |||
less than SIZE; for example, if there aren't that many bytes left in | or if more bytes have to be yet written with the function | |||
the stream or if more bytes have to be yet written with the function | ||||
'LZ_compress_write'. Note that reading less than SIZE bytes is not an | 'LZ_compress_write'. Note that reading less than SIZE bytes is not an | |||
error. | error. | |||
-- Function: int LZ_compress_write ( struct LZ_Encoder * const ENCODER, | -- Function: int LZ_compress_write ( struct LZ_Encoder * const ENCODER, | |||
uint8_t * const BUFFER, const int SIZE ) | uint8_t * const BUFFER, const int SIZE ) | |||
The function 'LZ_compress_write' writes up to SIZE bytes from BUFFER | Writes up to SIZE bytes from BUFFER to the stream pointed to by | |||
to the stream pointed to by ENCODER. | ENCODER. Returns the number of bytes actually written. This might be | |||
The return value is the number of bytes actually written. This might be | ||||
less than SIZE. Note that writing less than SIZE bytes is not an error. | less than SIZE. Note that writing less than SIZE bytes is not an error. | |||
-- Function: int LZ_compress_write_size ( struct LZ_Encoder * const | -- Function: int LZ_compress_write_size ( struct LZ_Encoder * const | |||
ENCODER ) | ENCODER ) | |||
The function 'LZ_compress_write_size' returns the maximum number of | Returns the maximum number of bytes that can be immediately written | |||
bytes that can be immediately written through 'LZ_compress_write'. For | through 'LZ_compress_write'. For efficiency reasons, once the input | |||
efficiency reasons, once the input buffer is full and | buffer is full and 'LZ_compress_write_size' returns 0, almost all the | |||
'LZ_compress_write_size' returns 0, almost all the buffer must be | buffer must be compressed before a size greater than 0 is returned | |||
compressed before a size greater than 0 is returned again. (This is | again. (This is done to minimize the amount of data that must be | |||
done to minimize the amount of data that must be copied to the | copied to the beginning of the buffer before new data can be accepted). | |||
beginning of the buffer before new data can be accepted). | ||||
It is guaranteed that an immediate call to 'LZ_compress_write' will | It is guaranteed that an immediate call to 'LZ_compress_write' will | |||
accept a SIZE up to the returned number of bytes. | accept a SIZE up to the returned number of bytes. | |||
-- Function: enum LZ_Errno LZ_compress_errno ( struct LZ_Encoder * const | -- Function: enum LZ_Errno LZ_compress_errno ( struct LZ_Encoder * const | |||
ENCODER ) | ENCODER ) | |||
Returns the current error code for ENCODER. *Note Error codes::. It is | Returns the current error code for ENCODER. *Note Error codes::. It is | |||
safe to call 'LZ_compress_errno' with a null argument, in which case | safe to call 'LZ_compress_errno' with a null argument, in which case | |||
it returns 'LZ_bad_argument'. | it returns 'LZ_bad_argument'. | |||
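The read and write functions described above may transfer fewer bytes than requested, and that is not an error. The sketch below illustrates the resulting loop shape with a hypothetical stand-in stream (`Fake_stream`, `fake_write`, `fake_read`, and `pump` are invented for illustration). The stand-in only copies bytes through a small bounded buffer and does not compress anything; it merely mimics the call shape of 'LZ_compress_write' and 'LZ_compress_read'.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an lzlib stream: a small bounded buffer whose
   write/read functions have the same shape as LZ_compress_write and
   LZ_compress_read. It copies bytes unchanged; it does NOT compress. */
enum { fake_capacity = 8 };
struct Fake_stream { uint8_t data[fake_capacity]; int used; };

/* Accepts up to 'size' bytes; may accept fewer (even 0) when full. */
static int fake_write(struct Fake_stream *s, const uint8_t *buffer, int size)
{
  const int room = fake_capacity - s->used;
  if (size > room) size = room;
  memcpy(s->data + s->used, buffer, size);
  s->used += size;
  return size;
}

/* Returns up to 'size' bytes; may return fewer (even 0) when empty. */
static int fake_read(struct Fake_stream *s, uint8_t *buffer, int size)
{
  if (size > s->used) size = s->used;
  memcpy(buffer, s->data, size);
  memmove(s->data, s->data + size, s->used - size);
  s->used -= size;
  return size;
}

/* Pushes all of 'in' through the stream into 'out', treating short writes
   and short reads as normal. Returns the number of bytes stored in 'out',
   or -1 if 'out' is too small to hold everything. */
static int pump(struct Fake_stream *s, const uint8_t *in, const int in_size,
                uint8_t *out, const int out_size)
{
  int in_pos = 0, out_pos = 0;
  while (in_pos < in_size || s->used > 0) {
    /* Offer everything that is left; the stream takes what fits. */
    in_pos += fake_write(s, in + in_pos, in_size - in_pos);
    /* Drain whatever is available so that the next write has room. */
    const int rd = fake_read(s, out + out_pos, out_size - out_pos);
    out_pos += rd;
    if (rd == 0 && s->used > 0) return -1;   /* output buffer exhausted */
  }
  return out_pos;
}
```

With the real library the same alternation of write and read calls applies, but the application would also need to signal the end of input and to check the error code after a failed call, which this stand-in omits.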
skipping to change at line 460 | skipping to change at line 457 | |||
'LZ_decompress_write' will be consumed and 'LZ_decompress_read' will | 'LZ_decompress_write' will be consumed and 'LZ_decompress_read' will | |||
return 0 until a header is found. | return 0 until a header is found. | |||
This function is useful to discard any data preceding the first member, | This function is useful to discard any data preceding the first member, | |||
or to discard the rest of the current member, for example in case of a | or to discard the rest of the current member, for example in case of a | |||
data error. If the decoder is already at the beginning of a member, | data error. If the decoder is already at the beginning of a member, | |||
this function does nothing. | this function does nothing. | |||
-- Function: int LZ_decompress_read ( struct LZ_Decoder * const DECODER, | -- Function: int LZ_decompress_read ( struct LZ_Decoder * const DECODER, | |||
uint8_t * const BUFFER, const int SIZE ) | uint8_t * const BUFFER, const int SIZE ) | |||
The function 'LZ_decompress_read' reads up to SIZE bytes from the | Reads up to SIZE bytes from the stream pointed to by DECODER, storing | |||
stream pointed to by DECODER, storing the results in BUFFER. If | the results in BUFFER. If LZ_API_VERSION >= 1012, BUFFER may be a null | |||
LZ_API_VERSION >= 1012, BUFFER may be a null pointer, in which case | pointer, in which case the bytes read are discarded. | |||
the bytes read are discarded. | ||||
Returns the number of bytes actually read. This might be less than | ||||
The return value is the number of bytes actually read. This might be | SIZE; for example, if there aren't that many bytes left in the stream | |||
less than SIZE; for example, if there aren't that many bytes left in | or if more bytes have to be yet written with the function | |||
the stream or if more bytes have to be yet written with the function | ||||
'LZ_decompress_write'. Note that reading less than SIZE bytes is not | 'LZ_decompress_write'. Note that reading less than SIZE bytes is not | |||
an error. | an error. | |||
'LZ_decompress_read' returns at least once per member so that | 'LZ_decompress_read' returns at least once per member so that | |||
'LZ_decompress_member_finished' can be called (and trailer data | 'LZ_decompress_member_finished' can be called (and trailer data | |||
retrieved) for each member, even for empty members. Therefore, | retrieved) for each member, even for empty members. Therefore, | |||
'LZ_decompress_read' returning 0 does not mean that the end of the | 'LZ_decompress_read' returning 0 does not mean that the end of the | |||
stream has been reached. The increase in the value returned by | stream has been reached. The increase in the value returned by | |||
'LZ_decompress_total_in_size' can be used to tell the end of the stream | 'LZ_decompress_total_in_size' can be used to tell the end of the stream | |||
from an empty member. | from an empty member. | |||
In case of decompression error caused by corrupt or truncated data, | In case of decompression error caused by corrupt or truncated data, | |||
'LZ_decompress_read' does not signal the error immediately to the | 'LZ_decompress_read' does not signal the error immediately to the | |||
application, but waits until all the bytes decoded have been read. This | application, but waits until all the bytes decoded have been read. This | |||
allows tools like tarlz to recover as much data as possible from each | allows tools like tarlz to recover as much data as possible from each | |||
damaged member. *Note tarlz manual: (tarlz)Top. | damaged member. *Note tarlz manual: (tarlz)Top. | |||
-- Function: int LZ_decompress_write ( struct LZ_Decoder * const DECODER, | -- Function: int LZ_decompress_write ( struct LZ_Decoder * const DECODER, | |||
uint8_t * const BUFFER, const int SIZE ) | uint8_t * const BUFFER, const int SIZE ) | |||
The function 'LZ_decompress_write' writes up to SIZE bytes from BUFFER | Writes up to SIZE bytes from BUFFER to the stream pointed to by | |||
to the stream pointed to by DECODER. | DECODER. Returns the number of bytes actually written. This might be | |||
The return value is the number of bytes actually written. This might be | ||||
less than SIZE. Note that writing less than SIZE bytes is not an error. | less than SIZE. Note that writing less than SIZE bytes is not an error. | |||
-- Function: int LZ_decompress_write_size ( struct LZ_Decoder * const | -- Function: int LZ_decompress_write_size ( struct LZ_Decoder * const | |||
DECODER ) | DECODER ) | |||
The function 'LZ_decompress_write_size' returns the maximum number of | Returns the maximum number of bytes that can be immediately written | |||
bytes that can be immediately written through 'LZ_decompress_write'. | through 'LZ_decompress_write'. This number varies smoothly; each | |||
This number varies smoothly; each compressed byte consumed may be | compressed byte consumed may be overwritten immediately, increasing by | |||
overwritten immediately, increasing by 1 the value returned. | 1 the value returned. | |||
It is guaranteed that an immediate call to 'LZ_decompress_write' will | It is guaranteed that an immediate call to 'LZ_decompress_write' will | |||
accept a SIZE up to the returned number of bytes. | accept a SIZE up to the returned number of bytes. | |||
-- Function: enum LZ_Errno LZ_decompress_errno ( struct LZ_Decoder * const | -- Function: enum LZ_Errno LZ_decompress_errno ( struct LZ_Decoder * const | |||
DECODER ) | DECODER ) | |||
Returns the current error code for DECODER. *Note Error codes::. It is | Returns the current error code for DECODER. *Note Error codes::. It is | |||
safe to call 'LZ_decompress_errno' with a null argument, in which case | safe to call 'LZ_decompress_errno' with a null argument, in which case | |||
it returns 'LZ_bad_argument'. | it returns 'LZ_bad_argument'. | |||
-- Function: int LZ_decompress_finished ( struct LZ_Decoder * const | -- Function: int LZ_decompress_finished ( struct LZ_Decoder * const | |||
DECODER ) | DECODER ) | |||
Returns 1 if all the data have been read and 'LZ_decompress_close' can | Returns 1 if all the data have been read and 'LZ_decompress_close' can | |||
be safely called. Otherwise it returns 0. 'LZ_decompress_finished' | be safely called. Otherwise it returns 0. 'LZ_decompress_finished' | |||
does not imply 'LZ_decompress_member_finished'. | does not imply 'LZ_decompress_member_finished'. | |||
-- Function: int LZ_decompress_member_finished ( struct LZ_Decoder * const | -- Function: int LZ_decompress_member_finished ( struct LZ_Decoder * const | |||
DECODER ) | DECODER ) | |||
Returns 1 if the previous call to 'LZ_decompress_read' finished reading | Returns 1 if the previous call to 'LZ_decompress_read' finished reading | |||
the current member, indicating that final values for member are | the current member, indicating that final values for the member are | |||
available through 'LZ_decompress_data_crc', | available through 'LZ_decompress_data_crc', | |||
'LZ_decompress_data_position', and 'LZ_decompress_member_position'. | 'LZ_decompress_data_position', and 'LZ_decompress_member_position'. | |||
Otherwise it returns 0. | Otherwise it returns 0. | |||
-- Function: int LZ_decompress_member_version ( struct LZ_Decoder * const | -- Function: int LZ_decompress_member_version ( struct LZ_Decoder * const | |||
DECODER ) | DECODER ) | |||
Returns the version of current member from member header. | Returns the version of the current member, read from the member header. | |||
-- Function: int LZ_decompress_dictionary_size ( struct LZ_Decoder * const | -- Function: int LZ_decompress_dictionary_size ( struct LZ_Decoder * const | |||
DECODER ) | DECODER ) | |||
Returns the dictionary size of the current member, read from the member | Returns the dictionary size of the current member, read from the | |||
header. | member header. | |||
-- Function: unsigned LZ_decompress_data_crc ( struct LZ_Decoder * const | -- Function: unsigned LZ_decompress_data_crc ( struct LZ_Decoder * const | |||
DECODER ) | DECODER ) | |||
Returns the 32 bit Cyclic Redundancy Check of the data decompressed | Returns the 32 bit Cyclic Redundancy Check of the data decompressed | |||
from the current member. The returned value is valid only when | from the current member. The value returned is valid only when | |||
'LZ_decompress_member_finished' returns 1. | 'LZ_decompress_member_finished' returns 1. | |||
-- Function: unsigned long long LZ_decompress_data_position ( struct | -- Function: unsigned long long LZ_decompress_data_position ( struct | |||
LZ_Decoder * const DECODER ) | LZ_Decoder * const DECODER ) | |||
Returns the number of decompressed bytes already produced, but perhaps | Returns the number of decompressed bytes already produced, but perhaps | |||
not yet read, in the current member. | not yet read, in the current member. | |||
-- Function: unsigned long long LZ_decompress_member_position ( struct | -- Function: unsigned long long LZ_decompress_member_position ( struct | |||
LZ_Decoder * const DECODER ) | LZ_Decoder * const DECODER ) | |||
Returns the number of input bytes already decompressed in the current | Returns the number of input bytes already decompressed in the current | |||
skipping to change at line 635 | skipping to change at line 629 | |||
File: lzlib.info, Node: Invoking minilzip, Next: Data format, Prev: Error messages, Up: Top | File: lzlib.info, Node: Invoking minilzip, Next: Data format, Prev: Error messages, Up: Top | |||
9 Invoking minilzip | 9 Invoking minilzip | |||
******************* | ******************* | |||
Minilzip is a test program for the compression library lzlib, fully | Minilzip is a test program for the compression library lzlib, fully | |||
compatible with lzip 1.4 or newer. | compatible with lzip 1.4 or newer. | |||
Lzip is a lossless data compressor with a user interface similar to the | Lzip is a lossless data compressor with a user interface similar to the | |||
one of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov | one of gzip or bzip2. Lzip uses a simplified form of the 'Lempel-Ziv-Markov | |||
chain-Algorithm' (LZMA) stream format, chosen to maximize safety and | chain-Algorithm' (LZMA) stream format and provides a 3 factor integrity | |||
interoperability. Lzip can compress about as fast as gzip (lzip -0) or | checking to maximize interoperability and optimize safety. Lzip can compress | |||
compress most files more than bzip2 (lzip -9). Decompression speed is | about as fast as gzip (lzip -0) or compress most files more than bzip2 | |||
intermediate between gzip and bzip2. Lzip is better than gzip and bzip2 | (lzip -9). Decompression speed is intermediate between gzip and bzip2. Lzip | |||
from a data recovery perspective. Lzip has been designed, written, and | is better than gzip and bzip2 from a data recovery perspective. Lzip has | |||
tested with great care to replace gzip and bzip2 as the standard | been designed, written, and tested with great care to replace gzip and | |||
general-purpose compressed format for unix-like systems. | bzip2 as the standard general-purpose compressed format for unix-like | |||
systems. | ||||
The format for running minilzip is: | The format for running minilzip is: | |||
minilzip [OPTIONS] [FILES] | minilzip [OPTIONS] [FILES] | |||
If no file names are specified, minilzip compresses (or decompresses) from | If no file names are specified, minilzip compresses (or decompresses) from | |||
standard input to standard output. A hyphen '-' used as a FILE argument | standard input to standard output. A hyphen '-' used as a FILE argument | |||
means standard input. It can be mixed with other FILES and is read just | means standard input. It can be mixed with other FILES and is read just | |||
once, the first time it appears in the command line. | once, the first time it appears in the command line. | |||
skipping to change at line 690 | skipping to change at line 685 | |||
Compress or decompress to standard output; keep input files unchanged. | Compress or decompress to standard output; keep input files unchanged. | |||
If compressing several files, each file is compressed independently. | If compressing several files, each file is compressed independently. | |||
(The output consists of a sequence of independently compressed | (The output consists of a sequence of independently compressed | |||
members). This option (or '-o') is needed when reading from a named | members). This option (or '-o') is needed when reading from a named | |||
pipe (fifo) or from a device. Use it also to recover as much of the | pipe (fifo) or from a device. Use it also to recover as much of the | |||
decompressed data as possible when decompressing a corrupt file. '-c' | decompressed data as possible when decompressing a corrupt file. '-c' | |||
overrides '-o' and '-S'. '-c' has no effect when testing or listing. | overrides '-o' and '-S'. '-c' has no effect when testing or listing. | |||
'-d' | '-d' | |||
'--decompress' | '--decompress' | |||
Decompress the files specified. If a file does not exist or can't be | Decompress the files specified. If a file does not exist, can't be | |||
opened, minilzip continues decompressing the rest of the files. If a | opened, or the destination file already exists and '--force' has not | |||
file fails to decompress, or is a terminal, minilzip exits immediately | been specified, minilzip continues decompressing the rest of the files | |||
without decompressing the rest of the files. | and exits with error status 1. If a file fails to decompress, or is a | |||
terminal, minilzip exits immediately with error status 2 without | ||||
decompressing the rest of the files. A terminal is considered an | ||||
uncompressed file, and therefore invalid. | ||||
'-f' | '-f' | |||
'--force' | '--force' | |||
Force overwrite of output files. | Force overwrite of output files. | |||
'-F' | '-F' | |||
'--recompress' | '--recompress' | |||
When compressing, force re-compression of files whose name already has | When compressing, force re-compression of files whose name already has | |||
the '.lz' or '.tlz' suffix. | the '.lz' or '.tlz' suffix. | |||
skipping to change at line 816 | skipping to change at line 814 | |||
Aliases for GNU gzip compatibility. | Aliases for GNU gzip compatibility. | |||
'--loose-trailing' | '--loose-trailing' | |||
When decompressing or testing, allow trailing data whose first bytes | When decompressing or testing, allow trailing data whose first bytes | |||
are so similar to the magic bytes of a lzip header that they can be | are so similar to the magic bytes of a lzip header that they can be | |||
confused with a corrupt header. Use this option if a file triggers a | confused with a corrupt header. Use this option if a file triggers a | |||
"corrupt header" error and the cause is not indeed a corrupt header. | "corrupt header" error and the cause is not indeed a corrupt header. | |||
'--check-lib' | '--check-lib' | |||
Compare the version of lzlib used to compile minilzip with the version | Compare the version of lzlib used to compile minilzip with the version | |||
actually being used and exit. Report any differences found. Exit with | actually being used at run time and exit. Report any differences | |||
error status 1 if differences are found. A mismatch may indicate that | found. Exit with error status 1 if differences are found. A mismatch | |||
lzlib is not correctly installed or that a different version of lzlib | may indicate that lzlib is not correctly installed or that a different | |||
has been installed after compiling the shared version of minilzip. | version of lzlib has been installed after compiling the shared version | |||
'minilzip -v --check-lib' shows the version of lzlib being used and | of minilzip. Exit with error status 2 if LZ_API_VERSION and | |||
the value of 'LZ_API_VERSION' (if defined). *Note Library version::. | LZ_version_string don't match. 'minilzip -v --check-lib' shows the | |||
version of lzlib being used and the value of LZ_API_VERSION (if | ||||
defined). *Note Library version::. | ||||
Numbers given as arguments to options may be followed by a multiplier | Numbers given as arguments to options may be followed by a multiplier | |||
and an optional 'B' for "byte". | and an optional 'B' for "byte". | |||
Table of SI and binary prefixes (unit multipliers): | Table of SI and binary prefixes (unit multipliers): | |||
Prefix Value | Prefix Value | Prefix Value | Prefix Value | |||
k kilobyte (10^3 = 1000) | Ki kibibyte (2^10 = 1024) | k kilobyte (10^3 = 1000) | Ki kibibyte (2^10 = 1024) | |||
M megabyte (10^6) | Mi mebibyte (2^20) | M megabyte (10^6) | Mi mebibyte (2^20) | |||
G gigabyte (10^9) | Gi gibibyte (2^30) | G gigabyte (10^9) | Gi gibibyte (2^30) | |||
T terabyte (10^12) | Ti tebibyte (2^40) | T terabyte (10^12) | Ti tebibyte (2^40) | |||
P petabyte (10^15) | Pi pebibyte (2^50) | P petabyte (10^15) | Pi pebibyte (2^50) | |||
E exabyte (10^18) | Ei exbibyte (2^60) | E exabyte (10^18) | Ei exbibyte (2^60) | |||
Z zettabyte (10^21) | Zi zebibyte (2^70) | Z zettabyte (10^21) | Zi zebibyte (2^70) | |||
Y yottabyte (10^24) | Yi yobibyte (2^80) | Y yottabyte (10^24) | Yi yobibyte (2^80) | |||
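The multiplier table above can be sketched in C as follows. This is an illustration only, not minilzip's actual argument parser: the function name 'parse_size' and its error handling are assumptions made for this example. It accepts a decimal number, an optional prefix from the table ('k' for SI, 'Ki' for binary), and an optional trailing 'B', and rejects values that do not fit in 64 bits.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of lzlib or minilzip).
   Parse "<digits>[prefix]['i']['B']" into '*result'.
   Return false on syntax error or if the value does not fit in 64 bits. */
bool parse_size( const char * const arg, uint64_t * const result )
  {
  static const char prefixes[] = "kMGTPEZY";	/* powers 1 to 8 */
  char * tail;
  unsigned long long value;
  unsigned factor = 1000;
  int exponent = 0, i;

  if( arg[0] < '0' || arg[0] > '9' ) return false;	/* no sign, no blanks */
  errno = 0;
  value = strtoull( arg, &tail, 10 );
  if( errno != 0 ) return false;
  if( *tail != 0 && *tail != 'B' )
    {
    const bool binary = ( tail[1] == 'i' );
    const char ch = *tail;
    /* the table spells kilo as 'k' and kibi as 'Ki' */
    const char key = ( binary && ch == 'K' ) ? 'k' :
                     ( binary && ch == 'k' ) ? 0 : ch;
    const char * const p = key ? strchr( prefixes, key ) : NULL;
    if( !p ) return false;
    exponent = (int)( p - prefixes ) + 1;
    if( binary ) factor = 1024;
    tail += binary ? 2 : 1;
    }
  if( *tail == 'B' ) ++tail;			/* optional 'B' for "byte" */
  if( *tail != 0 ) return false;
  for( i = 0; i < exponent; ++i )
    {
    if( value > UINT64_MAX / factor ) return false;	/* overflow */
    value *= factor;
    }
  *result = value;
  return true;
  }
```

With this sketch, "4Ki" yields 4096 and "1MiB" yields 1048576, while "1Z" is rejected because 10^21 does not fit in 64 bits.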
Exit status: 0 for a normal exit, 1 for environmental problems (file not | Exit status: 0 for a normal exit, 1 for environmental problems (file not | |||
found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid | found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid | |||
input file, 3 for an internal consistency error (eg, bug) which caused | input file, 3 for an internal consistency error (e.g., bug) which caused | |||
minilzip to panic. | minilzip to panic. | |||
File: lzlib.info, Node: Data format, Next: Examples, Prev: Invoking minilzip, Up: Top | File: lzlib.info, Node: Data format, Next: Examples, Prev: Invoking minilzip, Up: Top | |||
10 Data format | 10 Data format | |||
************** | ************** | |||
Perfection is reached, not when there is no longer anything to add, but | Perfection is reached, not when there is no longer anything to add, but | |||
when there is no longer anything to take away. | when there is no longer anything to take away. | |||
-- Antoine de Saint-Exupery | -- Antoine de Saint-Exupery | |||
skipping to change at line 866 | skipping to change at line 866 | |||
+---+ | +---+ | |||
represents one byte; a box like this: | represents one byte; a box like this: | |||
+==============+ | +==============+ | |||
| | | | | | |||
+==============+ | +==============+ | |||
represents a variable number of bytes. | represents a variable number of bytes. | |||
A lzip data stream consists of a series of "members" (compressed data | Lzip data consist of a series of independent "members" (compressed data | |||
sets). The members simply appear one after another in the data stream, with | sets). The members simply appear one after another in the data stream, with | |||
no additional information before, between, or after them. | no additional information before, between, or after them. Each member can | |||
encode in compressed form up to 16 EiB - 1 byte of uncompressed data. The | ||||
size of a multimember data stream is unlimited. | ||||
Each member has the following structure: | Each member has the following structure: | |||
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size | | | ID string | VN | DS | LZMA stream | CRC32 | Data size | Member size | | |||
+--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +--+--+--+--+----+----+=============+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
All multibyte values are stored in little endian order. | All multibyte values are stored in little endian order. | |||
'ID string (the "magic" bytes)' | 'ID string (the "magic" bytes)' | |||
skipping to change at line 896 | skipping to change at line 898 | |||
The dictionary size is calculated by taking a power of 2 (the base | The dictionary size is calculated by taking a power of 2 (the base | |||
size) and subtracting from it a fraction between 0/16 and 7/16 of the | size) and subtracting from it a fraction between 0/16 and 7/16 of the | |||
base size. | base size. | |||
Bits 4-0 contain the base 2 logarithm of the base size (12 to 29). | Bits 4-0 contain the base 2 logarithm of the base size (12 to 29). | |||
Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract | Bits 7-5 contain the numerator of the fraction (0 to 7) to subtract | |||
from the base size to obtain the dictionary size. | from the base size to obtain the dictionary size. | |||
Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB | Example: 0xD3 = 2^19 - 6 * 2^15 = 512 KiB - 6 * 32 KiB = 320 KiB | |||
Valid values for dictionary size range from 4 KiB to 512 MiB. | Valid values for dictionary size range from 4 KiB to 512 MiB. | |||
'LZMA stream' | 'LZMA stream' | |||
The LZMA stream, finished by an end of stream marker. Uses default | The LZMA stream, finished by an "End Of Stream" marker. Uses default | |||
values for encoder properties. *Note Stream format: (lzip)Stream | values for encoder properties. *Note Stream format: (lzip)Stream | |||
format, for a complete description. | format, for a complete description. | |||
Lzip only uses the LZMA marker '2' ("End Of Stream" marker). Lzlib | Lzip only uses the LZMA marker '2' ("End Of Stream" marker). Lzlib | |||
also uses the LZMA marker '3' ("Sync Flush" marker). *Note | also uses the LZMA marker '3' ("Sync Flush" marker). *Note | |||
sync_flush::. | sync_flush::. | |||
'CRC32 (4 bytes)' | 'CRC32 (4 bytes)' | |||
Cyclic Redundancy Check (CRC) of the uncompressed original data. | Cyclic Redundancy Check (CRC) of the original uncompressed data. | |||
'Data size (8 bytes)' | 'Data size (8 bytes)' | |||
Size of the uncompressed original data. | Size of the original uncompressed data. | |||
'Member size (8 bytes)' | 'Member size (8 bytes)' | |||
Total size of the member, including header and trailer. This field acts | Total size of the member, including header and trailer. This field acts | |||
as a distributed index, allows the verification of stream integrity, | as a distributed index, allows the verification of stream integrity, | |||
and facilitates safe recovery of undamaged members from multimember | and facilitates the safe recovery of undamaged members from | |||
files. | multimember files. Member size should be limited to 2 PiB to prevent | |||
the data size field from overflowing. | ||||
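The field descriptions above can be illustrated with a short C sketch. It is not code from lzlib. The names 'dictionary_size', 'read_trailer', and 'count_members' are hypothetical, and so is the choice to treat a decoded dictionary size below 4 KiB as invalid (one reading of the stated range). The sketch decodes the DS byte, reads the three little endian trailer fields, and uses the member size field as a distributed index to walk a multimember buffer backward from its end. It assumes the buffer holds only whole members with no trailing data, and it does not check headers or CRCs.

```c
#include <stddef.h>
#include <stdint.h>

enum { header_size = 6, trailer_size = 20 };	/* sizes taken from the diagram */

struct Trailer { uint32_t crc; uint64_t data_size; uint64_t member_size; };

/* Read a little endian value of 'bytes' bytes starting at 'p'. */
static uint64_t get_le( const uint8_t * const p, const int bytes )
  {
  uint64_t v = 0;
  int i;
  for( i = bytes - 1; i >= 0; --i ) v = ( v << 8 ) | p[i];
  return v;
  }

/* Dictionary size coded in the DS byte, or 0 if it is out of range.
   (Assumption: results below 4 KiB are treated as invalid.) */
unsigned dictionary_size( const uint8_t ds )
  {
  const int log2_base = ds & 0x1F;		/* bits 4-0 */
  const unsigned numerator = ( ds >> 5 ) & 7;	/* bits 7-5 */
  unsigned base, size;
  if( log2_base < 12 || log2_base > 29 ) return 0;
  base = 1U << log2_base;
  size = base - numerator * ( base / 16 );
  return ( size >= ( 1U << 12 ) ) ? size : 0;
  }

/* Decode the 20-byte trailer starting at 'p'. */
struct Trailer read_trailer( const uint8_t * const p )
  {
  struct Trailer t;
  t.crc = (uint32_t)get_le( p, 4 );
  t.data_size = get_le( p + 4, 8 );
  t.member_size = get_le( p + 12, 8 );
  return t;
  }

/* Count the members in 'buf' by following the member size fields backward
   from the end. Return -1 if a member size is implausible. */
long count_members( const uint8_t * const buf, const size_t size )
  {
  size_t pos = size;
  long members = 0;
  while( pos > 0 )
    {
    struct Trailer t;
    if( pos < header_size + trailer_size ) return -1;
    t = read_trailer( buf + pos - trailer_size );
    if( t.member_size < header_size + trailer_size || t.member_size > pos )
      return -1;
    pos -= (size_t)t.member_size;		/* jump to the end of the previous member */
    ++members;
    }
  return members;
  }
```

For the DS byte 0xD3 from the example above, 'dictionary_size' returns 327680 (320 KiB). A real tool would also verify the ID string and version number of each member found this way before trusting the sizes.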
File: lzlib.info, Node: Examples, Next: Problems, Prev: Data format, Up: Top | File: lzlib.info, Node: Examples, Next: Problems, Prev: Data format, Up: Top | |||
11 A small tutorial with examples | 11 A small tutorial with examples | |||
********************************* | ********************************* | |||
This chapter provides real code examples for the most common uses of the | This chapter provides real code examples for the most common uses of the | |||
library. See these examples in context in the files 'bbexample.c' and | library. See these examples in context in the files 'bbexample.c' and | |||
'ffexample.c' from the source distribution of lzlib. | 'ffexample.c' from the source distribution of lzlib. | |||
skipping to change at line 944 | skipping to change at line 947 | |||
* File compression mm:: File-to-file multimember compression | * File compression mm:: File-to-file multimember compression | |||
* Skipping data errors:: Decompression with automatic resynchronization | * Skipping data errors:: Decompression with automatic resynchronization | |||
File: lzlib.info, Node: Buffer compression, Next: Buffer decompression, Up: Examples | File: lzlib.info, Node: Buffer compression, Next: Buffer decompression, Up: Examples | |||
11.1 Buffer compression | 11.1 Buffer compression | |||
======================= | ======================= | |||
Buffer-to-buffer single-member compression (MEMBER_SIZE > total output). | Buffer-to-buffer single-member compression (MEMBER_SIZE > total output). | |||
/* Compresses 'insize' bytes from 'inbuf' to 'outbuf'. | /* Compress 'insize' bytes from 'inbuf' to 'outbuf'. | |||
Returns the size of the compressed data in '*outlenp'. | Return the size of the compressed data in '*outlenp'. | |||
In case of error, or if 'outsize' is too small, returns false and does | In case of error, or if 'outsize' is too small, return false and do not | |||
not modify '*outlenp'. | modify '*outlenp'. | |||
*/ | */ | |||
bool bbcompress( const uint8_t * const inbuf, const int insize, | bool bbcompress( const uint8_t * const inbuf, const int insize, | |||
const int dictionary_size, const int match_len_limit, | const int dictionary_size, const int match_len_limit, | |||
uint8_t * const outbuf, const int outsize, | uint8_t * const outbuf, const int outsize, | |||
int * const outlenp ) | int * const outlenp ) | |||
{ | { | |||
int inpos = 0, outpos = 0; | int inpos = 0, outpos = 0; | |||
bool error = false; | bool error = false; | |||
struct LZ_Encoder * const encoder = | struct LZ_Encoder * const encoder = | |||
LZ_compress_open( dictionary_size, match_len_limit, INT64_MAX ); | LZ_compress_open( dictionary_size, match_len_limit, INT64_MAX ); | |||
skipping to change at line 987 | skipping to change at line 990 | |||
return true; | return true; | |||
} | } | |||
File: lzlib.info, Node: Buffer decompression, Next: File compression, Prev: Buffer compression, Up: Examples | File: lzlib.info, Node: Buffer decompression, Next: File compression, Prev: Buffer compression, Up: Examples | |||
11.2 Buffer decompression | 11.2 Buffer decompression | |||
========================= | ========================= | |||
Buffer-to-buffer decompression. | Buffer-to-buffer decompression. | |||
/* Decompresses 'insize' bytes from 'inbuf' to 'outbuf'. | /* Decompress 'insize' bytes from 'inbuf' to 'outbuf'. | |||
Returns the size of the decompressed data in '*outlenp'. | Return the size of the decompressed data in '*outlenp'. | |||
In case of error, or if 'outsize' is too small, returns false and does | In case of error, or if 'outsize' is too small, return false and do not | |||
not modify '*outlenp'. | modify '*outlenp'. | |||
*/ | */ | |||
bool bbdecompress( const uint8_t * const inbuf, const int insize, | bool bbdecompress( const uint8_t * const inbuf, const int insize, | |||
uint8_t * const outbuf, const int outsize, | uint8_t * const outbuf, const int outsize, | |||
int * const outlenp ) | int * const outlenp ) | |||
{ | { | |||
int inpos = 0, outpos = 0; | int inpos = 0, outpos = 0; | |||
bool error = false; | bool error = false; | |||
struct LZ_Decoder * const decoder = LZ_decompress_open(); | struct LZ_Decoder * const decoder = LZ_decompress_open(); | |||
if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok ) | if( !decoder || LZ_decompress_errno( decoder ) != LZ_ok ) | |||
{ LZ_decompress_close( decoder ); return false; } | { LZ_decompress_close( decoder ); return false; } | |||
skipping to change at line 1131 | skipping to change at line 1134 | |||
if( LZ_compress_restart_member( encoder, member_size ) < 0 ) break; | if( LZ_compress_restart_member( encoder, member_size ) < 0 ) break; | |||
} | } | |||
} | } | |||
if( LZ_compress_close( encoder ) < 0 ) done = false; | if( LZ_compress_close( encoder ) < 0 ) done = false; | |||
return done; | return done; | |||
} | } | |||
Example 2: Multimember compression (user-restarted members). (Call | Example 2: Multimember compression (user-restarted members). (Call | |||
LZ_compress_open with MEMBER_SIZE > largest member). | LZ_compress_open with MEMBER_SIZE > largest member). | |||
/* Compresses 'infile' to 'outfile' as a multimember stream with one member | /* Compress 'infile' to 'outfile' as a multimember stream with one member | |||
for each line of text terminated by a newline character or by EOF. | for each line of text terminated by a newline character or by EOF. | |||
Returns 0 if success, 1 if error. | Return 0 if success, 1 if error. | |||
*/ | */ | |||
int fflfcompress( struct LZ_Encoder * const encoder, | int fflfcompress( struct LZ_Encoder * const encoder, | |||
FILE * const infile, FILE * const outfile ) | FILE * const infile, FILE * const outfile ) | |||
{ | { | |||
enum { buffer_size = 16384 }; | enum { buffer_size = 16384 }; | |||
uint8_t buffer[buffer_size]; | uint8_t buffer[buffer_size]; | |||
while( true ) | while( true ) | |||
{ | { | |||
int len, ret; | int len, ret; | |||
int size = min( buffer_size, LZ_compress_write_size( encoder ) ); | int size = min( buffer_size, LZ_compress_write_size( encoder ) ); | |||
skipping to change at line 1176 | skipping to change at line 1179 | |||
} | } | |||
} | } | |||
return 1; | return 1; | |||
} | } | |||
File: lzlib.info, Node: Skipping data errors, Prev: File compression mm, Up: Examples | File: lzlib.info, Node: Skipping data errors, Prev: File compression mm, Up: Examples | |||
11.6 Skipping data errors | 11.6 Skipping data errors | |||
========================= | ========================= | |||
/* Decompresses 'infile' to 'outfile' with automatic resynchronization to | /* Decompress 'infile' to 'outfile' with automatic resynchronization to | |||
next member in case of data error, including the automatic removal of | next member in case of data error, including the automatic removal of | |||
leading garbage. | leading garbage. | |||
*/ | */ | |||
int ffrsdecompress( struct LZ_Decoder * const decoder, | int ffrsdecompress( struct LZ_Decoder * const decoder, | |||
FILE * const infile, FILE * const outfile ) | FILE * const infile, FILE * const outfile ) | |||
{ | { | |||
enum { buffer_size = 16384 }; | enum { buffer_size = 16384 }; | |||
uint8_t buffer[buffer_size]; | uint8_t buffer[buffer_size]; | |||
while( true ) | while( true ) | |||
{ | { | |||
skipping to change at line 1223 | skipping to change at line 1226 | |||
12 Reporting bugs | 12 Reporting bugs | |||
***************** | ***************** | |||
There are probably bugs in lzlib. There are certainly errors and omissions | There are probably bugs in lzlib. There are certainly errors and omissions | |||
in this manual. If you report them, they will get fixed. If you don't, no | in this manual. If you report them, they will get fixed. If you don't, no | |||
one will ever know about them and they will remain unfixed for all | one will ever know about them and they will remain unfixed for all | |||
eternity, if not longer. | eternity, if not longer. | |||
If you find a bug in lzlib, please send electronic mail to | If you find a bug in lzlib, please send electronic mail to | |||
<lzip-bug@nongnu.org>. Include the version number, which you can find by | <lzip-bug@nongnu.org>. Include the version number, which you can find by | |||
running 'minilzip --version' or in 'LZ_version_string' from 'lzlib.h'. | running 'minilzip --version' and 'minilzip -v --check-lib'. | |||
File: lzlib.info, Node: Concept index, Prev: Problems, Up: Top | File: lzlib.info, Node: Concept index, Prev: Problems, Up: Top | |||
Concept index | Concept index | |||
************* | ************* | |||
* Menu: | * Menu: | |||
* buffer compression: Buffer compression. (line 6) | * buffer compression: Buffer compression. (line 6) | |||
* buffer decompression: Buffer decompression. (line 6) | * buffer decompression: Buffer decompression. (line 6) | |||
End of changes. 34 change blocks. | ||||
97 lines changed or deleted | 100 lines changed or added |