"Fossies" - the Fresh Open Source Software Archive

Member "ziplimit.txt" (2 Jan 2009, 13644 Bytes) of package /windows/misc/unz600dn.zip:


    1 ziplimit.txt
    2 
    3 A1) Hard limits of the Zip archive format (without Zip64 extensions):
    4 
    5    Number of entries in Zip archive:            64 Ki (2^16 - 1 entries)
    6    Compressed size of archive entry:            4 GiByte (2^32 - 1 Bytes)
    7    Uncompressed size of entry:                  4 GiByte (2^32 - 1 Bytes)
    8    Size of single-volume Zip archive:           4 GiByte (2^32 - 1 Bytes)
    9    Per-volume size of multi-volume archives:    4 GiByte (2^32 - 1 Bytes)
   10    Number of parts for multi-volume archives:   64 Ki (2^16 - 1 parts)
   11    Total size of multi-volume archive:          256 TiByte (4G * 64k)
   12 
   13    The number of archive entries and of multivolume parts are limited by
   14    the structure of the "end-of-central-directory" record, where these
   15    numbers are stored in 2-Byte fields.
   16    Some Zip and/or UnZip implementations (for example Info-ZIP's) allow
   17    handling of archives with more than 64k entries.  (The information
   18    from the "number of entries" field in the "end-of-central-directory"
   19    record is not strictly necessary to retrieve the contents of a Zip archive;
   20    it should rather be used for consistency checks.)
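
The A1 limits follow from the widths of the fields that store each value. A minimal sketch of that arithmetic, assuming unsigned fields of 2 and 4 bytes as described above (the helper name `max_unsigned` is illustrative and not taken from any Zip implementation):

```python
def max_unsigned(field_bytes):
    """Largest value an unsigned field of the given width can hold."""
    return (1 << (8 * field_bytes)) - 1

MAX_ENTRIES = max_unsigned(2)      # 2-byte count fields
MAX_ENTRY_SIZE = max_unsigned(4)   # 4-byte size fields
# Rounded total used in the table above: 4 Gi * 64 Ki = 2^48 bytes
APPROX_MULTI_VOLUME_TOTAL = (1 << 32) * (1 << 16)

print(MAX_ENTRIES)                                        # 65535
print(MAX_ENTRY_SIZE)                                     # 4294967295
print(APPROX_MULTI_VOLUME_TOTAL // (1 << 40), "TiByte")   # 256 TiByte
```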
   21 
   22    Length of an archive entry name:             64 KiByte (2^16 - 1)
   23    Length of archive member comment:            64 KiByte (2^16 - 1)
   24    Total length of "extra field":               64 KiByte (2^16 - 1)
   25    Length of a single e.f. block:               64 KiByte (2^16 - 1)
   26    Length of archive comment:                   64 KiByte (2^16 - 1)
   27 
   28    Additional limitation claimed by PKWARE:
   29      Size of local-header structure (fixed fields of 30 Bytes + filename +
   30       local extra field):                     < 64 KiByte
   31      Size of central-directory structure (46 Bytes + filename +
   32       central extra field + member comment):  < 64 KiByte
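
The two PKWARE structure limits can be written as simple size checks. This is a sketch only: the function names are hypothetical, and "< 64 KiByte" is read here as strictly less than 65536 bytes.

```python
LOCAL_FIXED = 30       # fixed fields of a local header, in bytes
CENTRAL_FIXED = 46     # fixed fields of a central-directory entry, in bytes
LIMIT = 64 * 1024      # 64 KiByte

def local_header_ok(name_len, extra_len):
    """True if the local-header structure stays below 64 KiByte."""
    return LOCAL_FIXED + name_len + extra_len < LIMIT

def central_entry_ok(name_len, extra_len, comment_len):
    """True if the central-directory structure stays below 64 KiByte."""
    return CENTRAL_FIXED + name_len + extra_len + comment_len < LIMIT

print(local_header_ok(12, 9))            # True
print(central_entry_ok(100, 100, 65535)) # False
```

Note that a name or extra field at its individual maximum of 2^16 - 1 bytes already fails the combined check, because the fixed fields are added on top.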
   33 
   34 A2) Hard limits of the Zip archive format with Zip64 extensions:
   35    In 2001, PKWARE published version 4.5 of the Zip format specification
   36    (together with the release of PKZIP for Windows 4.5).  This specification
   37    defines new extra field blocks that make it possible to exceed the size
   38    limits of the standard zipfile structures.  This extended "Zip64" format
   39    raises the theoretical limits to the following values:
   40 
   41    Number of entries in Zip archive:            16 Ei (2^64 - 1 entries)
   42    Compressed size of archive entry:            16 EiByte (2^64 - 1 Bytes)
   43    Uncompressed size of entry:                  16 EiByte (2^64 - 1 Bytes)
   44    Size of single-volume Zip archive:           16 EiByte (2^64 - 1 Bytes)
   45    Per-volume size of multi-volume archives:    16 EiByte (2^64 - 1 Bytes)
   46    Number of parts for multi-volume archives:   4 Gi (2^32 - 1 parts)
   47    Total size of multi-volume archive:          2^96 Byte (16 Ei * 4Gi)
   48 
   49    The Info-ZIP software releases (beginning with Zip 3.0 and UnZip 6.0)
   50    support Zip64 archives in selected environments (where the underlying
   51    operating system capabilities are sufficient, e.g. Unix, VMS and Win32).
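
As a rough illustration of when the Zip64 extensions become necessary, the following sketch compares an archive's figures against the A1 limits. The function `needs_zip64` and its parameters are hypothetical. The sketch treats the A1 values as inclusive maxima; a real writer may have to switch one step earlier if it reserves the all-ones field value as a marker, a detail that is ignored here.

```python
LIMIT_16 = (1 << 16) - 1   # entry count limit without Zip64
LIMIT_32 = (1 << 32) - 1   # size limit without Zip64

def needs_zip64(entry_count, entry_sizes, archive_size):
    """entry_sizes: iterable of (compressed, uncompressed) byte counts."""
    if entry_count > LIMIT_16 or archive_size > LIMIT_32:
        return True
    return any(c > LIMIT_32 or u > LIMIT_32 for c, u in entry_sizes)

print(needs_zip64(10, [(100, 200)], 1000))           # False
print(needs_zip64(1, [(10, 5 * (1 << 30))], 1000))   # True
```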
   52 
   53 B) Implementation limits of UnZip:
   54 
   55  1. Size limits caused by file I/O and decompression handling:
   56    a) Without "Zip64" and "LargeFile" extensions:
   57     Size of Zip archive:                2 GiByte (2^31 - 1 Bytes)
   58     Compressed size of archive entry:   2 GiByte (2^31 - 1 Bytes)
   59 
   60    b) With "Zip64" enabled and "LargeFile" supported:
   61     Size of Zip archive:                8 EiByte (2^63 - 1 Bytes)
   62     Compressed size of archive entry:   8 EiByte (2^63 - 1 Bytes)
   63     Uncompressed size of entry:         8 EiByte (2^63 - 1 Bytes)
   64 
   65    Note: On some systems, even UnZip without "LargeFile" extensions enabled
   66          may support archive sizes up to 4 GiByte.  To get this support, the
   67          target environment has to meet the following requirements:
   68          a) The compiler's intrinsic "long" data types must be able to hold
   69             integer values of at least 2^32. In other words, the standard
   70             intrinsic integer types "long" and "unsigned long" have to be
   71             wider than 32 bits.
   72          b) The system has to supply a C runtime library that is compatible
   73             with the more-than-32-bit-wide "long int" type of condition a)
   74          c) The standard file positioning functions fseek(), ftell() (and/or
   75             the Unix style lseek() and tell() functions) have to be able
   76             to move to absolute file offsets of up to 4 GiByte from the file
   77             start.
   78          On 32-bit CPU hardware, you generally cannot expect a C compiler
   79          to provide a "long int" type that is wider than 32 bits. So, many of
   80          the most popular systems (i386, PowerPC, 680x0, et al.) are out of luck.
   81          You may find environments that meet all of these requirements on systems
   82          with 64-bit CPU hardware. Examples might be Cray number crunchers,
   83          Compaq (formerly DEC) Alpha AXP machines, or Intel/AMD x64 computers.
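
Condition a) of the note can be probed on a given machine. The sketch below uses Python's standard `ctypes` module to ask how wide the platform C compiler's `long` is. It checks condition a) only and says nothing about the runtime library or the file positioning functions; the function name is illustrative.

```python
import ctypes

def long_is_wider_than_32_bits():
    """True if the platform's C 'long' has more than 32 bits."""
    return ctypes.sizeof(ctypes.c_long) * 8 > 32

print(ctypes.sizeof(ctypes.c_long) * 8, "bits")
print(long_is_wider_than_32_bits())
```

On platforms that keep `long` at 32 bits even on 64-bit hardware (64-bit Windows is one such case), this returns False, so 64-bit CPU hardware alone does not satisfy the condition.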
   84 
   85    The number of Zip archive entries is unlimited. The "number-of-entries"
   86    field of the "end-of-central-dir" record is checked against the "number
   87    of entries found in the central directory" modulo 64k (2^16) (without
   88    Zip64 extension) or modulo 2^64 (with Zip64 extensions enabled for
   89    Zip64 archives).
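
The consistency check described above can be sketched as follows. The function and parameter names are hypothetical; only the modulo comparison itself comes from the text.

```python
def entry_count_consistent(eocd_count, entries_found, zip64=False):
    """Compare the stored count with the number of entries actually found,
    reduced modulo the range of the count field."""
    modulus = (1 << 64) if zip64 else (1 << 16)
    return entries_found % modulus == eocd_count

# An archive with 70000 entries stores 70000 mod 65536 = 4464 in the 2-byte field.
print(entry_count_consistent(4464, 70000))               # True
print(entry_count_consistent(4464, 70001))               # False
print(entry_count_consistent(70000, 70000, zip64=True))  # True
```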
   90 
   91    Multi-volume archive extraction is not (yet) supported.
   92 
   93    Memory requirements are mostly independent of the archive size
   94    and archive contents.
   95    In general, UnZip needs a fixed amount of internal buffer space
   96    plus the size to hold the complete information of the currently
   97    processed entry's local header. Here, a large extra field
   98    (which can be up to 64 KiByte) may exceed the available memory
   99    for MSDOS 16-bit executables (when they were compiled in small
  100    or medium memory model, with a fixed 64 KiByte limit on data space).
  101 
  102    The other exception where memory requirements scale with "larger"
  103    archives is the "restore directory attributes" feature. Here, the
  104    directory attributes info for each restored directory has to be held
  105    in memory until the whole archive has been processed. So, the amount
  106    of memory needed to keep this info scales with the number of restored
  107    directories and may cause memory problems when a lot of directories
  108    are restored in a single run.
  109 
  110 C) Implementation limits of the Zip executables:
  111 
  112  1. Size limits caused by file I/O and compression handling:
  113    a) Without "Zip64" and "LargeFile" extensions:
  114     Size of Zip archive:                2 GiByte (2^31 - 1 Bytes)
  115     Compressed size of archive entry:   2 GiByte (2^31 - 1 Bytes)
  116     Uncompressed size of entry:         2 GiByte (2^31 - 1 Bytes),
  117                                         (could/should be 4 GiBytes...)
  118 
  119    b) With "Zip64" enabled and "LargeFile" supported:
  120     Size of Zip archive:                8 EiByte (2^63 - 1 Bytes)
  121     Compressed size of archive entry:   8 EiByte (2^63 - 1 Bytes)
  122     Uncompressed size of entry:         8 EiByte (2^63 - 1 Bytes)
  123 
  124    Multi-volume archive creation is now supported in the form of split
  125    archives.  Currently up to 99,999 splits are supported.
  126 
  127  2. Limits caused by handling of archive contents lists
  128 
  129  2.1. Number of archive entries (freshen, update, delete)
  130      a) 16-bit executable:              64k (2^16 - 1) or 32k (2^15 - 1),
  131                                         (unsigned vs. signed type of size_t)
  132      a1) 16-bit executable:             <16k ((2^16)/4)
  133          (The smaller limit a1) results from the array size limit of
  134          the "qsort()" function.)
  135 
  136          32-bit executable:             <1G ((2^32)/4)
  137          (usual system limit of the "qsort()" function on 32-bit systems)
  138 
  139          64-bit executable:             <2Ei ((2^64)/8)
  140          (theoretical limit of 64-bit flat memory model, the actual limit of
  141          currently available OS implementations is several orders of magnitude
  142          lower)
  143 
  144      b) stack space needed by qsort to sort list of archive entries
  145 
  146      NOTE: In the current executables, overflows of limits a) and b) are NOT
  147            checked!
  148 
  149      c) amount of free memory to hold "central directory information" of
  150         all archive entries; one entry needs:
  151         128 bytes (Zip64), 96 bytes (32-bit), or 80 bytes (16-bit)
  152         + 3 * length of entry name
  153         + length of zip entry comment (when present)
  154         + length of extra field(s) (when present, e.g.: UT needs 9 bytes)
  155         + some bytes for book-keeping of memory allocation
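
The qsort-related limits listed under a1) and for the 32-bit and 64-bit executables are all of the form "address range divided by a small constant". One way to read them, offered here as an assumption rather than as a statement about the Zip source, is that the divisor is the size in bytes of one element of the sorted array:

```python
def max_sortable_entries(address_bits, element_bytes):
    """Upper bound on array elements that fit in the given address range.
    Reading the divisor as an element size is an assumption of this sketch."""
    return (1 << address_bits) // element_bytes

print(max_sortable_entries(16, 4))   # 16384
print(max_sortable_entries(32, 4))   # 1073741824
print(max_sortable_entries(64, 8))   # 2305843009213693952
```

These match the "<16k", "<1G" and "<2Ei" figures given above.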
  156 
  157    Conclusion:
  158      For systems with limited memory space (MSDOS, small AMIGAs, other
  159      environments without virtual memory), the number of archive entries
  160      is most often limited by condition c).
  161      For example, with approx. 100 kBytes of free memory after loading and
  162      initializing the program, a 16-bit DOS Zip cannot process more than 600
  163      to 1000 (+) archive entries.  (For the 16-bit Windows DLL or the 16-bit
  164      OS/2 port, limit c) is less important because Windows or OS/2 executables
  165      are not restricted to the 1024k area of real mode memory.  These 16-bit
  166      ports are limited by conditions a1) and b), say: at maximum approx.
  167      16000 entries!)
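
The estimate in the conclusion can be reproduced from the per-entry formula under condition c). The sketch below is illustrative only: the function names are hypothetical, and the value used for "some bytes for book-keeping" (8) is an assumption, since the text gives no figure.

```python
# Base size of one central-directory record in memory, per build type
BASE_BYTES = {"zip64": 128, "32-bit": 96, "16-bit": 80}

def bytes_per_entry(build, name_len, comment_len=0, extra_len=0, bookkeeping=8):
    """Memory for one entry; 'bookkeeping' is an assumed allocator overhead."""
    return BASE_BYTES[build] + 3 * name_len + comment_len + extra_len + bookkeeping

def max_entries(free_bytes, build, name_len, **kwargs):
    """How many entries of this shape fit into free_bytes."""
    return free_bytes // bytes_per_entry(build, name_len, **kwargs)

# 16-bit DOS build, 100 kBytes free, 12-character names, one UT extra field (9 bytes)
print(bytes_per_entry("16-bit", 12, extra_len=9))          # 133
print(max_entries(100 * 1024, "16-bit", 12, extra_len=9))  # 769
```

With longer names the count drops quickly (25-character names give 595 under the same assumptions), which is consistent with the range of roughly 600 to 1000 entries quoted above.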
  168 
  169 
  170  2.2. Number of "new" entries (add operation)
  171      In addition to the restrictions above (2.1.), the following limits
  172      caused by the handling of the "new files" list apply:
  173 
  174      a) 16-bit executable:              <16k ((2^16)/4)
  175 
  176      b) stack size required for "qsort" operation on "new entries" list.
  177 
  178      NOTE: In the current executables, the overflow checks for these limits
  179            are missing!
  180 
  181      c) amount of free memory to hold the directory info list for new entries;
  182         one entry needs:
  183         32 bytes (Zip64), 24 bytes (32-bit), or 22 bytes (16-bit)
  184         + 3 * length of filename
  185 
  186      NOTE: For larger systems, the practical limits are more likely to be
  187      set by performance (how long you are willing to wait) than by available
  188      memory and other resources.
  189 
  190 D) Some technical remarks:
  191 
  192  1. For executables without support for "Zip64" archives and "LargeFile"
  193     I/O extensions, the 2GiByte size limit on archive files is a consequence
  194     of the portable C implementation used for the Info-ZIP programs.
  195     Zip archive processing requires random access to the archive file for
  196     jumping between different parts of the archive's structure.
  197     In C, this is done via the stdio functions fseek()/ftell() or the
  198     Unix-style I/O functions lseek()/tell().  In many (most?) C implementations,
  199     these functions use "signed long" variables to hold offset pointers
  200     into sequential files.  In most cases, this is a signed 32-bit number,
  201     which is limited to ca. 2E+09.  There may be specific C runtime library
  202     implementations that interpret the offset numbers as unsigned, but for
  203     us, this is not reliable in the context of portable programming.
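
To make the 2 GiByte figure concrete, the sketch below tests whether a file offset fits into a signed 32-bit integer, which is what a 32-bit "long" offset amounts to. It uses Python's standard `struct` module to emulate the fixed-width type; the function name is illustrative.

```python
import struct

def fits_signed_32(offset):
    """True if offset can be stored in a signed 32-bit integer."""
    try:
        struct.pack("<l", offset)   # "<l" is a 4-byte signed integer
        return True
    except struct.error:
        return False

print(fits_signed_32(2**31 - 1))   # True
print(fits_signed_32(2**31))       # False
```

The largest representable offset is therefore 2^31 - 1, about 2.1E+09.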
  204 
  205  2. Similarly, for executables without "Zip64" and "LargeFile" support,
  206     the 2GiByte limit on the size of a single compressed archive member
  207     is again a consequence of the implementation in C.
  208     The variables used internally to count the size of the compressed
  209     data stream are of type "long", which is guaranteed to be at least
  210     32 bits wide on all supported environments.
  211 
  212     But, why do we use "signed" long and not "unsigned long"?
  213 
  214     Throughout the I/O handling of the compressed data stream, the sign bit
  215     of the "long" numbers is (mis-)used as a kind of overflow detection.
  216     In the end, this is caused by the fact that standard C lacks any
  217     overflow checking on integer arithmetic and does not support access
  218     to the underlying hardware's overflow detection (the status bits,
  219     especially "carry" and "overflow" of the CPU's flags-register) in a
  220     system-independent manner.
  221 
  222     So, we "misuse" the most-significant bit of the compressed data size
  223     counters as carry bit for efficient overflow/underflow detection.  We
  224     could change the code to a different method of overflow detection, by
  225     using a bunch of "sanity" comparisons (kind of "is the calculated result
  226     plausible when compared with the operands"). But, this would "blow up"
  227     the code of the "inner loop", with a noticeable loss of processing speed.
  228     Or, we could reduce the amount of consistency checks of the compressed
  229     data (e.g. detection of premature end of stream) to an absolute minimum,
  230     at the cost of the programs' stability when processing corrupted data.
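
The sign-bit technique can be illustrated with a simulated 32-bit counter. This is a sketch of the idea only, not the Info-ZIP code: Python integers do not overflow, so the wraparound is emulated by hand, and it models the behaviour of two's-complement hardware rather than anything the C standard promises for signed overflow. All names are hypothetical.

```python
def wrap_int32(value):
    """Reduce value to the range of a 32-bit two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def add_count(counter, nbytes):
    """Add nbytes to a simulated signed 32-bit counter.
    Returns (new_counter, flagged); a set sign bit marks overflow or underflow."""
    result = wrap_int32(counter + nbytes)
    return result, result < 0

print(add_count(2**31 - 10, 5))    # (2147483643, False)
print(add_count(2**31 - 10, 20))   # (-2147483638, True)
print(add_count(5, -10))           # (-5, True)
```

A single "is the result negative" test catches both running past 2^31 - 1 and consuming more bytes than remain, which is why the check is cheap enough for the inner loop.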
  231 
  232  3. The argument above is somewhat outdated. Beginning with the
  233     releases of Zip 3 and UnZip 6, Info-ZIP programs support archive
  234     sizes larger than 4GiB on systems where the required underlying
  235     support for 64-bit file offsets and file sizes is available from
  236     the OS (and the C runtime environment).
  237 
  238     For executables with support for "Zip64" archive format and "LargeFile"
  239     extension, the I/O limits are lifted by applying extended 64-bit off_t
  240     file offsets.  All limits discussed above are then based on integer
  241     sizes of 64 bits instead of 32.  This should allow handling of file and
  242     archive sizes up to the limits of manufacturable hardware for the
  243     foreseeable future.  The reduction of the theoretical limits from
  244     (2^64 - 1) to (2^63 - 1) caused by the pervasive use of signed
  245     numbers is negligible for currently imaginable hardware.
  246 
  247     However, this new support partially breaks compatibility with older
  248     "legacy" systems.  It should also be noted that the portability and
  249     readability of the UnZip and Zip code have suffered somewhat from
  250     the extensive use of non-standard language extensions needed for
  251     64-bit support on the major target systems.
  252 
  253 Please report any problems to:  Zip-Bugs at www.info-zip.org
  254 
  255 Last updated:  25 May 2008, Ed Gordon
  256                02 January 2009, Christian Spieler