"Fossies" - the Fresh Open Source Software Archive

Member "xorriso-1.5.4/doc/checksums.txt" (30 Jan 2021, 14251 Bytes) of package /linux/misc/xorriso-1.5.4.pl02.tar.gz:



                    Description of libisofs MD5 checksumming

               by Thomas Schmitt    - mailto:scdbackup@gmx.net
               Libburnia project    - mailto:libburn-hackers@pykix.org
                                 26 Aug 2009


MD5 is a 128 bit message digest with a very low probability of being the same
for any pair of differing data files. It is described in RFC 1321 and can be
computed e.g. by the program md5sum.

libisofs can equip its images with MD5 checksums for superblock, directory
tree, the whole session, and for each single data file.
See libisofs.h, iso_write_opts_set_record_md5().

The data file checksums get loaded together with the directory tree if this
is enabled by iso_read_opts_set_no_md5(). Loaded checksums can be inquired by
iso_image_get_session_md5() and iso_file_get_md5().

Stream recognizable checksum tags occupy exactly one block each. They can
be detected by submitting a block to iso_util_decode_md5_tag().

libisofs has its own MD5 computation functions:
iso_md5_start(), iso_md5_compute(), iso_md5_clone(), iso_md5_end(),
iso_md5_match()


                          Representation in the Image

There may be several stream recognizable checksum tags and a compact array
of MD5 items at the end of the session. The latter allows many file checksums
to be loaded quickly from media with slow random access.


                              The Checksum Array

Location and layout of the checksum array is recorded as AAIP attribute
"isofs.ca" of the root node.
See doc/susp_aaip_2_0.txt for a general description of AAIP and
doc/susp_aaip_isofs_names.txt for the layout of "isofs.ca".

The single data files hold an index to their MD5 checksum in individual AAIP
attributes "isofs.cx". Index I means: array base address + 16 * I.

If there are N checksummed data files then the array consists of N + 2 entries
with 16 bytes each.

Entry number 0 holds a session checksum which covers the range from the session
start block up to (but not including) the start block of the checksum area.
This range is described by attribute "isofs.ca" of the root node.

Entries 1 to N hold the checksums of individual data files.

Entry number N + 1 holds the MD5 checksum of entries 0 to N.
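
As a hedged illustration of this layout, the following Python sketch builds
such an array in memory and addresses its entries by index. It is not libisofs
code. The names build_checksum_array(), entry() and array_is_consistent() are
made up for this example, the array is treated as a plain byte string whose
base address is offset 0, and the demo checksums are arbitrary.

```python
import hashlib

MD5_LEN = 16  # each array entry is one 16-byte MD5


def build_checksum_array(session_md5, file_md5s):
    """Entry 0 = session, entries 1..N = files, entry N+1 = MD5 of 0..N."""
    entries = [session_md5] + list(file_md5s)
    if any(len(e) != MD5_LEN for e in entries):
        raise ValueError("every entry must be 16 bytes long")
    body = b"".join(entries)
    return body + hashlib.md5(body).digest()


def entry(array, index):
    """Fetch entry `index`, i.e. the 16 bytes at base + 16 * index."""
    start = MD5_LEN * index
    return array[start:start + MD5_LEN]


def array_is_consistent(array):
    """Check that the last entry is the MD5 of all entries before it."""
    if len(array) < 2 * MD5_LEN or len(array) % MD5_LEN:
        return False
    return hashlib.md5(array[:-MD5_LEN]).digest() == array[-MD5_LEN:]


# Hypothetical demo data: one session checksum and N = 3 file checksums.
session = hashlib.md5(b"session payload").digest()
files = [hashlib.md5(name).digest() for name in (b"a", b"b", b"c")]
array = build_checksum_array(session, files)
print(len(array) // MD5_LEN, "entries, consistent:", array_is_consistent(array))
```

With N = 3 files the array holds N + 2 = 5 entries. A reader can check entry
N + 1 first and then trust the file checksums looked up via "isofs.cx".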


                             The Checksum Tags

Because the inquiry of AAIP attributes demands loading of the image tree,
there are also checksum tags which can be detected on the fly when reading
and checksumming the session from its start point as learned from a media
table of contents.

The superblock checksum tag is written after the ECMA-119 volume descriptors.
The tree checksum tag is written after the ECMA-119 directory entries.
The session checksum tag is written after all payload including the checksum
array. (Padding may follow it.)

The tags are single lines of printable text at the very beginning of a block
of 2048 bytes. They have the following format:

 Tag_id pos=# range_start=# range_size=# [session_start=#|next=#] md5=# self=#\n

Tag_id distinguishes the following tag types:
  "libisofs_rlsb32_checksum_tag_v1"     Relocated 64 kB superblock tag
  "libisofs_sb_checksum_tag_v1"         Superblock tag
  "libisofs_tree_checksum_tag_v1"       Directory tree tag
  "libisofs_checksum_tag_v1"            Session tag

A relocated superblock may appear at LBA 0 of an image which was produced for
being stored in a disk file or on overwritable media (e.g. DVD+RW, BD-RE).
Typically there is a first session recorded with a superblock at LBA 32 and
the next session may follow shortly after its session tag. (Typically at the
next block address which is divisible by 32.) Normally no session starts after
the address given by parameter session_start=.

Session oriented media like CD-R[W], DVD+R, BD-R will have no relocated
superblock but rather bear a table of contents on media level (to be inquired
by MMC commands).


Example:
A relocated superblock which points to the last session. Then the first session
which starts at Logical Block Address 32. The following sessions have the same
structure as the first one.

LBA 0:
   <... ECMA-119 System Area and Volume Descriptors ...>
LBA 18:
   libisofs_rlsb32_checksum_tag_v1 pos=18 range_start=0 range_size=18 session_start=311936 md5=6fd252d5b1db52b3c5193447081820e4 self=526f7a3c7fefce09754275c6b924b6d9
   <... padding up to LBA 32 ...>
LBA 32:
   <... First Session: ECMA-119 System Area and Volume Descriptors ...>
LBA 50:
   libisofs_sb_checksum_tag_v1 pos=50 range_start=32 range_size=18 md5=17471035f1360a69eedbd1d0c67a6aa2 self=52d602210883eeababfc9cd287e28682
   <... ECMA-119 Directory Entries (the tree of file names) ...>
LBA 334:
   libisofs_tree_checksum_tag_v1 pos=334 range_start=32 range_size=302 md5=41acd50285339be5318decce39834a45 self=fe100c338c8f9a494a5432b5bfe6bf3c
   <... Data file payload and checksum array ...>
LBA 81554:
   libisofs_checksum_tag_v1 pos=81554 range_start=32 range_size=81522 md5=8adb404bdf7f5c0a078873bb129ee5b9 self=57c2c2192822b658240d62cbc88270cb

   <... more sessions ...>

LBA 311936:
   <... Last Session: ECMA-119 System Area and Volume Descriptors ...>
LBA 311954:
   libisofs_sb_checksum_tag_v1 pos=311954 range_start=311936 range_size=18 next=312286 md5=7f1586e02ac962432dc859a4ae166027 self=2c5fce263cd0ca6984699060f6253e62
   <... Last Session: tree, tree checksum tag, data payload, session tag ...>


There are several tag parameters. Addresses are given as decimal numbers, MD5
checksums as strings of 32 hex digits.

  pos=
  The block address where the tag supposes itself to be stored.
  If this does not match the block address where the tag is found then this
  either indicates that the tag is payload of the image or that the image has
  been relocated. (The latter makes the image unusable.)

  range_start=
  The block address where the session is supposed to start. If this does not
  match the session start on media then the volume descriptors of the
  image have been relocated. (This can happen with overwritable media. If
  checksumming started at LBA 0 and finds range_start=32, then one has to
  restart checksumming at LBA 32. See libburn/doc/cookbook.txt
  "ISO 9660 multi-session emulation on overwritable media" for background
  information.)

  range_size=
  The number of blocks beginning at range_start which are covered by the
  checksum of the tag.

  Only with superblock tag and tree tag:
  next=
  The block address where the next tag is supposed to be found. This is
  to avoid the small possibility that a checksum tag with matching position
  is part of a directory entry or data file. The superblock tag is quite
  uniquely placed directly after the ECMA-119 Volume Descriptor Set Terminator
  where no such cleartext is supposed to reside by accident.

  Only with relocated 64 kB superblock tag:
  session_start=
  The start block address (System Area) of the session to which the relocated
  superblock points.

  md5=
  The checksum payload of the tag as lower case hex digits.

  self=
  The MD5 checksum of the tag itself up to and including the last hex digit of
  parameter "md5=".

The newline character at the end is mandatory. After that newline there may
follow more lines. Their meaning is not necessarily described in this document.
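
The tag format and the "self=" rule can be illustrated by a small Python
sketch. It is a reading aid under stated assumptions, not the parser of
libisofs (that is iso_util_decode_md5_tag()). The helpers make_tag() and
parse_tag() are hypothetical, the parameters are assumed to be separated by
single blanks as in the example above, and the demo tag is synthetic: its
"md5=" value is merely borrowed from the example and "next=81554" is made up.

```python
import hashlib


def make_tag(tag_id, fields, md5_hex):
    """Compose a synthetic tag line; `fields` is a list of (name, number)."""
    head = tag_id + "".join(f" {k}={v}" for k, v in fields) + f" md5={md5_hex}"
    self_hex = hashlib.md5(head.encode("ascii")).hexdigest()
    return f"{head} self={self_hex}\n".encode("ascii")


def parse_tag(block):
    """Return (tag_id, params) if `block` starts with a valid tag, else None."""
    line, newline, _ = block.partition(b"\n")
    if not newline:                       # the trailing newline is mandatory
        return None
    try:
        words = line.decode("ascii").split(" ")
        params = dict(w.split("=", 1) for w in words[1:])
        # "self" covers the text up to the last hex digit of the md5= value.
        covered = line[:line.index(b" self=")]
        if hashlib.md5(covered).hexdigest() != params["self"]:
            return None
        for key in set(params) - {"md5", "self"}:
            params[key] = int(params[key])    # addresses are decimal numbers
    except (UnicodeDecodeError, ValueError, KeyError):
        return None
    if not words[0].startswith("libisofs_") or "md5" not in params:
        return None
    return words[0], params


# Synthetic demo tag, padded to one 2048-byte block.
tag = make_tag("libisofs_tree_checksum_tag_v1",
               [("pos", 334), ("range_start", 32), ("range_size", 302),
                ("next", 81554)],
               "41acd50285339be5318decce39834a45")
block = tag.ljust(2048, b"\0")
print(parse_tag(block))
```

A caller would still have to compare pos with the address where the block was
actually read, as described for parameter pos= above.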

One such line type is the scdbackup checksum tag, an ancestor of libisofs tags
which is suitable only for single session images which begin at LBA 0. It bears
a checksum record which by its MD5 covers all bytes from LBA 0 up to the
newline character preceding the scdbackup tag. See scdbackup/README appendix
VERIFY for details.

-------------------------------------------------------------------------------

                              Usage at Read Time

                     Checking Before Image Tree Loading

In order to check for a trustworthy loadable image tree, read the first 32
blocks from the session start and look in blocks 16 to 32 for a superblock
checksum tag by
  iso_util_decode_md5_tag(block, &tag_type, &pos,
                          &range_start, &range_size, &next_tag, md5, 0);

If a tag of type 2 (superblock tag) or 4 (relocated superblock tag) appears
and has plausible parameters, then check whether its MD5 matches the MD5 of
the data blocks which were read before.

With tag type 2:

Keep the original MD5 context of the data blocks and clone one for obtaining
the MD5 bytes.
If the MD5s match, then compute the checksum block and all following ones into
the kept MD5 context and go on with reading and computing for the tree checksum
tag. This will be found at block address next_tag, verified and parsed by:
  iso_util_decode_md5_tag(block, &tag_type, &pos,
                          &range_start, &range_size, &next_tag, md5, 3);

Again, if the parameters match the reading state, the MD5 must match the
MD5 computed from the data blocks which were read before.
If so, then the tree is ok and safe to be loaded by iso_image_import().

With tag type 4:

End the MD5 context and start a new context for the session which you will
read next.

Then look for the actual session by starting to read at the address given by
parameter session_start= which is returned by iso_util_decode_md5_tag() as
next_tag. Go on by looking for tag type 2 and follow the procedure above.
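
The procedure for tag type 2 can be tried out on a synthetic session with the
following Python sketch. It only mimics the procedure. A real program would
call iso_util_decode_md5_tag() and the iso_md5_*() functions. Everything here
is an assumption made for the demo: the helper names, the in-memory image
(a dict from block address to block), the regular expression, and the
requirement that the superblock tag carries next=. hashlib's copy() plays the
role of iso_md5_clone().

```python
import hashlib
import re

BLOCK = 2048
SB_ID = b"libisofs_sb_checksum_tag_v1"
TREE_ID = b"libisofs_tree_checksum_tag_v1"
TAG_RE = re.compile(
    rb"(libisofs_\w+) pos=(\d+) range_start=(\d+) range_size=(\d+)"
    rb"(?: next=(\d+))? md5=([0-9a-f]{32}) self=[0-9a-f]{32}\n")


def make_tag(tag_id, pos, start, ctx, next_tag=None):
    """Synthetic tag block for the demo image (stand-in for libisofs output)."""
    text = f"{tag_id.decode()} pos={pos} range_start={start}"
    text += f" range_size={pos - start}"
    if next_tag is not None:
        text += f" next={next_tag}"
    text += f" md5={ctx.copy().hexdigest()}"
    text += f" self={hashlib.md5(text.encode()).hexdigest()}\n"
    return text.encode().ljust(BLOCK, b"\0")


def build_session(start, sb_blocks, tree_blocks):
    """Return {lba: block}: superblock, superblock tag, tree, tree tag."""
    image, ctx, lba = {}, hashlib.md5(), start
    tree_tag_lba = start + sb_blocks + 1 + tree_blocks
    for tag_id, count, nxt in ((SB_ID, sb_blocks, tree_tag_lba),
                               (TREE_ID, tree_blocks, None)):
        for _ in range(count):
            image[lba] = bytes([lba % 251]) * BLOCK   # arbitrary payload
            ctx.update(image[lba])
            lba += 1
        image[lba] = make_tag(tag_id, lba, start, ctx, nxt)
        ctx.update(image[lba])            # tag blocks get checksummed too
        lba += 1
    return image


def tree_is_trustworthy(image, start):
    """Verify the superblock tag, then the tree tag it announces by next=."""
    kept, lba, tree_lba = hashlib.md5(), start, None
    while lba in image:
        block = image[lba]
        m = TAG_RE.match(block)
        plausible = bool(m) and (
            (int(m[2]), int(m[3]), int(m[4])) == (lba, start, lba - start))
        # Clone the kept context to obtain the MD5 of the blocks read so far.
        md5_ok = plausible and kept.copy().hexdigest().encode() == m[6]
        if tree_lba is None:
            if lba - start > 32:
                return False              # no superblock tag where expected
            if plausible and m[1] == SB_ID:
                if not md5_ok or m[5] is None:
                    return False
                tree_lba = int(m[5])
        elif lba == tree_lba:
            return bool(md5_ok and m[1] == TREE_ID)
        kept.update(block)                # the tag block itself is included
        lba += 1
    return False


# Same geometry as the example: session at 32, tags at 50 and 334.
image = build_session(start=32, sb_blocks=18, tree_blocks=283)
print(tree_is_trustworthy(image, 32))
```

The sketch does not verify self= and does not handle tag type 4. For type 4
one would restart the same walk at the address given by session_start=.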


                      Checking the Data Part of the Session

In order to check the trustworthiness of a whole session, continue reading
and checksumming after the tree was verified.

Read and checksum the blocks. When reaching block address next_tag (from the
tree tag) submit this block to

  iso_util_decode_md5_tag(block, &tag_type, &pos,
                          &range_start, &range_size, &next_tag, md5, 1);

If this returns 1, then check whether the returned parameters pos, range_start,
and range_size match the state of block reading, and whether the returned
bytes in parameter md5 match the MD5 computed from the data blocks which were
read before the tag block.


                           Checking All Sessions

If the media is sequentially recordable, obtain a table of contents and check
the first track of each session as prescribed above in Checking Before Image
Tree Loading and in Checking the Data Part of the Session.

With disk files or overwritable media, look for a relocated superblock tag
but do not hop to address next_tag (given by session_start=). Instead look at
LBA 32 for the first session and check it as prescribed above.
After reaching its end, round up the read address to the next multiple of 32
and check whether it is smaller than session_start= from the superblock.
If so, expect another session to start there.
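
The rounding step for disk files and overwritable media amounts to a one-line
computation. The following Python sketch follows the text literally. The
function name next_session_candidate() is made up, and the read address 81555
assumes that reading of the first session ended directly after its session
tag at block 81554 of the example in this document.

```python
def next_session_candidate(read_address, session_start):
    """Round up to a multiple of 32; return it if smaller than session_start."""
    candidate = (read_address + 31) // 32 * 32
    return candidate if candidate < session_start else None


# Example numbers: session tag at block 81554, so reading ended at 81555.
# The relocated superblock tag says session_start=311936.
print(next_session_candidate(81555, 311936))
```

This prints 81568, the address where the second session would be expected.
A result of None means that the rounded address is not smaller than
session_start=.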


                   Checking Single Files in a Loaded Image

An image may consist of many sessions wherein many data blocks may not belong
to files in the directory tree of the most recent session. Checking this
tree and all its data files can ensure that all actually valid data in the
image are trustworthy. This will leave out the trees of the older sessions
and the obsolete data blocks of overwritten or deleted files.

Once the image has been loaded, you can obtain MD5 sums from IsoNode objects
which fulfill
  iso_node_get_type(node) == LIBISO_FILE

The recorded checksum can be obtained by
  iso_file_get_md5(image, (IsoFile *) node, md5, 0);

For accessing the file data in the loaded image use
  iso_file_get_stream((IsoFile *) node);
to get the data stream of the object.
The checksums cover the data content as it was actually written into the ISO
image stream, not necessarily as it was on hard disk before or afterwards.
This implies that content filtered files bear the MD5 of the filtered data
and not of the original files on disk. When checkreading, one has to avoid
any reverse filtering. Dig out the stream which directly reads image data
by calling iso_stream_get_input_stream() until it returns NULL and use
iso_stream_get_size() rather than iso_file_get_size().

Now you may call iso_stream_open(), iso_stream_read(), iso_stream_close()
for reading file content from the loaded image.
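
Why the innermost stream matters can be shown with a toy model in Python.
The class Stream below is a hypothetical stand-in for an IsoStream and has
nothing to do with the real libisofs interface. Its attribute input_stream
mimics what iso_stream_get_input_stream() returns, and byte reversal stands
in for whatever reverse filtering a real filter stream would perform.

```python
import hashlib


class Stream:
    """Hypothetical stand-in for an IsoStream (not the libisofs API)."""

    def __init__(self, data=None, input_stream=None):
        self.input_stream = input_stream   # None: reads image data directly
        self._data = data

    def read_all(self, chunk=2048):
        """Yield content in chunks; a filter stream here reverses its input."""
        if self.input_stream is None:
            for i in range(0, len(self._data), chunk):
                yield self._data[i:i + chunk]
        else:
            yield b"".join(self.input_stream.read_all(chunk))[::-1]


def innermost(stream):
    """Follow input streams until the one which reads image data is reached."""
    while stream.input_stream is not None:
        stream = stream.input_stream
    return stream


def matches_recorded_md5(stream, recorded_md5):
    """Checksum the data as stored in the image, avoiding reverse filtering."""
    ctx = hashlib.md5()
    for piece in innermost(stream).read_all():
        ctx.update(piece)
    return ctx.digest() == recorded_md5


# Demo: the image holds filtered bytes, the node stream would unfilter them.
stored = b"filtered bytes as stored in the image " * 100
recorded = hashlib.md5(stored).digest()
node_stream = Stream(input_stream=Stream(data=stored))
print(matches_recorded_md5(node_stream, recorded))
```

Checksumming the output of node_stream itself would yield a different MD5,
because that output is the unfiltered content and not what was written into
the image.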


                        Session Check in a Loaded Image

iso_image_get_session_md5() gives start LBA and session payload size as of
"isofs.ca" and the session checksum as of the checksum array.

For reading you may use the IsoDataSource object which you submitted
to iso_image_import() when reading the image. If this source is associated
with a libburn drive, then libburn function burn_read_data() can read directly
from it.

-------------------------------------------------------------------------------

                            scdbackup Checksum Tags

The session checksum tag does not occupy its whole block. So there is room to
store a scdbackup stream checksum tag, which is an ancestor format of the tags
described here. This feature allows scdbackup to omit its own checksum filter
if using xorriso as ISO 9660 formatter program.
Such a tag only makes sense if the session begins at LBA 0.

See scdbackup-*/README, appendix VERIFY for a specification.

Example of a scdbackup checksum tag:
scdbackup_checksum_tag_v0.1 2456606865 61 2_2 B00109.143415 2456606865 485bbef110870c45754d7adcc844a72c c2355d5ea3c94d792ff5893dfe0d6d7b

The tag is located at byte position 2456606865 and contains 61 bytes of
scdbackup checksum record (the next four words):
Name of the backup volume is "2_2".
Written in year B0 = 2010 (A9 = 2009, B1 = 2011), January (01), 9th (09),
14:34:15 local time.
The size of the volume is 2456606865 bytes, which have an MD5 sum of
485bbef110870c45754d7adcc844a72c.
The checksum of "2_2 B00109.143415 2456606865 485bbef110870c45754d7adcc844a72c"
is c2355d5ea3c94d792ff5893dfe0d6d7b.
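
The fields of this example can be picked apart with a short Python sketch.
It is not taken from scdbackup. The function names are made up, volume names
are assumed to contain no blanks, and the year rule (letter = decade since
2000, digit = year within the decade) is extrapolated from the three values
A9, B0, B1 given above. The record checksum is not recomputed, because this
text does not state which exact bytes it covers. See the scdbackup README
for the authoritative specification.

```python
from datetime import datetime

EXAMPLE = ("scdbackup_checksum_tag_v0.1 2456606865 61 2_2 B00109.143415 "
           "2456606865 485bbef110870c45754d7adcc844a72c "
           "c2355d5ea3c94d792ff5893dfe0d6d7b")


def decode_timestamp(stamp):
    """Decode e.g. 'B00109.143415' under the extrapolated year rule."""
    year = 2000 + 10 * (ord(stamp[0]) - ord("A")) + int(stamp[1])
    return datetime(year, int(stamp[2:4]), int(stamp[4:6]),
                    int(stamp[7:9]), int(stamp[9:11]), int(stamp[11:13]))


def parse_scdbackup_tag(line):
    """Split a scdbackup_checksum_tag_v0.1 line into named fields."""
    words = line.split()
    if len(words) != 8 or words[0] != "scdbackup_checksum_tag_v0.1":
        raise ValueError("not a scdbackup_checksum_tag_v0.1 line")
    record = " ".join(words[3:7])          # the four words of the record
    if len(record) != int(words[2]):
        raise ValueError("record length does not match the announced size")
    return {
        "tag_position": int(words[1]),
        "record": record,
        "volume_name": words[3],
        "timestamp": decode_timestamp(words[4]),
        "volume_size": int(words[5]),
        "volume_md5": words[6],
        "record_md5": words[7],
    }


tag = parse_scdbackup_tag(EXAMPLE)
print(tag["volume_name"], tag["timestamp"].isoformat(), len(tag["record"]))
```

The length check passes for the example: the four record words joined by
single blanks are 61 characters long, as announced by the third word.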

-------------------------------------------------------------------------------

This text is under
Copyright (c) 2009 - 2010 Thomas Schmitt <scdbackup@gmx.net>
It shall only be modified in sync with libisofs and other software which
makes use of libisofs checksums. Please mail change requests to mailing list
<libburn-hackers@pykix.org> or to the copyright holder in private.
Only if you cannot reach the copyright holder for at least one month it is
permissible to modify this text under the same license as the affected
copy of libisofs.
If you do so, you commit yourself to taking reasonable effort to stay in
sync with the other interested users of this text.