"Fossies" - the Fresh Open Source Software Archive 
Member "libisofs-1.5.4/doc/checksums.txt" (8 Jul 2020, 14251 Bytes) of package /linux/misc/libisofs-1.5.4.tar.gz:

Description of libisofs MD5 checksumming

by Thomas Schmitt - mailto:scdbackup@gmx.net
Libburnia project - mailto:libburn-hackers@pykix.org
26 Aug 2009


MD5 is a 128 bit message digest with a very low probability of being the same
for any pair of differing data files. It is described in RFC 1321 and can be
computed e.g. by the program md5sum.

libisofs can equip its images with MD5 checksums for superblock, directory
tree, the whole session, and for each single data file.
See libisofs.h, iso_write_opts_set_record_md5().

The data file checksums get loaded together with the directory tree if this
is enabled by iso_read_opts_set_no_md5(). Loaded checksums can be inquired by
iso_image_get_session_md5() and iso_file_get_md5().

Stream recognizable checksum tags occupy exactly one block each. They can
be detected by submitting a block to iso_util_decode_md5_tag().

libisofs has its own MD5 computation functions:
  iso_md5_start(), iso_md5_compute(), iso_md5_clone(), iso_md5_end(),
  iso_md5_match()


Representation in the Image

There may be several stream recognizable checksum tags and a compact array
of MD5 items at the end of the session. The latter allows many file checksums
to be loaded quickly from media with slow random access.


The Checksum Array

Location and layout of the checksum array are recorded as AAIP attribute
"isofs.ca" of the root node.
See doc/susp_aaip_2_0.txt for a general description of AAIP and
doc/susp_aaip_isofs_names.txt for the layout of "isofs.ca".

The single data files hold an index to their MD5 checksum in individual AAIP
attributes "isofs.cx". Index I means: array base address + 16 * I.

If there are N checksummed data files then the array consists of N + 2 entries
with 16 bytes each.

Entry number 0 holds a session checksum which covers the range from the session
start block up to (but not including) the start block of the checksum area.
This range is described by attribute "isofs.ca" of the root node.

Entries 1 to N hold the checksums of individual data files.

Entry number N + 1 holds the MD5 checksum of entries 0 to N.
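
The index arithmetic of the array can be sketched in C as follows. This is a
minimal sketch and not libisofs code: the function names are hypothetical, and
the array base address is assumed to be available as a byte address.

```c
#include <stdint.h>

#define MD5_ENTRY_SIZE 16   /* each array entry is one 16 byte MD5 */

/* Hypothetical helper: size in bytes of the array for n_files checksummed
   data files. Entry 0 is the session checksum, entries 1 to N are the file
   checksums, entry N + 1 is the MD5 of entries 0 to N. */
static uint64_t checksum_array_bytes(uint32_t n_files)
{
    return ((uint64_t) n_files + 2) * MD5_ENTRY_SIZE;
}

/* Hypothetical helper: byte address of entry idx.
   Assumption: base is the array base address expressed in bytes. */
static uint64_t checksum_entry_address(uint64_t base, uint32_t idx)
{
    return base + (uint64_t) MD5_ENTRY_SIZE * idx;
}

/* Hypothetical helper: index of the entry which covers entries 0 to N */
static uint32_t checksum_array_self_index(uint32_t n_files)
{
    return n_files + 1;
}
```

With three checksummed files the array has five entries, and the index from
"isofs.cx" of a file is always in the range 1 to 3.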


The Checksum Tags

Because the inquiry of AAIP attributes demands loading of the image tree,
there are also checksum tags which can be detected on the fly when reading
and checksumming the session from its start point as learned from a media
table-of-content.

The superblock checksum tag is written after the ECMA-119 volume descriptors.
The tree checksum tag is written after the ECMA-119 directory entries.
The session checksum tag is written after all payload including the checksum
array. (Padding may follow.)

The tags are single lines of printable text at the very beginning of a block
of 2048 bytes. They have the following format:

  Tag_id pos=# range_start=# range_size=# [session_start|next=#] md5=# self=#\n

Tag_id distinguishes the following tag types:
  "libisofs_rlsb32_checksum_tag_v1"  Relocated 64 kB superblock tag
  "libisofs_sb_checksum_tag_v1"      Superblock tag
  "libisofs_tree_checksum_tag_v1"    Directory tree tag
  "libisofs_checksum_tag_v1"         Session tag
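
A reader which does not use iso_util_decode_md5_tag() could recognize the tag
type from the beginning of a block roughly like this. The function name is
hypothetical. The returned numbers are meant to mirror the tag_type values of
iso_util_decode_md5_tag() (1 = session, 2 = superblock, 3 = tree,
4 = relocated superblock, 0 = no tag). Treat this numbering as an assumption
and verify it against your copy of libisofs.h.

```c
#include <string.h>

/* Hypothetical helper, not part of libisofs.
   Returns the assumed tag type number for the text at the start of a block,
   or 0 if the text does not begin with a known Tag_id. */
static int classify_tag_id(const char *line)
{
    static const struct { const char *id; int type; } ids[] = {
        {"libisofs_checksum_tag_v1",        1},
        {"libisofs_sb_checksum_tag_v1",     2},
        {"libisofs_tree_checksum_tag_v1",   3},
        {"libisofs_rlsb32_checksum_tag_v1", 4}
    };
    size_t i, len;

    for (i = 0; i < sizeof(ids) / sizeof(ids[0]); i++) {
        len = strlen(ids[i].id);
        /* The Tag_id must be followed by a blank to count as a match */
        if (strncmp(line, ids[i].id, len) == 0 && line[len] == ' ')
            return ids[i].type;
    }
    return 0;
}
```

A match of the Tag_id alone proves nothing about the validity of the tag. The
parameters and checksums still have to be verified.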

A relocated superblock may appear at LBA 0 of an image which was produced for
being stored in a disk file or on overwritable media (e.g. DVD+RW, BD-RE).
Typically there is a first session recorded with a superblock at LBA 32 and
the next session may follow shortly after its session tag. (Typically at the
next block address which is divisible by 32.) Normally no session starts after
the address given by parameter session_start=.

Session oriented media like CD-R[W], DVD+R, BD-R will have no relocated
superblock but rather bear a table-of-content on media level (to be inquired
by MMC commands).


Example:
A relocated superblock which points to the last session. Then the first session
which starts at Logical Block Address 32. The following sessions have the same
structure as the first one.

  LBA 0:
  <... ECMA-119 System Area and Volume Descriptors ...>
  LBA 18:
  libisofs_rlsb32_checksum_tag_v1 pos=18 range_start=0 range_size=18 session_start=311936 md5=6fd252d5b1db52b3c5193447081820e4 self=526f7a3c7fefce09754275c6b924b6d9
  <... padding up to LBA 32 ...>
  LBA 32:
  <... First Session: ECMA-119 System Area and Volume Descriptors ...>
  libisofs_sb_checksum_tag_v1 pos=50 range_start=32 range_size=18 md5=17471035f1360a69eedbd1d0c67a6aa2 self=52d602210883eeababfc9cd287e28682
  <... ECMA-119 Directory Entries (the tree of file names) ...>
  LBA 334:
  libisofs_tree_checksum_tag_v1 pos=334 range_start=32 range_size=302 md5=41acd50285339be5318decce39834a45 self=fe100c338c8f9a494a5432b5bfe6bf3c
  <... Data file payload and checksum array ...>
  LBA 81554:
  libisofs_checksum_tag_v1 pos=81554 range_start=32 range_size=81522 md5=8adb404bdf7f5c0a078873bb129ee5b9 self=57c2c2192822b658240d62cbc88270cb

  <... more sessions ...>

  LBA 311936:
  <... Last Session: ECMA-119 System Area and Volume Descriptors ...>
  LBA 311954:
  libisofs_sb_checksum_tag_v1 pos=311954 range_start=311936 range_size=18 next=312286 md5=7f1586e02ac962432dc859a4ae166027 self=2c5fce263cd0ca6984699060f6253e62
  <... Last Session: tree, tree checksum tag, data payload, session tag ...>


There are several tag parameters. Addresses are given as decimal numbers, MD5
checksums as strings of 32 hex digits.

  pos=
    The block address where the tag expects itself to be stored.
    If this does not match the block address where the tag is found then this
    either indicates that the tag is payload of the image or that the image has
    been relocated. (The latter makes the image unusable.)

  range_start=
    The block address where the session is supposed to start. If this does not
    match the session start on media then the volume descriptors of the
    image have been relocated. (This can happen with overwritable media. If
    checksumming started at LBA 0 and finds range_start=32, then one has to
    restart checksumming at LBA 32. See libburn/doc/cookbook.txt
    "ISO 9660 multi-session emulation on overwritable media" for background
    information.)

  range_size=
    The number of blocks beginning at range_start which are covered by the
    checksum of the tag.

  Only with superblock tag and tree tag:
  next=
    The block address where the next tag is supposed to be found. This is
    to avoid the small possibility that a checksum tag with matching position
    is part of a directory entry or data file. The superblock tag is quite
    uniquely placed directly after the ECMA-119 Volume Descriptor Set Terminator
    where no such cleartext is supposed to reside by accident.

  Only with relocated 64 kB superblock tag:
  session_start=
    The start block address (System Area) of the session to which the relocated
    superblock points.

  md5=
    The checksum payload of the tag as lower case hex digits.

  self=
    The MD5 checksum of the tag itself up to and including the last hex digit of
    parameter "md5=".

The newline character at the end is mandatory. More lines may follow that
newline. Their meaning is not necessarily described in this document.
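
To illustrate the parameter syntax, here is a minimal parser sketch in plain C.
It is not the implementation of iso_util_decode_md5_tag(). All names in it
(struct tag_params, find_value(), get_number(), get_hex32(), parse_tag_line())
are hypothetical. It extracts the parameters as text and numbers only and does
not compute or verify any MD5.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed parameters, not a libisofs type */
struct tag_params {
    uint32_t pos, range_start, range_size;
    uint32_t next;         /* value of next= or session_start= */
    int has_next;          /* 1 if one of those two was present */
    char md5_hex[33];      /* 32 hex digits plus terminating 0 */
    char self_hex[33];
    size_t self_covered;   /* count of leading bytes covered by self= */
};

/* Return a pointer to the value behind " name=", or NULL if absent */
static const char *find_value(const char *line, const char *name)
{
    char key[32];
    size_t n = strlen(name);
    const char *p;

    if (n + 3 > sizeof(key))
        return NULL;
    key[0] = ' ';
    memcpy(key + 1, name, n);
    key[n + 1] = '=';
    key[n + 2] = 0;
    p = strstr(line, key);
    return p != NULL ? p + n + 2 : NULL;
}

/* Read a decimal block address which must be followed by a blank */
static int get_number(const char *line, const char *name, uint32_t *value)
{
    const char *p = find_value(line, name);
    char *end;
    unsigned long v;

    if (p == NULL || !isdigit((unsigned char) *p))
        return 0;
    v = strtoul(p, &end, 10);
    if (*end != ' ' || v > 0xffffffffUL)
        return 0;
    *value = (uint32_t) v;
    return 1;
}

/* Copy exactly 32 hex digits. Upper case is tolerated by this sketch. */
static int get_hex32(const char *line, const char *name, char out[33])
{
    const char *p = find_value(line, name);
    int i;

    if (p == NULL)
        return 0;
    for (i = 0; i < 32; i++)
        if (!isxdigit((unsigned char) p[i]))
            return 0;
    memcpy(out, p, 32);
    out[32] = 0;
    return 1;
}

/* Return 1 if line holds a complete tag parameter set, else 0 */
static int parse_tag_line(const char *line, struct tag_params *t)
{
    memset(t, 0, sizeof(*t));
    if (!get_number(line, "pos", &t->pos) ||
        !get_number(line, "range_start", &t->range_start) ||
        !get_number(line, "range_size", &t->range_size))
        return 0;
    t->has_next = get_number(line, "next", &t->next) ||
                  get_number(line, "session_start", &t->next);
    if (!get_hex32(line, "md5", t->md5_hex) ||
        !get_hex32(line, "self", t->self_hex))
        return 0;
    /* self= covers everything up to the last hex digit of md5= */
    t->self_covered = (size_t) (find_value(line, "md5") - line) + 32;
    /* The newline after the last parameter is mandatory */
    return find_value(line, "self")[32] == '\n';
}
```

The member self_covered tells how many bytes from the start of the line would
have to be fed into an MD5 computation in order to check parameter self=.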

One such line type is the scdbackup checksum tag, an ancestor of libisofs tags
which is suitable only for single session images which begin at LBA 0. It bears
a checksum record which by its MD5 covers all bytes from LBA 0 up to the
newline character preceding the scdbackup tag. See scdbackup/README appendix
VERIFY for details.

-------------------------------------------------------------------------------

Usage at Read Time

Checking Before Image Tree Loading

In order to check for a trustworthy loadable image tree, read the first 32
blocks from the session start and look in blocks 16 to 32 for a superblock
checksum tag by
  iso_util_decode_md5_tag(block, &tag_type, &pos,
                          &range_start, &range_size, &next_tag, md5, 0);

If a tag of type 2 (superblock tag) or 4 (relocated superblock tag) appears
and has plausible parameters, then check whether its MD5 matches the MD5 of
the data blocks which were read before.
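
What counts as plausible is not defined in detail by this text. The following
sketch assumes the relations which hold for every tag in the example above:
pos equals the block address where the tag was found, range_start equals the
address where checksumming began, and range_start + range_size equals pos.
The function name and its parameters found_lba and read_start are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, not part of libisofs.
   found_lba  = block address at which the tag block was read
   read_start = block address at which reading and checksumming began
   Returns 1 if the tag parameters agree with the reading state, else 0. */
static int tag_is_plausible(uint32_t found_lba, uint32_t read_start,
                            uint32_t pos, uint32_t range_start,
                            uint32_t range_size)
{
    if (pos != found_lba)
        return 0;   /* tag is payload of the image, or image was relocated */
    if (range_start != read_start)
        return 0;   /* checksumming has to be restarted at range_start */
    /* Assumption taken from the example: the covered range ends directly
       before the tag. 64 bit arithmetic avoids wrap-around of the sum. */
    return (uint64_t) range_start + range_size == (uint64_t) pos;
}
```

For the first session of the example, tag_is_plausible(50, 32, 50, 32, 18)
accepts the superblock tag. The same tag is rejected if checksumming began at
LBA 0, which is the situation where one has to restart at LBA 32.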

With tag type 2:

Keep the original MD5 context of the data blocks and clone one for obtaining
the MD5 bytes.
If the MD5s match, then compute the checksum block and all following ones into
the kept MD5 context and go on with reading and computing for the tree checksum
tag. This will be found at block address next_tag, verified and parsed by:
  iso_util_decode_md5_tag(block, &tag_type, &pos,
                          &range_start, &range_size, &next_tag, md5, 3);

Again, if the parameters match the reading state, the MD5 must match the
MD5 computed from the data blocks which were read before.
If so, then the tree is ok and safe to be loaded by iso_image_import().

With tag type 4:

End the MD5 context and start a new context for the session which you will
read next.

Then look for the actual session by starting to read at the address given by
parameter session_start= which is returned by iso_util_decode_md5_tag() as
next_tag. Go on by looking for tag type 2 and follow the above prescription.


Checking the Data Part of the Session

In order to check the trustworthiness of a whole session, continue reading
and checksumming after the tree was verified.

Read and checksum the blocks. When reaching block address next_tag (from the
tree tag) submit this block to

  iso_util_decode_md5_tag(block, &tag_type, &pos,
                          &range_start, &range_size, &next_tag, md5, 1);

If this returns 1, then check whether the returned parameters pos, range_start,
and range_size match the state of block reading, and whether the returned
bytes in parameter md5 match the MD5 computed from the data blocks which were
read before the tag block.


Checking All Sessions

If the media is sequentially recordable, obtain a table of content and check
the first track of each session as prescribed above in Checking Before Image
Tree Loading and in Checking the Data Part of the Session.

With disk files or overwritable media, look for a relocated superblock tag
but do not hop to address next_tag (given by session_start=). Instead look at
LBA 32 for the first session and check it as prescribed above.
After reaching its end, round up the read address to the next multiple of 32
and check whether it is smaller than session_start= from the superblock.
If so, expect another session to start there.
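
The rounding step can be sketched as follows. The function name is
hypothetical. The sketch assumes that end_lba is the address of the first block
after the checked session, and that an address which already is a multiple of
32 stays as it is.

```c
#include <stdint.h>

/* Hypothetical helper, not part of libisofs.
   Returns the block address where another session may be expected, or 0 if
   the rounded address is not smaller than session_start. */
static uint32_t next_session_candidate(uint32_t end_lba,
                                       uint32_t session_start)
{
    /* Round up to a multiple of 32. 64 bit arithmetic avoids wrap-around. */
    uint64_t rounded = ((uint64_t) end_lba + 31) / 32 * 32;

    if (rounded < session_start)
        return (uint32_t) rounded;
    return 0;
}
```

With the example above, the session tag of the first session sits at LBA 81554.
Reading ends at 81555, so the next candidate address is 81568, which is smaller
than session_start=311936.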


Checking Single Files in a Loaded Image

An image may consist of many sessions wherein many data blocks may not belong
to files in the directory tree of the most recent session. Checking this
tree and all its data files can ensure that all actually valid data in the
image are trustworthy. This will leave out the trees of the older sessions
and the obsolete data blocks of overwritten or deleted files.

Once the image has been loaded, you can obtain MD5 sums from IsoNode objects
which fulfill
  iso_node_get_type(node) == LIBISO_FILE

The recorded checksum can be obtained by
  iso_file_get_md5(image, (IsoFile *) node, md5, 0);

For accessing the file data in the loaded image use
  iso_file_get_stream((IsoFile *) node);
to get the data stream of the object.
The checksums cover the data content as it was actually written into the ISO
image stream, not necessarily as it was on hard disk before or afterwards.
This implies that content filtered files bear the MD5 of the filtered data
and not of the original files on disk. When checkreading, one has to avoid
any reverse filtering. Dig out the stream which directly reads image data
by calling iso_stream_get_input_stream() until it returns NULL and use
iso_stream_get_size() rather than iso_file_get_size().

Now you may call iso_stream_open(), iso_stream_read(), iso_stream_close()
for reading file content from the loaded image.


Session Check in a Loaded Image

iso_image_get_session_md5() gives start LBA and session payload size as of
"isofs.ca" and the session checksum as of the checksum array.

For reading you may use the IsoDataSource object which you submitted
to iso_image_import() when reading the image. If this source is associated
with a libburn drive, then libburn function burn_read_data() can read directly
from it.

-------------------------------------------------------------------------------

scdbackup Checksum Tags

The session checksum tag does not occupy its whole block. So there is room to
store a scdbackup stream checksum tag, which is an ancestor format of the tags
described here. This feature allows scdbackup to omit its own checksum filter
if using xorriso as ISO 9660 formatter program.
Such a tag only makes sense if the session begins at LBA 0.

See scdbackup-*/README, appendix VERIFY for a specification.

Example of a scdbackup checksum tag:
  scdbackup_checksum_tag_v0.1 2456606865 61 2_2 B00109.143415 2456606865 485bbef110870c45754d7adcc844a72c c2355d5ea3c94d792ff5893dfe0d6d7b

The tag is located at byte position 2456606865, contains 61 bytes of scdbackup
checksum record (the next four words):
Name of the backup volume is "2_2".
Written in year B0 = 2010 (A9 = 2009, B1 = 2011), January (01), 9th (09),
14:34:15 local time.
The size of the volume is 2456606865 bytes, which have an MD5 sum of
485bbef110870c45754d7adcc844a72c.
The checksum of "2_2 B00109.143415 2456606865 485bbef110870c45754d7adcc844a72c"
is c2355d5ea3c94d792ff5893dfe0d6d7b.
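
The timestamp word of the record can be decoded as sketched below. The names
struct scd_time and decode_scdbackup_time() are hypothetical. The year rule
(letter 'A' stands for 200x, 'B' for 201x, and so on) is an extrapolation from
the three samples given above, not a statement of the scdbackup specification.
Consult scdbackup-*/README for the authoritative definition.

```c
#include <stdio.h>

/* Hypothetical container for the decoded local time */
struct scd_time { int year, month, day, hour, minute, second; };

/* Hypothetical helper: decodes a word like "B00109.143415".
   Returns 1 on success, 0 if the word does not have the expected form. */
static int decode_scdbackup_time(const char *word, struct scd_time *t)
{
    char decade;
    int year_digit, consumed = 0;

    /* Layout: decade letter, year digit, MMDD, '.', hhmmss */
    if (sscanf(word, "%c%1d%2d%2d.%2d%2d%2d%n", &decade, &year_digit,
               &t->month, &t->day, &t->hour, &t->minute, &t->second,
               &consumed) != 7)
        return 0;
    if (consumed != 13 || decade < 'A' || decade > 'Z')
        return 0;
    /* Assumption: 'A' = 2000s, 'B' = 2010s, one letter per decade */
    t->year = 2000 + 10 * (decade - 'A') + year_digit;
    return 1;
}
```

Applied to the word "B00109.143415" from the example, this yields
2010-01-09 14:34:15.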

-------------------------------------------------------------------------------

This text is under
Copyright (c) 2009 - 2010 Thomas Schmitt <scdbackup@gmx.net>
It shall only be modified in sync with libisofs and other software which
makes use of libisofs checksums. Please mail change requests to mailing list
<libburn-hackers@pykix.org> or to the copyright holder in private.
Only if you cannot reach the copyright holder for at least one month it is
permissible to modify this text under the same license as the affected
copy of libisofs.
If you do so, you commit yourself to taking reasonable effort to stay in
sync with the other interested users of this text.
