T A B L E  o f  C O N T E N T S
---------------------------------

1    Building Adobe XMPsdk and Samples in Terminal with the ./Generate_XXX_mac.sh scripts
1.1  Amazing Discovery 1 DumpFile is linked to libstdc++.6.dylib
1.2  Amazing Discovery 2 Millions of "weak symbol/visibility" messages

4    Build design for v0.26.1
4.8  Support for MinGW

5    Refactoring the Tiff Code
5.1  Background
5.2  How does Exiv2 decode the ExifData in a JPEG?
5.3  How is metadata organized in Exiv2
5.4  Where are the tags defined?
5.5  How do the MakerNotes get decoded?
5.6  How do the encoders work?

6    Using external XMP SDK via Conan

==========================================================================

4 Build design for v0.26.1

Added   : 2017-08-18
Modified: 2017-08-23

The purpose of v0.26.1 is to release bug fixes and
experimental new features which may become defaults with v0.27.

4.8 Support for MinGW
MinGW msys/1.0 was deprecated when v0.26 was released.
No support for MinGW msys/1.0 will be provided.
It's very likely that Exiv2 will build with MinGW msys/1.0.
I will not provide any user support for MinGW msys/1.0 in future.

MinGW msys/2.0 might be supported as "experimental" in Exiv2 v0.26.2


==========================================================================

5 Refactoring the Tiff Code

Added   : 2017-09-24
Modified: 2017-09-24

5.1 Background
Tiff parsing is the root code of a metadata engine.
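
To make "Tiff parsing" concrete, here is a minimal, self-contained sketch of walking the first
directory (IFD0) of a little-endian Tiff held in memory. It is not Exiv2 code: the names IfdEntry,
get16, get32 and readIfd0 are invented for this illustration. The layout it assumes (8-byte header
with magic number 42, a 2-byte entry count, then 12-byte entries) is the classic Tiff 6.0 layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical type for illustration only; this is not an Exiv2 class.
struct IfdEntry {
    uint16_t tag;            // e.g. 0x010f = Make
    uint16_t type;           // e.g. 2 = ASCII, 3 = SHORT, 5 = RATIONAL
    uint32_t count;          // number of values of that type
    uint32_t valueOrOffset;  // raw 4-byte field: the value itself, or an offset to it
};

// Read little-endian integers from a byte buffer (the caller checks bounds).
static uint16_t get16(const uint8_t* p) { return static_cast<uint16_t>(p[0] | (p[1] << 8)); }
static uint32_t get32(const uint8_t* p)
{
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Parse the header and IFD0 of a little-endian ("II") classic Tiff.
std::vector<IfdEntry> readIfd0(const uint8_t* data, size_t size)
{
    if (size < 8 || data[0] != 'I' || data[1] != 'I' || get16(data + 2) != 42)
        throw std::runtime_error("not a little-endian Tiff");

    const size_t ifd = get32(data + 4);  // offset of IFD0 from the start of the header
    if (ifd > size || size - ifd < 2)
        throw std::runtime_error("IFD0 offset out of range");

    const size_t n = get16(data + ifd);  // number of 12-byte entries
    if ((size - ifd - 2) / 12 < n)
        throw std::runtime_error("IFD0 truncated");

    std::vector<IfdEntry> entries;
    for (size_t i = 0; i < n; ++i) {
        const uint8_t* e = data + ifd + 2 + 12 * i;
        entries.push_back({get16(e), get16(e + 2), get32(e + 4), get32(e + 8)});
    }
    return entries;
}
```

The sketch deliberately leaves out big-endian ("MM") files, BigTiff, following the next-IFD
pointer, and descending into sub-IFDs or MakerNotes.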

The Tiff parsing code in Exiv2 is very difficult to understand and has major architectural shortcomings:

1) It requires the Tiff file to be totally in memory
2) It cannot handle BigTiff
3) The parser doesn't know the source of the in-memory tiff image
4) It uses memory mapping on the tiff file
   - if the network connection is lost, horrible things happen
   - it requires a lot of VM to map the complete file
   - BigTiff file can be 100GB+
   - The memory mapping causes problems with Virus Detection software on Windows
5) The parser cannot deal with multi-page tiff files
6) It requires the total file to be in contiguous memory and defeats 'webready'.

The Tiff parsing code in Exiv2 is ingenious.  It's also very robust.  It works well.  It can:

1) Handle 32-bit Tiff and many Raw formats (which are derived from Tiff)
2) Read and write Manufacturer's MakerNotes which are (mostly) in Tiff format
3) Probably do other great things that I haven't discovered
   - because the code is so hard to understand, I can't simply browse and read it.
4) Separate file navigation from data analysis.

The code in image::printStructure was originally written to understand "what is a tiff?"
It has problems:
1) It was intended to be a single threaded debugging function and has security issues.
2) It doesn't handle BigTiff
3) It's messy.  It's reading and processing metadata simultaneously.

The aim of this project is to
1) Reconsider the Tiff Code.
2) Keep everything good in the code and address known deficiencies
3) Establish a Team Exiv2 "Tiff Expert" who knows the code intimately.

5.2 How does Exiv2 decode the ExifData in a JPEG?
You can get my test file from http://clanmills.com/Stonehenge.jpg

808 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/build $ exiv2 -pS ~/Stonehenge.jpg
STRUCTURE OF JPEG FILE: /Users/rmills/Stonehenge.jpg
address | marker       | length | data
      0 | 0xffd8 SOI
      2 | 0xffe1 APP1  |  15288 | Exif..II*......................
  15292 | 0xffe1 APP1  |   2610 | http://ns.adobe.com/xap/1.0/.<?x
  17904 | 0xffed APP13 |     96 | Photoshop 3.0.8BIM.......'.....
  18002 | 0xffe2 APP2  |   4094 | MPF.II*...............0100.....
  22098 | 0xffdb DQT   |    132
  22232 | 0xffc0 SOF0  |     17
  22251 | 0xffc4 DHT   |    418
  22671 | 0xffda SOS
809 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/build $

Exiv2 calls JpegBase::readMetadata which locates the APP1/Exif segment.
It invokes the ExifParser:
    ExifParser::decode(exifData_, rawExif.pData_, rawExif.size_);
This is a thin wrapper over:
    TiffParserWorker::decode(....) in tiffimage.cpp

What happens then?  I don't know.  The metadata is decoded in:
    tiffvisitor.cpp TiffDecoder::visitEntry()

The design of the TiffMumble classes is the "Visitor" pattern
described in "Design Patterns" by Gamma, Helm, Johnson and Vlissides (Addison-Wesley).
The aim of the pattern is to separate parsing from dealing with the data.

The data is being stored in ExifData which is a vector.
Order is important and preserved.
As the data values are recovered they are stored as Exifdatum in the vector.

How does the tiff visitor work?  I think the reader and processor
are connected by this line in TiffParser::
    rootDir->accept(reader);

The class tree for the decoder is:

class TiffDecoder : public TiffFinder {
class TiffReader  ,
class TiffFinder  : public TiffVisitor {
    class TiffVisitor {
    public:
        //! Events for the stop/go flag. See setGo().
        enum GoEvent {
            geTraverse       = 0,
            geKnownMakernote = 1
        };

        void setGo(GoEvent event, bool go);
        virtual void visitEntry(TiffEntry* object) =0;
        virtual void visitDataEntry(TiffDataEntry* object) =0;
        virtual void visitImageEntry(TiffImageEntry* object) =0;
        virtual void visitSizeEntry(TiffSizeEntry* object) =0;
        virtual void visitDirectory(TiffDirectory* object) =0;
        virtual void visitSubIfd(TiffSubIfd* object) =0;
        virtual void visitMnEntry(TiffMnEntry* object) =0;
        virtual void visitIfdMakernote(TiffIfdMakernote* object) =0;
        virtual void visitIfdMakernoteEnd(TiffIfdMakernote* object);
        virtual void visitBinaryArray(TiffBinaryArray* object) =0;
        virtual void visitBinaryArrayEnd(TiffBinaryArray* object);
        //! Operation to perform for an element of a binary array
        virtual void visitBinaryElement(TiffBinaryElement* object) =0;

        //! Check if stop flag for \em event is clear, return true if it's clear.
        bool go(GoEvent event) const;
    }
}
}

The reader works by stepping along the Tiff directory and calling the visitor's
"callbacks" as it reads.

There are 2000 lines of code in tiffcomposite.cpp and, to be honest,
I don't know what most of it does!

Set a breakpoint in src/exif.cpp#571.
That’s where the key/value is added to the exifData vector.
Exactly how did execution get here?  That’s a puzzle.

void ExifData::add(const ExifKey& key, const Value* pValue)
{
    add(Exifdatum(key, pValue));
}

5.3 How is metadata organized in Exiv2
section.group.tag

section: Exif | Iptc | Xmp
group:   Photo | Image | MakerNote | Nikon3 ....
tag:     YResolution etc ...

820 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa ~/Stonehenge.jpg | cut -d' ' -f 1 | cut -d. -f 1 | sort | uniq
Exif
Iptc
Xmp

821 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa --grep Exif ~/Stonehenge.jpg | cut -d'.' -f 2 | sort | uniq
GPSInfo
Image
Iop
MakerNote
Nikon3
NikonAf2
NikonCb2b
NikonFi
NikonIi
NikonLd3
NikonMe
NikonPc
NikonVr
NikonWt
Photo
Thumbnail

822 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa --grep Exif ~/Stonehenge.jpg | cut -d'.' -f 3 | cut -d' ' -f 1 | sort | uniq
AFAperture
AFAreaHeight
AFAreaMode
...
XResolution
YCbCrPositioning
YResolution
823 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $

The data in IFD0 is Exif.Image:

826 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pR ~/Stonehenge.jpg | head -20
STRUCTURE OF JPEG FILE: /Users/rmills/Stonehenge.jpg
address | marker       | length | data
      0 | 0xffd8 SOI
      2 | 0xffe1 APP1  |  15288 | Exif..II*......................
STRUCTURE OF TIFF FILE (II): MemIo
address | tag                     |     type | count | offset | value
     10 | 0x010f Make             |    ASCII |    18 |    146 | NIKON CORPORATION
     22 | 0x0110 Model            |    ASCII |    12 |    164 | NIKON D5300
     34 | 0x0112 Orientation      |    SHORT |     1 |        | 1
     46 | 0x011a XResolution      | RATIONAL |     1 |    176 | 300/1
     58 | 0x011b YResolution      | RATIONAL |     1 |    184 | 300/1
     70 | 0x0128 ResolutionUnit   |    SHORT |     1 |        | 2
     82 | 0x0131 Software         |    ASCII |    10 |    192 | Ver.1.00
     94 | 0x0132 DateTime         |    ASCII |    20 |    202 | 2015:07:16 20:25:28
    106 | 0x0213 YCbCrPositioning |    SHORT |     1 |        | 1
    118 | 0x8769 ExifTag          |     LONG |     1 |        | 222
STRUCTURE OF TIFF FILE (II): MemIo
address | tag                     |     type | count | offset | value
    224 | 0x829a ExposureTime     | RATIONAL |     1 |    732 | 10/4000
    236 | 0x829d FNumber          | RATIONAL |     1 |    740 | 100/10
827 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa --grep Image ~/Stonehenge.jpg
Exif.Image.Make               Ascii       18  NIKON CORPORATION
Exif.Image.Model              Ascii       12  NIKON D5300
Exif.Image.Orientation        Short        1  top, left
Exif.Image.XResolution        Rational     1  300
Exif.Image.YResolution        Rational     1  300
Exif.Image.ResolutionUnit     Short        1  inch
Exif.Image.Software           Ascii       10  Ver.1.00
Exif.Image.DateTime           Ascii       20  2015:07:16 20:25:28
Exif.Image.YCbCrPositioning   Short        1  Centered
Exif.Image.ExifTag            Long         1  222
Exif.Nikon3.ImageBoundary     Short        4  0 0 6000 4000
Exif.Nikon3.ImageDataSize     Long         1  6173648
Exif.NikonAf2.AFImageWidth    Short        1  0
Exif.NikonAf2.AFImageHeight   Short        1  0
Exif.Photo.ImageUniqueID      Ascii       33  090caaf2c085f3e102513b24750041aa
Exif.Image.GPSTag             Long         1  4060
828 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $

The data in the Exif IFD (the sub-IFD that ExifTag points to) is Exif.Photo.
IFD1 holds Exif.Thumbnail.

The data in the MakerNote is another embedded TIFF (which contains more embedded TIFFs)

829 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa --grep MakerNote ~/Stonehenge.jpg
Exif.Photo.MakerNote          Undefined 3152  (Binary value suppressed)
Exif.MakerNote.Offset         Long         1  914
Exif.MakerNote.ByteOrder      Ascii        3  II
830 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $

The MakerNote decodes them into:

Exif.Nikon3, Exif.NikonAf2 and so on.  I don't know exactly how it achieves this.
However it means that tag-numbers can be reused in different IFDs.
Tag 0x0016 = Nikon GPSSpeed and can mean something different elsewhere.

5.4 Where are the tags defined?

There's an array of "TagInfo" data structures in each of the makernote decoders.
These define the tag (a number) and the tag name, the groupID (eg canonId) and the default type.
There's also a callback to print the value of the tag.  This does the "interpretation"
that is performed by the -pt in the exiv2 command-line program.

TagInfo(0x4001, "ColorData", N_("Color Data"), N_("Color data"), canonId, makerTags, unsignedShort, -1, printValue),

5.5 How do the MakerNotes get decoded?

I don't know.
It has something to do with this code in tiffcomposite.cpp#936

TiffMnEntry::doAccept(TiffVisitor& visitor) { ... }

Most makernotes are TiffStructures.  So the TiffXXX classes are invoked recursively to decode the maker note.

#0 0x000000010058b4b0 in Exiv2::Internal::TiffDirectory::doAccept(Exiv2::Internal::TiffVisitor&) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffcomposite.cpp:916
        This function iterated the array of entries

#1 0x000000010058b3c6 in Exiv2::Internal::TiffComponent::accept(Exiv2::Internal::TiffVisitor&) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffcomposite.cpp:891
#2 0x00000001005b5357 in Exiv2::Internal::TiffParserWorker::parse(unsigned char const*, unsigned int, unsigned int, Exiv2::Internal::TiffHeaderBase*) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffimage.cpp:2006
        This function creates an array of TiffEntries

#3 0x00000001005a2a60 in Exiv2::Internal::TiffParserWorker::decode(Exiv2::ExifData&, Exiv2::IptcData&, Exiv2::XmpData&, unsigned char const*, unsigned int, unsigned int, void (Exiv2::Internal::TiffDecoder::* (*)(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned int, Exiv2::Internal::IfdId))(Exiv2::Internal::TiffEntryBase const*), Exiv2::Internal::TiffHeaderBase*) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffimage.cpp:1900
#4 0x00000001005a1ae9 in Exiv2::TiffParser::decode(Exiv2::ExifData&, Exiv2::IptcData&, Exiv2::XmpData&, unsigned char const*, unsigned int) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffimage.cpp:260
#5 0x000000010044d956 in Exiv2::ExifParser::decode(Exiv2::ExifData&, unsigned char const*, unsigned int) at /Users/rmills/gnu/github/exiv2/exiv2/src/exif.cpp:625
#6 0x0000000100498fd7 in Exiv2::JpegBase::readMetadata() at /Users/rmills/gnu/github/exiv2/exiv2/src/jpgimage.cpp:386
#7 0x000000010000bc59 in Action::Print::printList() at /Users/rmills/gnu/github/exiv2/exiv2/src/actions.cpp:530
#8 0x0000000100005835 in Action::Print::run(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) at /Users/rmills/gnu/github/exiv2/exiv2/src/actions.cpp:245


5.6 How do the encoders work?

I understand writeMetadata() and will document that soon.
I still have to study how the TiffVisitor writes metadata.


6 Using external XMP SDK via Conan

Section 1 describes how to compile the newer versions of XMP SDK with a bash script. This
approach had a few limitations:

1) We had to include sources from other projects in the Exiv2 repository: check the folder
   xmpsdk/third-party.
2) Different scripts were needed for compiling XMP SDK on Linux, Mac OSX and Windows.
3) Lots of configuration/compilation issues depending on the system configuration.

Taking into account that during the last months we have put a lot of effort into migrating the
handling of 3rd party dependencies to Conan, we have decided to do the same here. A conan recipe
has been written for XmpSdk at:

    https://github.com/piponazo/conan-xmpsdk

The recipe and package binaries can be found in piponazo's bintray repository:

    https://bintray.com/piponazo/piponazo

This conan recipe provides a custom CMake finder that our CMake code uses to
find XMP SDK in the conan cache and then set the CMake variables ${XMPSDK_LIBRARY} and
${XMPSDK_INCLUDE_DIR}.

These are the steps you will need to follow to configure the project with the external XMP support:

    # Add the conan-piponazo remote to your conan configuration (only once)
    conan remote add conan-piponazo https://api.bintray.com/conan/piponazo/piponazo

    mkdir build && cd build

    # Run conan to bring the dependencies. Note that the XMPSDK is not enabled by default and you will
    # need to enable the xmp option to bring it.
    conan install .. --options xmp=True

    # Configure the project with support for the external XMP version. Disable the normal XMP version
    cmake -DCMAKE_BUILD_TYPE=Release -DEXIV2_ENABLE_XMP=OFF -DEXIV2_ENABLE_EXTERNAL_XMP=ON -DBUILD_SHARED_LIBS=ON ..

Note that the usage of the newer versions of XMP is experimental and it was included in Exiv2
because a few users have requested it.