Posted to sanselan-dev@incubator.apache.org by Charles Matthew Chen <ch...@gmail.com> on 2007/11/18 00:19:47 UTC
Re: Sanselan: capabilities, questions, thoughts

Hello again Endre,

   Sorry for the delay in getting back to you.  I have indeed got EXIF
update/replacement working; however, I haven't yet settled on the
best approach to this functionality's API.

  There are two problems.  First, to what extent should the data
structures representing EXIF metadata read from a JPEG resemble the
data structures that represent data to write into a JPEG's EXIF
segment?  I'm actually leaning towards using completely different data
structures, which will make this a bit more laborious to use but
clearer.

   Secondly, there is the complication that EXIF is encoded in a
series of TIFF directories/IFDs which are linked in two completely
different ways.  That is, there are the "main" directories which are
arranged in a sequence (IFD0, IFD1, IFD2, etc.).  There are also the
GPS, EXIF and Interoperability directories, which are referenced by
tags in IFD0.
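
   To make the two kinds of linkage concrete, here is a minimal
sketch in Java.  The class and member names (IfdNode, next,
subDirectories, reachableNames) are hypothetical and are not
Sanselan's actual classes; the pointer tag numbers are the standard
EXIF ones.  Note that in many files the Interoperability pointer is
stored inside the EXIF directory rather than in IFD0, which the map
of sub-directories on every node allows for.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of one TIFF directory (IFD); not Sanselan's real classes.
final class IfdNode {
    // Standard pointer tags whose values are offsets of sub-directories.
    static final int TAG_EXIF_IFD = 0x8769;
    static final int TAG_GPS_IFD = 0x8825;
    static final int TAG_INTEROP_IFD = 0xA005;

    final String name;

    // Linkage 1: each "main" directory stores the offset of the next one.
    IfdNode next;

    // Linkage 2: a pointer tag in this directory references a sub-directory.
    final Map<Integer, IfdNode> subDirectories = new LinkedHashMap<>();

    IfdNode(String name) {
        this.name = name;
    }

    // Lists every directory reachable from this one, following both linkages.
    List<String> reachableNames() {
        List<String> names = new ArrayList<>();
        for (IfdNode dir = this; dir != null; dir = dir.next) {
            dir.collect(names);
        }
        return names;
    }

    // Adds this directory and, recursively, its tag-referenced sub-directories.
    private void collect(List<String> names) {
        names.add(name);
        for (IfdNode sub : subDirectories.values()) {
            sub.collect(names);
        }
    }
}
```

   A writer has to walk both linkages, because a directory that is
only reachable through a pointer tag never appears in the IFD0, IFD1,
... chain.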

  There are other complications, such as one Phil Harvey points out here:

http://www.sno.phy.queensu.ca/~phil/exiftool/writing.html

"IFD0/ExifIFD Ambiguity

ExifTool has a preferred location (IFD) where it writes all EXIF tags.
However, a number of tags have are written to different locations by
various digital cameras or image editors. Specifically, the following
tags have been observed in both IFD0 and ExifIFD: Make, Model,
Software, Artist, DateTimeOriginal, SensingMethod, CustomRendered,
ExposureMode, WhiteBalance, DigitalZoomRatio and SceneCaptureType. To
handle this ambiguity, ExifTool will delete the tag if it exists in
IFD0 when it is written to ExifIFD, and vice versa."
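
   The rule quoted above could be sketched roughly as follows.  This
is only an illustration of the "write to one location, delete from
the other" idea; the class AmbiguousTagWriter and its methods are
hypothetical, and the directories are modelled as plain in-memory maps
rather than real TIFF structures.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the "write to one IFD, delete from the other" rule.
final class AmbiguousTagWriter {
    // Tag values keyed by tag name, one map per directory; insertion order is kept.
    final Map<String, String> ifd0 = new LinkedHashMap<>();
    final Map<String, String> exifIfd = new LinkedHashMap<>();

    // Writes the tag to ExifIFD and removes any copy left in IFD0.
    void writeToExifIfd(String tag, String value) {
        ifd0.remove(tag);
        exifIfd.put(tag, value);
    }

    // Writes the tag to IFD0 and removes any copy left in ExifIFD.
    void writeToIfd0(String tag, String value) {
        exifIfd.remove(tag);
        ifd0.put(tag, value);
    }
}
```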

   Lastly, as you have stressed, it is key that we try to adhere as
closely as possible to the camera manufacturer's interpretation of the
EXIF standard, to the point of binary compatibility where possible.
One key problem there is that different vendors use completely
different (and mysterious) strategies for padding directories and
entry values with extra bytes.

   I don't think it will be possible to hide all of this behind a
simple API.  One possible approach to the API might be to offer a
variety of methods representing common use cases (e.g. removeAllEXIF(),
removeExifValues(<list of tags>), replaceExifValues(<list of tags &
values>), etc.) rather than a single, all purpose updateExifData(...)
method.
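
   As a rough illustration of that shape, here is a minimal sketch.
Only the three method names come from the text above; the signatures,
the ExifRewriteSketch class, and the choice to model EXIF data as an
in-memory map from tag number to string value are assumptions made for
the example, not the API in the exifRewrite package.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: the method names come from the email, the signatures are assumed.
// It works on an in-memory tag map rather than on real JPEG bytes.
final class ExifRewriteSketch {
    private ExifRewriteSketch() {
    }

    // Use case 1: strip everything; the result has no EXIF tags at all.
    static Map<Integer, String> removeAllEXIF(Map<Integer, String> exif) {
        return new LinkedHashMap<>();
    }

    // Use case 2: drop only the listed tags, keeping the others in their original order.
    static Map<Integer, String> removeExifValues(Map<Integer, String> exif,
                                                 Collection<Integer> tags) {
        Map<Integer, String> result = new LinkedHashMap<>(exif);
        result.keySet().removeAll(tags);
        return result;
    }

    // Use case 3: overwrite the listed tags; tags not yet present are appended at the end.
    static Map<Integer, String> replaceExifValues(Map<Integer, String> exif,
                                                  Map<Integer, String> replacements) {
        Map<Integer, String> result = new LinkedHashMap<>(exif);
        result.putAll(replacements);
        return result;
    }
}
```

   Each method returns a new map and leaves its input untouched, and
the LinkedHashMap keeps untouched entries in their original order,
which is one small step towards the binary compatibility discussed
above.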

   I know I haven't addressed all of your comments... I will soon.  I
just didn't want to go any longer without responding.

   In the meantime, feel free to take a look at the
org.cmc.sanselan.formats.jpeg.exifRewrite package.  The core
functionality is there, and seems to work well across a handful of
vendors.

Charles.

On Sep 24, 2007 4:43 AM, Endre Stølsvik <st...@gmail.com> wrote:
> Charles Matthew Chen wrote:
> >> Sorry for bothering you this much.. But here's some more text for you!
> >
> > You're never a bother, Endre.  I truly appreciate your thoughts & questions...
>
> Okay, good..! Here's some more! :)
>
> >
> > It is true that the directory and value offsets can point anywhere
> > within that segment which (see my previous email) means that we'll
> > always have to parse and write the entire directory structure.  This
> > isn't incompatible with generally preserving binary compatibility at
> > the field level.
>
> No, I was just thinking that in the places where you effectively stash
> up some Map with key/value pairs (directly, and metaphorically), you
> could just as well try to keep the exact order of the pairs when writing
> them out, unless there was a compelling reason not to.
>
> This _should_ result in, if I don't change anything, an exact binary
> copy when writing it out again. I cannot, at least, quite understand why
> this couldn't be the case..
>
> >
> >
> >>      * The standard defines a MakerNote tag, which allows camera
> >> manufacturers to place any custom format metadata in the file. This is
> >> used increasingly by camera manufacturers to store a myriad of camera
> >> settings not listed in the Exif standard, such as shooting modes,
> >> post-processing settings, serial number, focusing modes, etc. As this
> >> tag format is proprietary and manufacturer-specific, it can be
> >> prohibitively difficult to retrieve this information from an image (or
> >> properly preserve it when rewriting an image). Some manufacturers
> >> encrypt portions of the information; for example, Nikon encrypts the
> >> detailed lens data in their newer MakerNote data versions.[3]
> >>
> >
> > I definitely plan on supporting MakerNote data eventually, but this is
> > non-trivial since it is vendor-specific.  In the meantime, it is easy
> > to preserve binary compatibility for MakerNotes.
> >
>
> I get it.
>
> One thing I've noticed, is that you never mention the IPTC metadata (as
> opposed to EXIF). Is that observation correct? For my part, I'd love to
> be able to stash data into those fields: caption, headline, keywords,
> category, supplemental categories, copyright/"author" etc. Each of those
> have a direct relationship with elements that I want my picture
> organizer application to .. organize.
>
> The point is that I'd want my application to write back the data that
> the user stashes onto files - so that if the user opens the files in
> another application, all his metadata efforts will be preserved. EXIF
> doesn't really cut that type of data, as I understand it (and at least,
> if there is overlap, I'd want to write to both places, supporting as
> many other applications as possible..)
>
> >
> >>      * Exif metadata is restricted in size to 64 kB in JPEG images
> >> because according to the specification this information must be
> >> contained within a single JPEG APP1 segment. Although the FlashPix
> >> extensions allow information to span multiple JPEG APP2 segments, these
> >> extensions are not commonly used. This has prompted some camera
> >> manufacturers to develop non-standard techniques for storing the large
> >> preview images used by some digital cameras for LCD review. These
> >> non-standard extensions are commonly lost if a user re-saves the image
> >> using image editor software, possibly rendering the image incompatible
> >> with the original camera that created it. "
> >
> > Right, the 64kb limit is real, but how often is this going to be a
> > problem for us in practice?
> >
> > I'd love to see more information on what these "non-standard
> > techniques/extensions" are if you come across any.  Obviously that
> > could be a problem.
>
> Definitely. Those many test images from different cameras will of course
> come in handy. And Phil's HUGE test collection (albeit with missing
> actual picture data) would, I assume, be invaluable in that hunt..
>
> >
> > If an image doesn't involve any non-standard extensions, I don't
> > foresee this being a big issue. Few EXIF APP1 segments are so large.
> > Also, I don't know why a library user would add that much tag data to
> > an image?  Can you think of such a scenario?
>
> Well, maybe - the problem, in my scenario, might be that if the user
> tags his image with LOTS of tags and categories, descriptions and
> comments, and those annotations are longish, then the total, with all
> existing info, might overflow. There is basically nothing I can do (want
> to do!) to limit what a user annotates his photos with..
>
> However, I don't foresee this as a big problem. If it happens, I'll just
> have to save as much as possible into the file, and then notify the user
> of the problem.
>
> This would BTW mean that I would have to know from you how my data
> currently is looking in regards to total space..
>
>  From your other mail: excellent that you've already made great
> progress! That's very quick! In regards to data editing support, what I
> (probably naively) envisioned was some kind of hierarchy of actual, or
> similar, Maps and Lists (much like jdom, dom4j), that also keeps order
> of the elements(!). Keys possibly being a huge bunch of constants. If I
> want to change anything, I do that by using the set, add and
> delete/remove methods on this hierarchy. Then I may store back the info
> using some kind of "replaceMetadata(Exif exif, IPTC iptc, InputStream
> originalImage, OutputStream newImageDestination)" method. (This is the
> place where I'm talking about using LinkedHashMaps instead of simple
> HashMaps, just as a picture of what I'm going on about).
>    Here's some idea: Stuff which you didn't yet know how to parse, would
> in this structure be stored as pure byte-arrays, so that the
> replaceMetadata method would render a binary exact copy if one didn't
> change anything. Maybe one could have accessors to both the binary data
> and parsed data for all elements, on any level of the tree, with the
> parts that you didn't know how to handle returning null in the parsed
> case - which would leave future improvements natural in that they simply
> (after an upgrade) suddenly returned something (a new Map/List of
> parsed-and-binary-accessible data) instead of null. The idea here is
> that if I change some "parsed" value, you'd do the actual
> "bit-processing" directly in the tree (The structure is active, not a
> passive container), so that if I used the binary-getter, the new data
> would be reflected, and vice versa. The "write" method would thus
> basically consist of rootOfMetadataTree.getBinaryRepresentation(), and
> write the result directly out..
>    If the above is way out, then feel free to act as if it was never written!
>
> I'm looking forward to start to use your library in earnest.
>
> I've gotten my caching strategy to work now. It was actually a bit
> harder, and a bit easier, than I thought. The ImageIO stuff requests a
> BufferedImage type which (for example) for JPEG is the
> BufferedImage.TYPE_3BYTE_BGR, but for the PNG case _would_ have been
> BufferedImage.TYPE_4BYTE_ARGB, but that type doesn't exist, so the
> bi.getType() sadly returns TYPE_CUSTOM. What is amazing then is that
> there (AFAIK) is no clear way to represent the format except for pretty
> much serialize the entire ImageTypeSpecifier (which consists of one
> ColorModel, and one SampleModel of a 1x1 pixel image!!), which of course
> makes comparisons rather hard ("is this BufferedImage of the type you
> need, please?!").
>    HOWEVER, it magically turns out that if you just keep the number of
> bands the same, you may give the ImageReader whatever type of BI you
> like (which was very good for my part!) - it seems to do exactly what I
> mentioned in another mail, namely scanline convert the image if the
> destination BufferedImage "suddenly" isn't of the exact correct type.
> That way, I simply /case/ the getNumOfBands of the ImageReader's
> requested type, and then "transform" it into TYPE_BYTE_GRAY,
> TYPE_INT_RGB and TYPE_INT_ARGB, and then give it a BI of that type, and
> hey presto.. Thus the BufferedImageCachingProvider now keys its pools
> (as envisioned) simply on "<type>:<width>x<height>", and thus I've
> pretty much eliminated GC, even on my test data which has quite a bunch
> of different sizes!
>
> PS: I'm not sure of your country of origin, but just in case the
> "universe is English", here's a rudeness: please give a nudge in the
> direction of i18n. I do understand that the character data might not
> have a defined charset, but given that at least the EXIF is a japanese
> standard (?), I would guess that data other than ascii could be handled?
> At least my precious "æ", "ø" and "å"! :-) (Really, those are not a big
> deal - chinese, japanese, thai and those funnies are way worse!)
>
> Regards,
> Endre.
>