Posted to sanselan-dev@incubator.apache.org by Christopher Blunck <ch...@thebluncks.com> on 2008/11/14 03:33:22 UTC

Re: Sanselan support for writing IPTC fields

Hi Charles and Jonathan-

Just wanted to send along the datapoint that I integrated your latest  
0.95-incubator-SNAPSHOT with the +w support for IPTC and it works  
swimmingly for me.  Here's some context around my story...

I take images with cameras that set the EXIF rotation field.   
Unfortunately a lot of Flash players do not honor the rotation field  
when rendering the image.  As a result I have included a step in my  
publishing process that relies on JAI to rotate my images according to  
the rotation flag.  JAI, Jimi, AWT, and many other image manipulation  
libraries out there strip out EXIF and IPTC metadata.
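
For anyone following along, the rotation field is the EXIF Orientation tag (0x0112), and a rotation step has to turn its value into an angle first. Below is a minimal Java sketch of that mapping. The class and method names (ExifOrientation, rotationDegrees) are made up for illustration and are not part of Sanselan or JAI, and the mirrored orientation values (2, 4, 5, 7) are deliberately left out.

```java
// Illustrative sketch only: maps the unmirrored values of the EXIF
// Orientation tag (0x0112) to the clockwise rotation, in degrees, that
// makes the stored pixels display upright.
final class ExifOrientation {

    private ExifOrientation() {
    }

    static int rotationDegrees(int orientation) {
        switch (orientation) {
            case 1:
                return 0;   // stored upright, nothing to do
            case 3:
                return 180; // stored upside down
            case 6:
                return 90;  // needs a 90 degree clockwise turn
            case 8:
                return 270; // needs a 270 degree clockwise turn
            default:
                // 2, 4, 5 and 7 also involve a mirror flip; this sketch
                // does not handle them.
                throw new IllegalArgumentException(
                    "unsupported EXIF orientation: " + orientation);
        }
    }
}
```

A rotation step would read the tag from the original image, call rotationDegrees, and hand the angle to whatever library does the actual pixel work.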

In my world the EXIF and IPTC metadata is extremely important.  The  
photographers I work with use top-of-the-line optics and camera bodies  
and I want that information to be harvested by The Google or whoever
else is scraping our images.  Likewise, the IPTC metadata describes  
the image a lot better than a filename or URL.  The bottom line is:   
we need the EXIF/IPTC metadata in our images even after they are  
rotated and resized for the web.

Here's the code I wrote that relies on 0.95-incubator-SNAPSHOT that  
works for us.  In our case we have a byte[] from the original
"enriched" JPG file lying around and we want to apply the metadata
from said image to a new "naked" image (one that has been resized or
rotated and thus has lost its metadata).  The resulting image (which
contains the image data from the "naked" image and the metadata from  
the "enriched" image) is returned as a byte[].  See below for the code:

   /**
    * <p>
    * Enriches a naked picture using the EXIF and IPTC metadata
    * extracted from another image.  Returns the original naked picture
    * plus the EXIF and IPTC metadata read from the enriched image.
    * </p>
    *
    * <p>
    * This method is useful in situations where the result of a JAI
    * resize call created an image that lacks EXIF and IPTC metadata.
    * After rotation or resizing the image can be enriched using the
    * byte[] from the original image.
    * </p>
    *
    * @param enriched a JPEG image that contains EXIF/IPTC metadata that
    *        should be copied
    * @param naked a JPEG image that should have EXIF/IPTC metadata added
    *        to it
    * @return the original "naked" JPEG image that contains the EXIF/IPTC
    *        metadata contained in the enriched byte[]
    * @throws Exception if any of the underlying logic fails
    */
   public static byte[] enrich(byte[] enriched, byte[] naked)
     throws Exception {

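     // parse the metadata of the original enriched image; the result is
     // null or a non-JPEG type when there is nothing usable to copy
     IImageMetadata metadata = Sanselan.getMetadata(enriched);
     if (!(metadata instanceof JpegImageMetadata)) {
       _logger.error("original image has no JPEG metadata");
       return naked;
     }
     JpegImageMetadata jpegMetadata = (JpegImageMetadata) metadata;

     // read the IPTC metadata from the parsed JPEG metadata
     JpegPhotoshopMetadata photoshopMetadata = jpegMetadata.getPhotoshop();
     if (photoshopMetadata == null) {
       _logger.error("original image has no Photoshop/IPTC metadata");
       return naked;
     }

     PhotoshopApp13Data data = photoshopMetadata.photoshopApp13Data;

     // read the EXIF metadata from the parsed JPEG metadata; getExif()
     // returns null when the original image has no EXIF data
     TiffImageMetadata exif = jpegMetadata.getExif();
     if (exif == null) {
       _logger.error("original image has no EXIF metadata");
       return naked;
     }
     TiffOutputSet outputSet = exif.getOutputSet();

     // enrich the naked byte[] with EXIF metadata
     ExifRewriter writer = new ExifRewriter();
     ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
     writer.updateExifMetadataLossless(naked, outputStream, outputSet);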

     // enrich the partially clothed byte[] with IPTC metadata
     InputStream src = new ByteArrayInputStream(outputStream.toByteArray());
     ByteArrayOutputStream dest = new ByteArrayOutputStream();
     new JpegIptcRewriter().writeIPTC(src, dest, data);

     // return the fully clothed image as a byte[]
     return dest.toByteArray();
   }

-c

On Oct 21, 2008, at 12:39 AM, Charles Matthew Chen wrote:

> Hello Jonathan,
>
>   I neglected to commit some of the images in the sample image
> library that demonstrate IPTC/Photoshop data.  I also introduced a bug
> in the IPTC segment identification code, which is now fixed.
>
>   I've posted updated builds here:
>
> http://people.apache.org/~cmchen/apache-sanselan-incubating-0.95-21102008-snapshot-bin.zip
> http://people.apache.org/~cmchen/apache-sanselan-incubating-0.95-21102008-snapshot-javadoc.jar
> http://people.apache.org/~cmchen/apache-sanselan-incubating-0.95-21102008-snapshot-src.zip
>
>   Also, I'm cc'ing the sanselan-dev mailing list.  Let's continue
> this discussion there.  Instructions for subscribing can be found
> here:
>
> http://cwiki.apache.org/confluence/display/SANSELAN/Index
>
> Thanks,
>   Charles
>
>
> On Mon, Oct 20, 2008 at 2:12 AM, Jonathan Giles <jo...@jogiles.co.nz> wrote:
>> Charles,
>>
>> I downloaded and tried to run the test case to dump IPTC data. I get an
>> assertion failure error at the bottom of SanselanTest, on the line
>> assertTrue(filtered.size() > 0);
>>
>> It appears as if none of the images are being judged to have any IPTC
>> data. I have debugged and it is definitely finding the image files, so
>> the problem seems to be with how it determines whether the IPTC data
>> is there or not.
>>
>> Do you have any idea why this may be the case? I presume that the test
>> case succeeds for you? I'll keep looking to see if anything sticks out.
>>
>> Cheers,
>> Jonathan
>>
>> Charles Matthew Chen wrote:
>>>
>>> Hi Jonathan (and Christopher, Nepomuk, Mark and Bjorn),
>>>
>>>  A draft form of the IPTC changes is done, and I believe it is
>>> ready to be looked at.
>>>
>>>  I've prepared a snapshot of the repository that contains the IPTC
>>> read/write code.
>>>
>>>
>>> http://people.apache.org/~cmchen/apache-sanselan-incubating-0.95-19102008-snapshot-bin.zip
>>>
>>> http://people.apache.org/~cmchen/apache-sanselan-incubating-0.95-19102008-snapshot-javadoc.jar
>>>
>>> http://people.apache.org/~cmchen/apache-sanselan-incubating-0.95-19102008-snapshot-src.zip
>>>
>>>  The following unit tests demonstrate how to use the IPTC features:
>>>
>>>
>>> https://svn.apache.org/repos/asf/incubator/sanselan/trunk/src/test/java/org/apache/sanselan/formats/jpeg/iptc/IptcDumpTest.java
>>>
>>>  This test shows how to read and print the Photoshop/IPTC records
>>> (if any) present in a JPEG file.
>>>
>>>
>>> https://svn.apache.org/repos/asf/incubator/sanselan/trunk/src/test/java/org/apache/sanselan/formats/jpeg/iptc/IptcUpdateTest.java
>>>
>>>  This test shows how to remove, add/insert, and update
>>> Photoshop/IPTC records in a JPEG.
>>>
>>>  I haven't had time to document the Photoshop/IPTC changes yet.
>>> Here are the basics: IPTC is an image metadata standard.  Adobe
>>> Photoshop popularized a way of embedding IPTC data in App13 segments
>>> of JPEG files using a binary format very similar to Photoshop's "image
>>> resource blocks."  This Photoshop/IPTC data is organized in a "block"
>>> of "records."  This IPTC data is just one of many blocks that appear
>>> in the "Photoshop" App13 segments; modifying IPTC data should usually
>>> leave the other blocks in the Photoshop App13 segment unchanged.  In
>>> practice, App13 segments are only used for Photoshop metadata.
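
The block layout quoted above can be sketched in plain Java. This is an illustration under assumptions, not Sanselan's parser: the class and member names (App13Blocks, Block, parse) are hypothetical, and the layout it assumes is the commonly documented one, where the payload starts with "Photoshop 3.0" plus a NUL byte and each block is the "8BIM" signature, a 2-byte resource ID, a Pascal-style name padded to an even size, a 4-byte data size, and the data padded to an even size. IPTC data is conventionally stored under resource ID 0x0404.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; all names here are hypothetical.
final class App13Blocks {

    /** Resource ID conventionally used for the IPTC block. */
    static final int IPTC_RESOURCE_ID = 0x0404;

    private static final byte[] HEADER =
        "Photoshop 3.0\0".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] SIGNATURE =
        "8BIM".getBytes(StandardCharsets.US_ASCII);

    /** One image resource block: numeric ID, optional name, raw data. */
    static final class Block {
        final int id;
        final String name;
        final byte[] data;

        Block(int id, String name, byte[] data) {
            this.id = id;
            this.name = name;
            this.data = data;
        }
    }

    /** Splits an App13 payload (marker and length already removed). */
    static List<Block> parse(byte[] payload) {
        ByteBuffer in = ByteBuffer.wrap(payload); // big-endian by default
        expect(in, HEADER);
        List<Block> blocks = new ArrayList<>();
        while (in.remaining() >= SIGNATURE.length) {
            expect(in, SIGNATURE);
            int id = in.getShort() & 0xFFFF;

            // Pascal string: one length byte, then the characters; the
            // whole name field is padded to an even number of bytes.
            int nameLength = in.get() & 0xFF;
            byte[] name = new byte[nameLength];
            in.get(name);
            if ((nameLength + 1) % 2 != 0) {
                in.get();
            }

            int size = in.getInt();
            if (size < 0 || size > in.remaining()) {
                throw new IllegalArgumentException("bad block size: " + size);
            }
            byte[] data = new byte[size];
            in.get(data);
            if (size % 2 != 0 && in.hasRemaining()) {
                in.get(); // data is padded to an even number of bytes
            }
            blocks.add(new Block(id,
                new String(name, StandardCharsets.ISO_8859_1), data));
        }
        return blocks;
    }

    private static void expect(ByteBuffer in, byte[] expected) {
        byte[] actual = new byte[expected.length];
        in.get(actual);
        if (!Arrays.equals(actual, expected)) {
            throw new IllegalArgumentException("unexpected bytes in App13 data");
        }
    }
}
```

Keeping every block as raw bytes is what makes it possible to rewrite only the block with ID 0x0404 and write the others back unchanged.
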
>>>
>>>  The IPTC metadata in a Photoshop App13 segment is a series of
>>> key-value pairs.  The keys are "record type" bytes, defined by the
>>> standard and constantized in:
>>>
>>>
>>> https://svn.apache.org/repos/asf/incubator/sanselan/trunk/src/main/java/org/apache/sanselan/formats/jpeg/iptc/IPTCConstants.java
>>>
>>>  The values are strings.  In theory, these values should be encoded
>>> in ISO-8859-1 unless the first block of the segment has a "text
>>> encoding" record.  However, I have yet to find an image that
>>> demonstrates this "text encoding" record, so I haven't added support
>>> for it yet.  If you find one, please consider contributing it to the
>>> project.
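
The key-value records quoted above can be read with a few lines of plain Java. The sketch below assumes the usual IPTC IIM encoding: each record starts with the tag marker byte 0x1C, followed by a record number byte, a record type (data set) byte, a 2-byte big-endian length, and the value bytes. The names IptcRecords, Record, and parse are hypothetical and are not Sanselan's API. Values are decoded as ISO-8859-1, and records that use the extended length form (the very large records mentioned later in the thread) are rejected rather than handled.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; all names here are hypothetical.
final class IptcRecords {

    /** One key-value pair: record number, record type, decoded value. */
    static final class Record {
        final int recordNumber;
        final int type;
        final String value;

        Record(int recordNumber, int type, String value) {
            this.recordNumber = recordNumber;
            this.type = type;
            this.value = value;
        }
    }

    /** Parses the raw data of an IPTC block into its records. */
    static List<Record> parse(byte[] block) {
        ByteBuffer in = ByteBuffer.wrap(block); // big-endian by default
        List<Record> records = new ArrayList<>();
        // Every record has a 5-byte header; shorter leftovers are padding.
        while (in.remaining() >= 5) {
            int marker = in.get() & 0xFF;
            if (marker != 0x1C) {
                throw new IllegalArgumentException("expected tag marker 0x1C");
            }
            int recordNumber = in.get() & 0xFF;
            int type = in.get() & 0xFF;
            int length = in.getShort() & 0xFFFF;
            if ((length & 0x8000) != 0) {
                // High bit set means an extended length field follows.
                throw new UnsupportedOperationException(
                    "extended-length records are not handled in this sketch");
            }
            if (length > in.remaining()) {
                throw new IllegalArgumentException("truncated record");
            }
            byte[] value = new byte[length];
            in.get(value);
            records.add(new Record(recordNumber, type,
                new String(value, StandardCharsets.ISO_8859_1)));
        }
        return records;
    }
}
```

Supporting a "text encoding" record would mean choosing the charset per block instead of hard-coding ISO-8859-1 in the String constructor.
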
>>>
>>>  Please take a look at the code as it stands and let me know if it
>>> meets your needs.  Any and all feedback is welcome.
>>>
>>>  More references:
>>>
>>>
>>> http://en.wikipedia.org/wiki/International_Press_Telecommunications_Council
>>> http://www.iptc.org/
>>>
>>>  There is some remaining work to be done (besides writing proper
>>> documentation).  Support for very large records is not yet done.  I
>>> have yet to find an image that demonstrates this (again, such an
>>> example image would be helpful).
>>>
>>>  Once the IPTC work is settled, I suggest we try to release 0.95 and
>>> have a discussion about what we want to have done before we release
>>> version 1.0.
>>>
>>> Thanks,
>>>  Charles
>>>
>>>