Posted to user@commons.apache.org by Benedikt Ritter <br...@apache.org> on 2016/06/01 12:55:13 UTC

Re: Creating EXIF tags (TiffOutputField) the right way

Hello Joakim,

glad you found out what to do. This would make for a good addition to the
user guide. Would you like to contribute your findings?

Benedikt

Joakim Knudsen <jo...@gmail.com> wrote on Tue, 31 May 2016 at 19:21:

> Btw, ENCODING_UTF16 is just a String = "UTF-16LE" (Little Endian)
>
> On 31 May 2016 at 19:20, Joakim Knudsen <jo...@gmail.com> wrote:
>
> > Following a post on the User-Commons-Apache log (from 2012), I ended up
> > with the following code which seems to work.
> > It writes proper Unicode, which I can read back successfully using
> > ExifTool. I also see the comment nicely in Windows Explorer, and under
> > File > Properties.
> > Note I changed the field type from ASCII to FIELD_TYPE_UNDEFINED,
> > otherwise (with ASCII) it did not work. At least Windows couldn't make
> > sense of the EXIF data.
> >
> > // http://osdir.com/ml/user-commons-apache/2012-03/msg00046.html
> > byte[] unicodeMarker = new byte[]{ 0x55, 0x4E, 0x49, 0x43, 0x4F, 0x44,
> >         0x45, 0x00 };
> > // OR UTF-16BE if the file is big-endian!
> > byte[] comment = textToSet.getBytes(ENCODING_UTF16);
> > byte[] bytesComment = new byte[unicodeMarker.length + comment.length];
> > System.arraycopy(unicodeMarker, 0, bytesComment, 0,
> >         unicodeMarker.length);
> > System.arraycopy(comment, 0, bytesComment, unicodeMarker.length,
> >         comment.length);
> >
> > TiffOutputField exif_comment = new TiffOutputField(
> >         TiffConstants.EXIF_TAG_USER_COMMENT,
> >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> >         bytesComment.length, bytesComment);
> >
> >
> > I can now write UserComment: "æøå" without problems :)
> >
> >
> >
> > - Joakim
> >
> >
> > On 31 May 2016 at 17:39, Benedikt Ritter <br...@apache.org> wrote:
> >
> >> Hello Joakim,
> >>
> >> Joakim Knudsen <jo...@gmail.com> wrote on Sat, 28 May 2016 at 21:10:
> >>
> >> > Hi Benedikt, and thanks for replying!
> >> >
> >> > So, if FieldType is unused, maybe the alternative, simpler constructor
> >> > is more appropriate/correct to use?
> >> >
> >> > // try using the approach given in the example (modified from the GPS tag):
> >> > TiffOutputField exif_comment = TiffOutputField.create(
> >> >         TiffConstants.EXIF_TAG_USER_COMMENT,
> >> >         outputSet.byteOrder, textToSet);
> >> >
> >> > However, now Sanselan throws an ImageWriteException:
> >> > org.apache.sanselan.ImageWriteException: Tag has unexpected data type.
> >> >
> >> > So are you 100% sure field type should not be set (to ASCII)?
> >> >
> >>
> >> No, I'm just saying that it uses a hard coded encoding anyway :-)
> >>
> >>
> >> >
> >> > Next, you're saying the string to set (textToSet) is converted
> >> > internally to byte array, using US-ASCII encoding.
> >> > If I try writing "æøåæøå" to a file, I get "쎦쎸쎥쎦쎸쎥" when I copy the
> >> > JPEG out and check Properties in Windows Explorer.
> >> > If I write only ASCII characters, e.g. "Test", then that comes through
> >> > just fine.
> >> >
> >> > In summary, here is the code that works for me (except non-ASCII
> >> > characters):
> >> >
> >> >
> >> > // http://mail-archives.apache.org/mod_mbox/commons-user/201203.mbox/%3CCAJm2B-mYCXYKuyu=Hs8UAZCpw-B=KWuZ4gsZFUOBvZWUN0LUfA@mail.gmail.com%3E
> >> > byte b[] = ExifTagConstants.EXIF_TAG_USER_COMMENT.encodeValue(
> >> >         TiffFieldTypeConstants.FIELD_TYPE_ASCII,
> >> >         textToSet, outputSet.byteOrder);
> >> >
> >> > // constructor arguments: tag, taginfo, fieldtype, count, bytes
> >> > TiffOutputField exif_comment = new TiffOutputField(
> >> >         TiffConstants.EXIF_TAG_USER_COMMENT.tag,
> >> >         TiffConstants.EXIF_TAG_USER_COMMENT,
> >> >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> >> >         b.length, b);
> >> >
> >>
> >> The provided links indicate to me that it is possible to write
> >> non-ASCII characters. Are you sure your code looks like what Damjan
> >> suggested?
> >>
> >> Benedikt
> >>
> >>
> >> >
> >> >
> >> >
> >> > Joakim
> >> >
> >> >
> >> >
> >> > On 22 May 2016 at 15:29, Benedikt Ritter <br...@apache.org> wrote:
> >> >
> >> > > Hello Joakim
> >> > >
> >> > > Joakim Knudsen <jo...@gmail.com> wrote on Sat, 21 May 2016 at 19:29:
> >> > >
> >> > > > Hi List!
> >> > > >
> >> > > > I'm working on an Android app, where I want to read and write
> >> > > > "EXIF tags" to JPEG files on the device. Sanselan 0.97 seems to
> >> > > > work perfectly, although it's a bit complicated to work with EXIF
> >> > > > tags/directories.
> >> > > >
> >> > > > The specific tags I'm interested in are EXIF_TAG_USER_COMMENT and
> >> > > > EXIF_TAG_IMAGE_DESCRIPTION.
> >> > > > According to the documentation I could find, UserComment is of
> >> > > > field type "undefined", whereas ImageDescription is of field type
> >> > > > ASCII.
> >> > > >
> >> > > >
> >> > > > http://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html
> >> > > >
> >> > > > http://www.awaresystems.be/imaging/tiff/tifftags/imagedescription.html
> >> > > >
> >> > > > What's the proper way of creating those tags, wrt. charset etc? I
> >> > > > want as wide as possible character support (æøå etc).
> >> > > >
> >> > > > I find different discussions online, with different advice. Seems
> >> > > > two constructors are going around, where the simpler one does not
> >> > > > deal with charset/encoding at all. This one uses the .create method:
> >> > > >
> >> > > > String textToSet = "Some Text æøå";
> >> > > >
> >> > > > TiffOutputField exif_comment = TiffOutputField.create(
> >> > > >                 TiffConstants.EXIF_TAG_USER_COMMENT,
> >> > > >                 outputSet.byteOrder, textToSet);
> >> > > >
> >> > > >
> >> > > > while this one uses the standard constructor:
> >> > > >
> >> > > > byte b[] = ExifTagConstants.EXIF_TAG_USER_COMMENT.encodeValue(
> >> > > >         TiffFieldTypeConstants.FIELD_TYPE_ASCII,
> >> > > >         textToSet, outputSet.byteOrder
> >> > > > );
> >> > > >
> >> > > > // constructor arguments: tag, taginfo, fieldtype, count, bytes
> >> > > > TiffOutputField exif_comment2 = new TiffOutputField(
> >> > > >         TiffConstants.EXIF_TAG_USER_COMMENT.tag,
> >> > > >         TiffConstants.EXIF_TAG_USER_COMMENT,
> >> > > >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> >> > > >         b.length, b);
> >> > > >
> >> > > > In this last one, the string to set has been converted to a byte
> >> > > > array first. But can/should I set the encoding anywhere?
> >> > > >
> >> > > > Is the field type even ASCII? This information seems to indicate
> >> > > > it's not ASCII...
> >> > > >
> >> > > > http://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html
> >> > > >
> >> > > > Need some help here, as you can see, to get this right. The second
> >> > > > approach above does seem to work in my app, but I'd like to be
> >> > > > sure I'm not somehow messing up the JPEGs on the device.
> >> > > >
> >> > >
> >> > > I've looked at the code of
> >> > > org.apache.commons.imaging.formats.tiff.taginfos.TagInfoGpsText
> >> > > (ExifTagConstants.EXIF_TAG_USER_COMMENT is an instance of
> >> > > TagInfoGpsText). Here are my observations:
> >> > >
> >> > > - The FieldType parameter, which you have set to
> >> > >   TiffFieldTypeConstants.FIELD_TYPE_ASCII, is never used in the
> >> > >   implementation of encodeValue(FieldType, Object, ByteOrder)
> >> > > - When converting the input String to a byte array,
> >> > >   String.getBytes(String charsetName) is used
> >> > > - For charsetName, "US-ASCII" is always used (it cannot be configured
> >> > >   by the user)
> >> > >
> >> > > So my guess is that the code will not handle characters outside the
> >> > > US-ASCII charset correctly.
> >> > >
> >> > > Benedikt
> >> > >
> >> > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > Joakim
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
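The approach Joakim settled on in his 31 May message (an 8-byte "UNICODE" marker followed by the UTF-16 text, stored as an UNDEFINED field) can be sketched with the JDK alone. This is only an illustration under stated assumptions, not Sanselan code: the class UserCommentPayload and its encode/decode methods are hypothetical names, and java.nio.ByteOrder stands in for outputSet.byteOrder. Choosing UTF-16LE or UTF-16BE from the file's byte order follows the comment in his snippet.

```java
import java.nio.ByteOrder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds the raw bytes for an EXIF UserComment value. */
public final class UserCommentPayload {

    // The 8-byte marker from the thread: ASCII "UNICODE" plus a NUL byte.
    private static final byte[] UNICODE_MARKER =
            { 0x55, 0x4E, 0x49, 0x43, 0x4F, 0x44, 0x45, 0x00 };

    private UserCommentPayload() { }

    // Assumption: the UTF-16 flavour should match the byte order of the file.
    private static Charset charsetFor(ByteOrder fileByteOrder) {
        return ByteOrder.BIG_ENDIAN.equals(fileByteOrder)
                ? StandardCharsets.UTF_16BE
                : StandardCharsets.UTF_16LE;
    }

    /** Returns marker + UTF-16 encoded text (no byte order mark). */
    public static byte[] encode(String text, ByteOrder fileByteOrder) {
        byte[] comment = text.getBytes(charsetFor(fileByteOrder));
        byte[] payload = new byte[UNICODE_MARKER.length + comment.length];
        System.arraycopy(UNICODE_MARKER, 0, payload, 0, UNICODE_MARKER.length);
        System.arraycopy(comment, 0, payload, UNICODE_MARKER.length, comment.length);
        return payload;
    }

    /** Reverses encode(); useful for a round-trip check before writing a file. */
    public static String decode(byte[] payload, ByteOrder fileByteOrder) {
        if (payload.length < UNICODE_MARKER.length) {
            throw new IllegalArgumentException("payload shorter than the marker");
        }
        for (int i = 0; i < UNICODE_MARKER.length; i++) {
            if (payload[i] != UNICODE_MARKER[i]) {
                throw new IllegalArgumentException("missing UNICODE marker");
            }
        }
        return new String(payload, UNICODE_MARKER.length,
                payload.length - UNICODE_MARKER.length, charsetFor(fileByteOrder));
    }
}
```

The array returned by encode(...) would take the place of bytesComment in the TiffOutputField constructor call quoted above.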

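Two encoding effects discussed in the thread can be reproduced in isolation. The sketch below uses only the JDK; EncodingDiagnostics and its method names are hypothetical. The first method shows what String.getBytes does with a US-ASCII charset when it meets characters such as æøå. The second shows one plausible explanation for the Hangul-looking text seen in Windows Explorer: UTF-8 bytes being read as big-endian UTF-16. Whether that is what actually happened in Joakim's app is an assumption, not something the thread confirms.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for inspecting how a string survives different encodings. */
public final class EncodingDiagnostics {

    private EncodingDiagnostics() { }

    /**
     * Encodes with US-ASCII and decodes again. Characters outside US-ASCII
     * are replaced by the charset's replacement byte, '?'.
     */
    public static String viaUsAscii(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    /**
     * Simulates a reader that interprets UTF-8 bytes as big-endian UTF-16.
     * Each two-byte UTF-8 sequence collapses into a single unrelated character.
     */
    public static String utf8ReadAsUtf16Be(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_16BE);
    }
}
```

For the input "æøå", viaUsAscii yields three question marks, while utf8ReadAsUtf16Be yields the code points U+C3A6, U+C3B8 and U+C3A5, which fall in the Hangul syllables block.
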
Re: Creating EXIF tags (TiffOutputField) the right way

Posted by Benedikt Ritter <br...@apache.org>.
Hello Joakim,

Joakim Knudsen <jo...@gmail.com> wrote on Wed, 1 June 2016 at 15:10:

> Sure! That would also give even more scrutiny to the code. I'm not 100%
> sure this is totally correct, but I got wonderful help from Phil Harvey
> (ExifTool) to get the charset/encoding correct.
> So I'm pretty confident. How do I contribute?
>

Looking at the Commons Imaging website [1] I realised that we currently do
not have a user guide :o) So the best idea would probably be to add it to
the Sample Usage page [2]. The website is built from source in SVN [3]. You
would have to check that out, modify the documentation and then create an
SVN patch file, using

svn diff > mypatch.diff

The mypatch.diff file would then have to be attached to a Jira issue [4].
More information can be found in [5].



> Btw, you wouldn't happen to know anything about IPTC and XMP, would you? It
> seems the EXIF tags I'm writing (UserComment and ImageDescription) are not
> enough for the comment to appear as a caption in image viewer software
> (like Picasa etc). I was wondering (hoping) Sanselan could write the
> following tags:
>
> IPTC:Caption-Abstract
> and
> XMP:Description
>
>
To be honest, I don't know much about how Sanselan/Imaging works. I have
worked on the code for a while, but I don't use it in my current projects.
So the only thing I can do, is look through the code for you and try to
find an answer to your questions :-)

Benedikt

[1] http://commons.apache.org/proper/commons-imaging/index.html
[2] http://commons.apache.org/proper/commons-imaging/sampleusage.html
[3] http://svn.apache.org/repos/asf/commons/proper/imaging/trunk
[4] http://issues.apache.org/jira/browse/IMAGING
[5] http://commons.apache.org/patches.html



Re: Creating EXIF tags (TiffOutputField) the right way

Posted by Joakim Knudsen <jo...@gmail.com>.
Sure! That would also give even more scrutiny to the code. I'm not 100%
sure this is totally correct, but I got wonderful help from Phil Harvey
(ExifTool) to get the charset/encoding correct.
So I'm pretty confident. How do I contribute?

Btw, you wouldn't happen to know anything about IPTC and XMP, would you? It
seems the EXIF tags I'm writing (UserComment and ImageDescription) are not
enough for the comment to appear as a caption in image viewer software
(like Picasa etc). I was wondering (hoping) Sanselan could write the
following tags:

IPTC:Caption-Abstract
and
XMP:Description


Joakim
