Posted to user@tika.apache.org by Albretch Mueller <lb...@gmail.com> on 2011/09/09 03:37:52 UTC
org.apache.tika.exception.TikaException while trying to get images'
metadata ...
After reading this post:
~
http://blog.jeroenreijn.com/2010/04/metadata-extraction-with-apache-tika.html
~
getting metadata from files using tika seemed easy. Right now what I
am most interested in is images but all tika gives me is:
~
org.apache.tika.exception.TikaException: Can't read JPEG metadata
at org.apache.tika.parser.image.ImageMetadataExtractor.parseJpeg(ImageMetadataExtractor.java:92)
at org.apache.tika.parser.jpeg.JpegParser.parse(JpegParser.java:66)
~
org.apache.tika.exception.TikaException: image/png parse error
at org.apache.tika.parser.image.ImageParser.parse(ImageParser.java:91)
~
org.apache.tika.exception.TikaException: image/gif parse error
at org.apache.tika.parser.image.ImageParser.parse(ImageParser.java:91)
~
I am using tika-app-0.9
~
How do you read metadata from files? Do you have any working code examples?
~
lbrtchx
thank you
Re: org.apache.tika.exception.TikaException while trying to get
images' metadata ...
Posted by Albretch Mueller <lb...@gmail.com>.
Hmm! But why, then, do I have so many corrupted files?
~
$ ls -l
total 240
-rw-r--r-- 1 knoppix knoppix 23772 Sep 9 16:05 26-3.jpg
-rw-r--r-- 1 knoppix knoppix 172471 Sep 9 16:05 3997912989_5e666b3a4b.jpg
-rwxr-xr-x 1 knoppix knoppix 9620 Sep 9 16:05 6926.jpeg
-rwxr-xr-x 1 knoppix knoppix 16131 Sep 9 16:05 edeploy_os.jpg
-rwxr-xr-x 1 knoppix knoppix 76 Sep 9 16:06 free.gif
-rwxr--r-- 1 knoppix knoppix 7379 Sep 9 16:05 s-tree.png
$ md5sum *.*
6f9ae626882a00d518fbaf7059a051d5 26-3.jpg
912f608b84ae22513d617a1b85116f10 3997912989_5e666b3a4b.jpg
43f9e5f26a4d703f6ffe0e281735abaa 6926.jpeg
a377a0415f5e0d62b80d7822632f8ba7 edeploy_os.jpg
655c9615b9b32cfad4b004e0a9787239 free.gif
39c4c12814ec6a7f0848073829f90e6a s-tree.png
And here they are:
http://hsymbolicus.files.wordpress.com/2011/09/26-3.jpg
http://hsymbolicus.files.wordpress.com/2011/09/6926.jpeg
http://hsymbolicus.files.wordpress.com/2011/09/3997912989_5e666b3a4b.jpg
http://hsymbolicus.files.wordpress.com/2011/09/edeploy_os.jpg
http://hsymbolicus.files.wordpress.com/2011/09/free.gif
http://hsymbolicus.files.wordpress.com/2011/09/s-tree.png
I had even grabbed some of these files from the Internet for my tests.
~
Please let me know what is going on (or your theory of it).
~
Thank you
lbrtchx
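A checksum only shows that a copy is byte-identical to some other copy; it does not show that the bytes are a valid image. One quick sanity check is to look at the leading bytes of each file, since JPEG, PNG and GIF files all start with well-known signatures. The following is a minimal sketch of that check, not anything taken from Tika: the class name ImageSignature and the method sniff are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ImageSignature {

    // Standard file signatures ("magic numbers") for the three formats.
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] PNG =
            {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] GIF87 = "GIF87a".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] GIF89 = "GIF89a".getBytes(StandardCharsets.US_ASCII);

    /** Guesses a MIME type from the leading bytes, or returns "unknown". */
    public static String sniff(byte[] data) {
        if (startsWith(data, JPEG)) return "image/jpeg";
        if (startsWith(data, PNG)) return "image/png";
        if (startsWith(data, GIF87) || startsWith(data, GIF89)) return "image/gif";
        return "unknown";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Reads each file fully; fine for small test images like the ones listed.
        for (String name : args) {
            byte[] data = Files.readAllBytes(Paths.get(name));
            System.out.println(name + ": " + sniff(data));
        }
    }
}
```

A file reported as "unknown", or as a type that does not match its extension, is worth opening in a hex viewer. A matching signature only covers the first few bytes, so it does not prove the rest of the file is intact.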
Re: org.apache.tika.exception.TikaException while trying to get
images' metadata ...
Posted by Nick Burch <ni...@alfresco.com>.
On Thu, 8 Sep 2011, Albretch Mueller wrote:
> ~
> getting metadata from files using tika seemed easy. Right now what I
> am most interested in is images but all tika gives me is:
> ~
> org.apache.tika.exception.TikaException: Can't read JPEG metadata
> at org.apache.tika.parser.image.ImageMetadataExtractor.parseJpeg(ImageMetadataExtractor.java:92)
> at org.apache.tika.parser.jpeg.JpegParser.parse(JpegParser.java:66)
> ~
> org.apache.tika.exception.TikaException: image/png parse error
> at org.apache.tika.parser.image.ImageParser.parse(ImageParser.java:91)
This sort of thing is normally caused by corrupt images. Are you able to
share any of your problematic files?
Nick
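To test the corrupt-image theory independently of Tika, one option is to ask the JDK's own javax.imageio to decode each file. The sketch below is only an illustration under stated assumptions: the class name ImageDecodeCheck and the method decodes are hypothetical, and it assumes that a file the JDK cannot decode is also likely to fail in an image parser built on the same decoders. It says nothing about separate metadata readers such as the JPEG one in the stack trace above.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.imageio.ImageIO;

public class ImageDecodeCheck {

    static {
        // Decode from memory instead of spooling through a temp-file cache.
        ImageIO.setUseCache(false);
    }

    /** Returns true if javax.imageio can fully decode the bytes as an image. */
    public static boolean decodes(byte[] data) {
        try {
            // ImageIO.read returns null when no registered reader accepts the data.
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(data));
            return image != null;
        } catch (IOException | RuntimeException e) {
            // Treat any decoding failure as "does not decode".
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            byte[] data = Files.readAllBytes(Paths.get(name));
            System.out.println(name + ": "
                    + (decodes(data) ? "decodes" : "does NOT decode"));
        }
    }
}
```

After compiling with javac, pass the file names as arguments. If every file decodes here yet still fails in Tika, corruption becomes a less likely explanation, and how the stream is handed to the parser is worth a closer look.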