Posted to dev@tika.apache.org by Jonathan Koren <jo...@soe.ucsc.edu> on 2009/02/05 02:34:33 UTC

request: better exception handling

Tika needs to handle the exceptions of its underlying libraries more
cleanly. Apparently, for certain exceptions, it simply throws them back
up the stack, where they eventually slip through via a "throws Exception"
clause.  I propose that whatever exceptions get thrown by the
underlying libraries get handled/ignored as appropriate by Tika.  If
Tika has to rethrow them, it should catch the RandomLibraryException
and then rethrow it as a TikaException, since that's the exception
that's provided by Tika.
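
A minimal sketch of the catch-and-rethrow idea follows. It is illustrative
only: LibraryParser and WrappingParser are hypothetical names, and the
TikaException class here is a local stand-in for Tika's real exception
type, not Tika's actual code.

```java
import java.io.InputStream;

// Local stand-in for Tika's exception type (assumption: a message-plus-cause constructor).
class TikaException extends Exception {
    TikaException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical interface representing any third-party parsing library.
interface LibraryParser {
    void parse(InputStream in) throws Exception;
}

// Hypothetical wrapper: whatever the library throws leaves as a TikaException.
class WrappingParser {
    private final LibraryParser delegate;

    WrappingParser(LibraryParser delegate) {
        this.delegate = delegate;
    }

    void parse(InputStream in) throws TikaException {
        try {
            delegate.parse(in);
        } catch (Exception e) {
            // Keep the original exception as the cause so no detail is lost.
            throw new TikaException("Underlying parser failed: " + e.getMessage(), e);
        }
    }
}
```

With a wrapper like this, client code needs only a single
catch (TikaException e) block. Note that this sketch wraps every
exception, including I/O failures of the input stream itself.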

I bring this up, because what I assume is an IIOException from   
com.sun.imageio.plugins.jpeg.JPEGMetadata ("JFIF not permitted in  
stream metadata") got rethrown by Tika and it caused my program to  
fail as it got missed by all my catches and eventually rethrown all  
the way back up to main.
--
Jonathan Koren
jonathan@soe.ucsc.edu
http://www.soe.ucsc.edu/~jonathan/



Re: request: better exception handling

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Feb 5, 2009 at 2:34 AM, Jonathan Koren <jo...@soe.ucsc.edu> wrote:
> I bring this up, because what I assume is an IIOException from
>  com.sun.imageio.plugins.jpeg.JPEGMetadata ("JFIF not permitted in stream
> metadata") got rethrown by Tika and it caused my program to fail as it got
> missed by all my catches and eventually rethrown all the way back up to
> main.

This is something I've been worrying about as well.

The problem is that Tika currently has no way to distinguish between
IOExceptions caused by the document input stream failing and those
caused by the parser library failing to parse the document. The former
should be allowed to reach the client application, as documented in the
@throws IOException clause of the parse() method, but the latter should
be caught and wrapped in a TikaException.
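
One possible way to draw that distinction, sketched here under stated
assumptions, is to wrap the document stream so that any IOException it
raises is tagged with the wrapper's identity. All class names below
(TaggedIOException, TaggingInputStream, StreamParser,
DocumentParseException, DistinguishingParser) are hypothetical and
self-contained; DocumentParseException stands in for TikaException, and
this is not necessarily the approach taken in the issue linked below.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical IOException that remembers which stream wrapper raised it.
class TaggedIOException extends IOException {
    private final Object tag;

    TaggedIOException(IOException cause, Object tag) {
        super(cause.getMessage(), cause);
        this.tag = tag;
    }

    Object getTag() {
        return tag;
    }
}

// Hypothetical wrapper: tags read failures of the underlying stream.
// Only the read methods are covered in this sketch.
class TaggingInputStream extends FilterInputStream {
    TaggingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        try {
            return super.read();
        } catch (IOException e) {
            throw new TaggedIOException(e, this);
        }
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        try {
            return super.read(b, off, len);
        } catch (IOException e) {
            throw new TaggedIOException(e, this);
        }
    }

    // True only if the exception was raised by this particular wrapper.
    boolean isCauseOf(IOException e) {
        return e instanceof TaggedIOException && ((TaggedIOException) e).getTag() == this;
    }
}

// Stand-in for TikaException in this sketch.
class DocumentParseException extends Exception {
    DocumentParseException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical parser-library interface.
interface StreamParser {
    void parse(InputStream in) throws IOException;
}

class DistinguishingParser {
    static void parse(StreamParser parser, InputStream raw)
            throws IOException, DocumentParseException {
        TaggingInputStream tagged = new TaggingInputStream(raw);
        try {
            parser.parse(tagged);
        } catch (IOException e) {
            if (tagged.isCauseOf(e)) {
                // The document stream failed: let the original IOException through.
                throw (IOException) e.getCause();
            }
            // The parser library failed: wrap it.
            throw new DocumentParseException("Parser failed: " + e.getMessage(), e);
        }
    }
}
```

In this sketch, a read failure on the document stream reaches the caller
as the original IOException, while an IOException thrown by the parser
itself (such as an IIOException) is wrapped.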

I've been doing some background work to enable such distinctions, see
https://issues.apache.org/jira/browse/IO-192. Would you be interested
in joining the effort?

BR,

Jukka Zitting