Posted to dev@tika.apache.org by "Sami Siren (JIRA)" <ji...@apache.org> on 2010/02/15 13:03:27 UTC

[jira] Created: (TIKA-378) TikaConfig should notify users if it cannot initialize some parser

TikaConfig should notify users if it cannot initialize some parser
------------------------------------------------------------------

                 Key: TIKA-378
                 URL: https://issues.apache.org/jira/browse/TIKA-378
             Project: Tika
          Issue Type: Bug
          Components: config
            Reporter: Sami Siren
             Fix For: 0.6


It would be nice if TikaConfig would somehow signal that it cannot load parser classes. Currently it just silently ignores all throwables.

I would be OK with just about any kind of signaling (even just wrapping and rethrowing the exception). If we want to maintain backward compatibility in functionality, we could introduce a new option in the configuration file or a method in the TikaConfig class that would enable rethrowing exceptions on parser initialization.

What do others think?
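
To make the two behaviours discussed above concrete, here is a minimal sketch of a loader with an opt-in flag that switches from silently skipping a parser class to wrapping and rethrowing the failure. The names ParserLoadException, ParserClassLoaderSketch and failOnLoadError are hypothetical and chosen only for illustration; this is not the TikaConfig code, and it catches a narrower set of errors than the "all throwables" mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the exception a configuration loader might throw.
class ParserLoadException extends Exception {
    ParserLoadException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical loader used for illustration only; not the real TikaConfig.
class ParserClassLoaderSketch {
    private final boolean failOnLoadError;

    ParserClassLoaderSketch(boolean failOnLoadError) {
        this.failOnLoadError = failOnLoadError;
    }

    // Resolves each class name. A failure is either skipped (lenient mode)
    // or wrapped and rethrown (strict mode), depending on the flag.
    List<Class<?>> load(List<String> classNames) throws ParserLoadException {
        List<Class<?>> loaded = new ArrayList<>();
        for (String name : classNames) {
            try {
                loaded.add(Class.forName(name));
            } catch (ClassNotFoundException | LinkageError e) {
                if (failOnLoadError) {
                    throw new ParserLoadException(
                            "Unable to load parser class " + name, e);
                }
                // Lenient mode: keep the backward-compatible behaviour and skip.
            }
        }
        return loaded;
    }
}
```

With the flag set to false the existing behaviour is preserved, so only users who opt in would see exceptions.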

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (TIKA-378) TikaConfig should notify users if it cannot initialize some parser

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834783#action_12834783 ] 

Jukka Zitting commented on TIKA-378:
------------------------------------

How about if we introduced a protected TikaConfig method like handleParserLoadFailure(String parserClassName, Throwable throwable) that does nothing by default but that you could override to add whatever logging or other error handling?

The rationale for silently ignoring such errors is the fairly common scenario where just a subset of the required parser libraries is in the classpath but you still want to use the default Tika configuration without adjusting it to match the available libraries.
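
As a rough illustration of the hook described above, the following sketch shows a base class whose handleParserLoadFailure method does nothing by default, plus a subclass that overrides it to record failures. Only the method name and signature come from the comment; ConfigSketch, ReportingConfigSketch and the loading loop are assumptions made for the example and do not reflect the actual TikaConfig internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class illustrating the proposed hook; not the real TikaConfig.
class ConfigSketch {
    private final List<Class<?>> parserClasses = new ArrayList<>();

    void loadParsers(List<String> classNames) {
        for (String name : classNames) {
            try {
                parserClasses.add(Class.forName(name));
            } catch (Throwable t) {
                // Instead of an empty catch block, hand the failure to the hook.
                handleParserLoadFailure(name, t);
            }
        }
    }

    // Does nothing by default, so a partial classpath is still tolerated.
    protected void handleParserLoadFailure(String parserClassName, Throwable throwable) {
    }

    List<Class<?>> getParserClasses() {
        return parserClasses;
    }
}

// Subclass that records each failure so the caller can log or report it.
class ReportingConfigSketch extends ConfigSketch {
    final List<String> failures = new ArrayList<>();

    @Override
    protected void handleParserLoadFailure(String parserClassName, Throwable throwable) {
        failures.add(parserClassName + ": " + throwable);
    }
}
```

An override could just as well log the throwable or rethrow it as an unchecked exception; the default stays silent either way.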




[jira] Commented: (TIKA-378) TikaConfig should notify users if it cannot initialize some parser

Posted by "Ken Krugler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834464#action_12834464 ] 

Ken Krugler commented on TIKA-378:
----------------------------------

Would it be sufficient to add a method that forces initialization of all known parsers?

Or is this a runtime situation, where you want an exception thrown when either an explicit Tika parser can't load the required implementation jar, or the Autodetect parser hits the same situation?
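
The first option, a method that forces initialization of all known parsers, might look something like the sketch below. ParserValidatorSketch and initializeAll are hypothetical names, and the list of class names is assumed to come from the configuration; none of this is existing Tika API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the name and shape are assumptions, not Tika API.
class ParserValidatorSketch {

    // Tries to load and instantiate every configured class up front and
    // returns the failures keyed by class name. An empty map means that
    // every class could be initialized.
    static Map<String, Throwable> initializeAll(List<String> classNames) {
        Map<String, Throwable> failures = new LinkedHashMap<>();
        for (String name : classNames) {
            try {
                Class.forName(name).getDeclaredConstructor().newInstance();
            } catch (Throwable t) {
                failures.put(name, t);
            }
        }
        return failures;
    }
}
```

A caller could run this once at deployment time and decide whether any of the reported failures matter for the formats it needs.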





[jira] Commented: (TIKA-378) TikaConfig should notify users if it cannot initialize some parser

Posted by "Sami Siren (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834681#action_12834681 ] 

Sami Siren commented on TIKA-378:
---------------------------------

Basically, my situation is that I am extending Tika with external parsers. When deploying, initialization of parsers may fail for many reasons: for example, you might have problems loading native libraries, a class might not be found, etc. Currently you are totally in the dark when it comes to guessing what the problem is; you only see that your parser is not there. I am basically seeking a way to tell the user (via logging or something else) that there were problems initializing the configured parsers, and what caused them (e.g. an exception).




[jira] Updated: (TIKA-378) TikaConfig should notify users if it cannot initialize some parser

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated TIKA-378:
-------------------------------

    Affects Version/s: 0.6
        Fix Version/s:     (was: 0.6)




[jira] Resolved: (TIKA-378) TikaConfig should notify users if it cannot initialize some parser

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-378.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.7
         Assignee: Jukka Zitting

After thinking about this more I decided to remove (see revision 911457) the empty catch of Throwables introduced for TIKA-217. Now the configuration parser will throw TikaExceptions whenever it encounters configuration errors.

As mentioned, the main cause for potential exceptions so far has been an incomplete classpath caused by a deployment choosing (quite correctly) not to include parser libraries for formats that aren't needed. To avoid such errors I modified the parser classes in tika-parsers to be loadable even when the underlying parser library is not available. After this change any exceptions from TikaConfig are far more likely to be caused by real configuration problems.
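
The message does not say how the parser classes were changed. One common way to keep a class loadable when an optional library is missing is to avoid referencing library types in its fields and signatures and to look the library up only when it is needed. The sketch below assumes that approach; LazyLibraryParserSketch and the library class com.example.pdflib.Document are made up for illustration and are not taken from tika-parsers.

```java
// Hypothetical parser wrapper. "com.example.pdflib.Document" is a made-up
// library class name used only for this example.
class LazyLibraryParserSketch {
    private static final String LIBRARY_CLASS = "com.example.pdflib.Document";

    // Safe to construct without the library on the classpath, because
    // nothing in this class's fields or signatures uses library types.
    LazyLibraryParserSketch() {
    }

    // The library is looked up only when a document is actually parsed,
    // so a missing jar surfaces here rather than at configuration time.
    String parse(String input) {
        try {
            Class<?> library = Class.forName(LIBRARY_CLASS);
            return "parsed " + input.length() + " chars with " + library.getName();
        } catch (ClassNotFoundException | LinkageError e) {
            throw new IllegalStateException(
                    "Parser library not available: " + LIBRARY_CLASS, e);
        }
    }
}
```

With parsers written this way, loading the configuration succeeds on a partial classpath, and an exception at configuration time points to a genuine configuration problem.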

