Posted to dev@tika.apache.org by "Ken Krugler (JIRA)" <ji...@apache.org> on 2011/07/26 05:41:09 UTC

[jira] [Commented] (TIKA-686) Split tika-parsers into separate components

    [ https://issues.apache.org/jira/browse/TIKA-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070923#comment-13070923 ] 

Ken Krugler commented on TIKA-686:
----------------------------------

I'm in favor of anything that helps with avoiding dependencies on POI, if all I want to parse are text-ish formats :)

I assume we could still have a tika-parsers artifact that provides all of the parsers, simply by depending on every tika-parser-xxx component.

Note that there's still the issue of getting some of Tika's functionality to handle missing components gracefully. IIRC, some of Tika's configuration is still driven primarily by data, rather than by a combination of that data and what's actually available at run time.
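
To make the "data plus what's available at run time" idea concrete, here is a rough sketch. It is not Tika's actual loading code: the class AvailableParsers, the method filterAvailable, and the name org.example.parser.MissingOfficeParser are all made up for illustration, and java.lang.String merely stands in for a parser that happens to be on the classpath. The idea is to take the configured list of class names and keep only the ones that can actually be loaded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tika: filters a data-driven list of
// parser class names down to the ones that are loadable at run time.
public class AvailableParsers {

    /** Returns the configured class names that can be loaded, in order. */
    static List<String> filterAvailable(List<String> configuredClassNames) {
        List<String> available = new ArrayList<>();
        ClassLoader loader = AvailableParsers.class.getClassLoader();
        for (String name : configuredClassNames) {
            try {
                // initialize=false: only check that the class can be loaded,
                // without running its static initializers.
                Class.forName(name, false, loader);
                available.add(name);
            } catch (ClassNotFoundException | LinkageError e) {
                // The component (or something it links against) is missing:
                // skip it instead of failing the whole configuration.
            }
        }
        return available;
    }

    public static void main(String[] args) {
        List<String> configured = Arrays.asList(
                "java.lang.String",                        // stand-in for a parser that is present
                "org.example.parser.MissingOfficeParser"); // hypothetical, not on the classpath
        System.out.println(filterAvailable(configured));
    }
}
```

Catching LinkageError as well as ClassNotFoundException is a deliberate choice in this sketch: a parser class may itself be present while a library it depends on is not.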

> Split tika-parsers into separate components
> -------------------------------------------
>
>                 Key: TIKA-686
>                 URL: https://issues.apache.org/jira/browse/TIKA-686
>             Project: Tika
>          Issue Type: Wish
>          Components: parser
>    Affects Versions: 0.9
>            Reporter: Christopher Currie
>            Priority: Minor
>
> The email thread [1] from two years ago that led to splitting Tika into separate components also suggested splitting tika-parsers into separate components based on dependencies. This would be extremely useful, especially in cases where a given parser has no dependencies beyond tika-core. Please consider refactoring the parsers into separate components for 1.0.
> [1] http://markmail.org/message/tavirkqhn6r2szrz
