Posted to dev@tika.apache.org by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov> on 2015/05/25 20:47:51 UTC
[DISCUSS] Thinking about completely refactoring the ExternalParser and using commons-exec
Hey Everyone,
ExternalParser is way broke. I have some patches that somewhat fix it, but in doing so, I realized, why not just use commons-exec? I realize that this is another dependency into core, but commons-exec simplifies a lot of the stuff that's broke with ExternalParser (reading its streams, for one).
Thoughts? Objections?
Note this is in reference to fixing FFMPEG parsing, which I've nearly done.
Cheers,
Chris
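To make the "reading its streams" point above concrete, here is a minimal sketch, using only the JDK, of what running an external command and draining both of its output streams involves. It is not Tika's ExternalParser code and it does not use commons-exec; the class ExternalCommandSketch, its run method, and the Result type are hypothetical names made up for illustration. The assumption is that the difficulty lies in the usual place: stdout and stderr have to be read concurrently, or a child that fills one pipe can block forever.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of Tika or commons-exec.
public class ExternalCommandSketch {

    /** Exit code and captured output of one external command. */
    public static final class Result {
        public final int exitCode;
        public final String stdout;
        public final String stderr;

        Result(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    /** Runs the command, draining stdout and stderr on separate threads. */
    public static Result run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).start();
        // The child receives no input; closing stdin stops it waiting for any.
        process.getOutputStream().close();

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayOutputStream err = new ByteArrayOutputStream();
        Thread outPump = pump(process.getInputStream(), out);
        Thread errPump = pump(process.getErrorStream(), err);

        // Kill the child if it does not finish within the timeout.
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            process.waitFor();
        }
        // Wait for both pumps so that all buffered output is captured.
        outPump.join();
        errPump.join();
        return new Result(process.exitValue(),
                out.toString("UTF-8"), err.toString("UTF-8"));
    }

    /** Starts a daemon thread that copies everything from in to sink. */
    private static Thread pump(final InputStream in, final ByteArrayOutputStream sink) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[8192];
            int n;
            try {
                while ((n = in.read(buf)) != -1) {
                    sink.write(buf, 0, n);
                }
            } catch (IOException e) {
                // Stream closed (for example, the child was killed); keep what was read.
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

In commons-exec, PumpStreamHandler and ExecuteWatchdog cover the stream pumping and the timeout respectively, which is the kind of hand-rolled code the proposal would let us drop.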
Re: [DISCUSS] Thinking about completely refactoring the ExternalParser and using commons-exec
Posted by Nick Burch <ap...@gagravarr.org>.
On Mon, 25 May 2015, Tyler Palsulich wrote:
>> Maybe we could push some or all of external parser into the
>> tika-parsers module, so we don't have to add more dependencies into
>> core?
>
> What is the argument for having ExternalParser in core? Provide an
> easy-to-extend class for downstream users to create their own external
> parser?
I can't be certain, as svn blame suggests it was done over 4 years ago,
but I've got a feeling we said something like "the other parser abstract
and base classes are in core, and it has no dependencies, so why not put
it in core too". Quite possible there's a comment around that in the list
archives or jira, if someone fancies a few minutes with google to verify!
Nick
Re: [DISCUSS] Thinking about completely refactoring the ExternalParser and using commons-exec
Posted by Tyler Palsulich <tp...@gmail.com>.
On Mon, May 25, 2015 at 4:05 PM, Nick Burch <ap...@gagravarr.org> wrote:
> On Mon, 25 May 2015, Mattmann, Chris A (3980) wrote:
>
>> ExternalParser is way broke. I have some patches that somewhat fix it,
>> but in doing so, I realized, why not just use commons-exec? I realize that
>> this is another dependency into core, but commons-exec simplifies a lot of
>> the stuff that's broke with ExternalParser (reading its streams, for one).
>>
>
> Maybe we could push some or all of external parser into the tika-parsers
> module, so we don't have to add more dependencies into core?
What is the argument for having ExternalParser in core? Provide an
easy-to-extend class for downstream users to create their own external
parser?
Tyler
Re: [DISCUSS] Thinking about completely refactoring the ExternalParser and using commons-exec
Posted by Nick Burch <ap...@gagravarr.org>.
On Mon, 25 May 2015, Mattmann, Chris A (3980) wrote:
> ExternalParser is way broke. I have some patches that somewhat fix it,
> but in doing so, I realized, why not just use commons-exec? I realize
> that this is another dependency into core, but commons-exec simplifies a
> lot of the stuff that's broke with ExternalParser (reading its streams,
> for one).
Maybe we could push some or all of external parser into the tika-parsers
module, so we don't have to add more dependencies into core?
Nick