Posted to user@tika.apache.org by Clemens Wyss DEV <cl...@mysign.ch> on 2013/02/20 13:40:49 UTC

OOMs, is ForkParser the way to go?

I am running into OutOfMemoryErrors (OOMs) while extracting content for my Lucene documents, so I'd like to make use of the ForkParser.

Where is the ForkParser documented (options etc.)? What exactly does it do? The name suggests that the actual parsing is done in a separate, forked process, which I'd appreciate.
Can the parser be "reused" and can the (corresponding) ForkServer be kept open?

Thx
Clemens

Re: OOMs, is ForkParser the way to go?

Posted by Clemens Wyss DEV <cl...@mysign.ch>.
Playing around with the ForkParser, I think I have found almost all the answers:
> Can the parser be "reused"
YES

> can the (corresponding) ForkServer be kept open
The workers remain open until the "local parser" is closed
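
For anyone following along, here is a minimal sketch of that reuse pattern. It assumes the Tika 1.x ForkParser API (a constructor taking a ClassLoader and a Parser, plus setPoolSize, parse and close) and a "java" executable on the PATH for the forked workers. The class name ForkParserSketch and the helper extractText are made up for illustration and are not part of Tika.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.tika.fork.ForkParser;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;

public class ForkParserSketch {

    // Hypothetical helper: parses one document through the shared ForkParser
    // and returns the extracted body text.
    static String extractText(ForkParser parser, byte[] document) throws Exception {
        // -1 disables BodyContentHandler's default write limit.
        BodyContentHandler handler = new BodyContentHandler(-1);
        try (InputStream in = new ByteArrayInputStream(document)) {
            parser.parse(in, handler, new Metadata(), new ParseContext());
        }
        return handler.toString();
    }

    public static void main(String[] args) throws Exception {
        // One ForkParser instance; the actual parsing runs in forked worker JVMs.
        ForkParser parser = new ForkParser(
                ForkParserSketch.class.getClassLoader(), new AutoDetectParser());
        parser.setPoolSize(2); // assumption: two pooled workers are enough here

        try {
            // The same parser (and its pooled workers) is reused for every document.
            String[] samples = { "first document", "second document" };
            for (String sample : samples) {
                byte[] bytes = sample.getBytes(StandardCharsets.UTF_8);
                System.out.println(extractText(parser, bytes).trim());
            }
        } finally {
            // Closing the local parser shuts down the forked workers.
            parser.close();
        }
    }
}
```

The heap of the forked workers is configured separately from the parent JVM through the parser's Java command setting (setJavaCommand), so it is worth checking the Javadoc of your Tika version for the default and the exact signature.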

Is there anything else that should be taken into consideration when using ForkParser?
