Posted to dev@manifoldcf.apache.org by "Karl Wright (JIRA)" <ji...@apache.org> on 2018/11/30 22:05:00 UTC

[jira] [Commented] (CONNECTORS-1560) Improve tika-server robustness via -spawnChild

    [ https://issues.apache.org/jira/browse/CONNECTORS-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705329#comment-16705329 ] 

Karl Wright commented on CONNECTORS-1560:
-----------------------------------------

[~tallison@apache.org], ManifoldCF does not ship the Tika Server.  We only provide a transformation connector that talks to it.  There is also an embedded Tika transformer, which works for many people; for those who run into difficulties with it, we recommend setting up and running the external server themselves.




> Improve tika-server robustness via -spawnChild
> ----------------------------------------------
>
>                 Key: CONNECTORS-1560
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1560
>             Project: ManifoldCF
>          Issue Type: Wish
>            Reporter: Tim Allison
>            Priority: Major
>
> I'd encourage you to consider adopting the new {{-spawnChild}} mode in tika-server.  See the documentation here: https://wiki.apache.org/tika/TikaJAXRS#Making%20Tika%20Server%20Robust%20to%20OOMs,%20Infinite%20Loops%20and%20Memory%20Leaks
> The small downside is that the server can go down for a few seconds during the restart.   Clients have to be prepared for an IOException on files that are being parsed when the child server goes down and/or if the child is being restarted.  The upside is that your users will be protected against infinite loops, OOM and memory leaks...things that we used to just hope never happened...but they do, and they will.
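To illustrate what "prepared for an IOException" could look like on the client side, here is a minimal sketch of a bounded retry loop. It is not ManifoldCF's connector code: the class name RetryOnIOException, the IOCallable interface, and the maxAttempts/delayMillis parameters are all hypothetical, and the actual request to tika-server is left to the caller-supplied task.

```java
import java.io.IOException;

// Hypothetical helper, not part of ManifoldCF or Tika: retries a task that
// may fail with an IOException while the tika-server child process restarts.
public class RetryOnIOException {

    /** A unit of work that may throw IOException, e.g. one parse request. */
    @FunctionalInterface
    public interface IOCallable<T> {
        T call() throws IOException;
    }

    /**
     * Runs the task up to maxAttempts times, sleeping delayMillis between
     * failed attempts. Rethrows the last IOException if every attempt fails.
     */
    public static <T> T callWithRetry(IOCallable<T> task, int maxAttempts, long delayMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (IOException e) {
                last = e;
                // Give the child server time to come back before retrying.
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }
}
```

The attempt count is bounded on purpose: a document that itself triggers the OOM or infinite loop would presumably take the child down on every attempt, so after a few tries the client should give up on that document rather than retry forever.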


