Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/01/18 15:28:26 UTC

[jira] [Resolved] (TIKA-2017) Tika Server Cannot handle large files; add option for metadata only

     [ https://issues.apache.org/jira/browse/TIKA-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison resolved TIKA-2017.
-------------------------------
       Resolution: Not A Problem
    Fix Version/s:     (was: 1.15)

Haven't heard anything for a while.  Please reopen if this is still a problem.

> Tika Server Cannot handle large files; add option for metadata only
> -------------------------------------------------------------------
>
>                 Key: TIKA-2017
>                 URL: https://issues.apache.org/jira/browse/TIKA-2017
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Harshavardhan Manjunatha
>
> Tika-Python uses the Tika REST Server to parse both content and metadata. In this case, the CSV file was 600 MB in size. Tika REST Server runs out of heap space because it tries to parse the content as well. There should be an option to make a REST API call to Tika Server that parses and returns only the metadata.
> {code}
> Jun 22, 2016 6:38:40 PM org.slf4j.impl.JCLLoggerAdapter warn
> WARNING: /rmeta/text
> java.lang.RuntimeException: org.apache.cxf.interceptor.Fault: Java heap space
>         at org.apache.cxf.interceptor.AbstractFaultChainInitiatorObserver.onMessage(AbstractFaultChainInitiatorObserver.java:116)
>         at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:371)
>         at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
>         at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:251)
>         at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:261)
>         at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:70)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1088)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1024)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
>         at org.eclipse.jetty.server.Server.handle(Server.java:370)
>         at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
>         at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:982)
>         at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1043)
>         at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865)
>         at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
>         at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
>         at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
>         at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.cxf.interceptor.Fault: Java heap space
>         at org.apache.cxf.service.invoker.AbstractInvoker.createFault(AbstractInvoker.java:163)
>         at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:129)
>         at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:200)
>         at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:99)
>         at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59)
>         at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96)
>         at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
>         ... 21 more
> Caused by: java.lang.OutOfMemoryError: Java heap space
> {code}
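
For reference, a minimal sketch of what a metadata-only request from a Python client might look like. It assumes a Tika Server listening on the default http://localhost:9998 and uses the server's /meta endpoint with a JSON Accept header. The helper names build_meta_request and fetch_metadata are hypothetical and are not part of Tika-Python. The file is streamed from disk so the client does not hold it in memory. The server may still run its parser over the whole document, so whether this avoids the heap exhaustion reported above is untested.

```python
import json
import os
import urllib.request

TIKA_SERVER = "http://localhost:9998"  # assumption: default Tika Server address


def build_meta_request(fileobj, size, server=TIKA_SERVER):
    """Build a PUT request for Tika Server's /meta endpoint (hypothetical helper)."""
    return urllib.request.Request(
        server.rstrip("/") + "/meta",
        data=fileobj,  # file-like object, read in blocks when the request is sent
        method="PUT",
        headers={
            "Accept": "application/json",  # ask for JSON metadata
            "Content-Type": "application/octet-stream",
            "Content-Length": str(size),  # known size, so no chunked encoding
        },
    )


def fetch_metadata(path, server=TIKA_SERVER):
    """Send the file at `path` to the server and return the decoded metadata."""
    with open(path, "rb") as f:
        req = build_meta_request(f, os.path.getsize(path), server)
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
```

With a server running, fetch_metadata("large.csv") (file name chosen for illustration) would return the metadata as a Python dict, without requesting the extracted text from /rmeta/text.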



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)