Posted to dev@tika.apache.org by "vigi (Jira)" <ji...@apache.org> on 2020/10/29 10:41:00 UTC

[jira] [Commented] (TIKA-3009) XML Parser reset() detection no working in weblogic 12.2.1.3

    [ https://issues.apache.org/jira/browse/TIKA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222824#comment-17222824 ] 

vigi commented on TIKA-3009:
----------------------------

I can confirm this behaviour. It occurs after detecting the MIME type of 10 XML files: on the 11th call you have to wait 5 minutes, after which a new pool of SAX parsers is built and the cycle starts over.

A try/catch around the reset() call in the releaseParser method would probably be needed, even though the canReset boolean is supposed to prevent exactly this in the first place.
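
A minimal sketch of that idea, not Tika's actual code. The names ResettableParser, LazyFailingParser and GuardedParserPool are hypothetical stand-ins; only the method name releaseParser is borrowed from the comment above. Replacing a parser whose reset() fails with a fresh one from a factory is an assumption about how the pool could be kept at full size:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Hypothetical stand-in for a pooled SAX parser.
interface ResettableParser {
    void reset(); // may throw UnsupportedOperationException
}

// Hypothetical parser mimicking the reported behaviour:
// reset() succeeds on a fresh instance and fails only after first use.
class LazyFailingParser implements ResettableParser {
    private boolean used = false;

    void parse() {
        used = true;
    }

    @Override
    public void reset() {
        if (used) {
            throw new UnsupportedOperationException("reset() not supported");
        }
    }
}

// Hypothetical pool whose release step guards the reset() call.
class GuardedParserPool {
    private final BlockingQueue<ResettableParser> pool;
    private final Supplier<ResettableParser> factory;

    GuardedParserPool(int size, Supplier<ResettableParser> factory) {
        this.pool = new ArrayBlockingQueue<>(size);
        this.factory = factory;
        for (int i = 0; i < size; i++) {
            pool.add(factory.get());
        }
    }

    // Returns null when the pool is empty (a real pool would wait instead).
    ResettableParser acquire() {
        return pool.poll();
    }

    void releaseParser(ResettableParser parser) {
        ResettableParser toReturn = parser;
        try {
            parser.reset();
        } catch (UnsupportedOperationException e) {
            // reset() failed despite the earlier probe: discard this parser
            // and hand a fresh one back so the pool does not shrink.
            toReturn = factory.get();
        }
        pool.offer(toReturn);
    }

    int available() {
        return pool.size();
    }
}
```

With this guard, acquiring and releasing a parser more often than the pool size (for example 11 times with a pool of 10) would leave the pool full rather than drained, at the cost of building a new parser on each release.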

> XML Parser reset() detection no working in weblogic 12.2.1.3
> ------------------------------------------------------------
>
>                 Key: TIKA-3009
>                 URL: https://issues.apache.org/jira/browse/TIKA-3009
>             Project: Tika
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.20, 1.21, 1.22, 1.23
>         Environment: JDK 1.8.0_231
> Oracle Weblogic Server 12.2.1.3
>            Reporter: Daniel
>            Priority: Critical
>
> Starting with Tika 1.20, org.apache.tika.utils.XMLReaderUtils tries to detect whether an XML parser supports reset() by calling reset() when the pooled parser is created and watching for an UnsupportedOperationException.
> Unfortunately this does not work in WebLogic Server, because the RegistryParser it returns caches the underlying SAX parsers itself. The reset() of the underlying SAXParser is only called after first use, and only then does it throw the UnsupportedOperationException. The first call to reset() does not throw, so XMLReaderUtils concludes that the parser supports reset(), which is not actually the case.
> This exhausts the parser pool and causes intermittent errors and processing delays, because the pool is only rebuilt once no parser has been available for 5 minutes.
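
The detection problem described in the issue can be illustrated with a small stand-in. This is only a sketch under stated assumptions: CachingParserStub and ResetProbe are hypothetical names, and the stub merely imitates a parser whose reset() starts failing after first use. It is not the WebLogic RegistryParser or Tika's XMLReaderUtils:

```java
// Hypothetical stub: reset() is harmless on a fresh instance and
// throws only once the parser has been used, as described in the issue.
class CachingParserStub {
    private boolean used = false;

    void parse() {
        used = true;
    }

    void reset() {
        if (used) {
            throw new UnsupportedOperationException("underlying parser cannot reset");
        }
    }
}

class ResetProbe {
    // Creation-time probe: call reset() once and watch for the exception.
    static boolean probeAtCreation(CachingParserStub parser) {
        try {
            parser.reset();
            return true;  // no exception, so reset() looks supported
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }
}
```

For a fresh stub the probe reports that reset() is supported, yet calling reset() after parse() throws, which is the mismatch that leaves parsers unreturned to the pool.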



--
This message was sent by Atlassian Jira
(v8.3.4#803005)