Posted to dev@tika.apache.org by "Hudson (Jira)" <ji...@apache.org> on 2022/06/14 18:15:00 UTC

[jira] [Commented] (TIKA-3792) AutoDetectParser should not decorate content handlers more than once

    [ https://issues.apache.org/jira/browse/TIKA-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554234#comment-17554234 ] 

Hudson commented on TIKA-3792:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #642 (See [https://ci-builds.apache.org/job/Tika/job/tika-main-jdk8/642/])
TIKA-3792 -- only apply the handler decorator once for legacy xhtml processing of embedded documents (tallison: [https://github.com/apache/tika/commit/0ea65717a449c5d248b4f6dd980f783069358fe0])
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/java/org/apache/tika/sax/UpcasingContentHandlerDecoratorFactory.java
* (add) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/configs/tika-config-doubling-custom-handler-decorator.xml
* (edit) tika-core/src/main/java/org/apache/tika/sax/ContentHandlerDecoratorFactory.java
* (add) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/java/org/apache/tika/sax/DoublingContentHandlerDecoratorFactory.java
* (edit) tika-core/src/main/java/org/apache/tika/parser/AutoDetectParserConfig.java
* (edit) tika-core/src/main/java/org/apache/tika/parser/RecursiveParserWrapper.java
* (edit) tika-core/src/main/java/org/apache/tika/parser/AutoDetectParser.java
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/java/org/apache/tika/parser/AutoDetectParserConfigTest.java


> AutoDetectParser should not decorate content handlers more than once
> --------------------------------------------------------------------
>
>                 Key: TIKA-3792
>                 URL: https://issues.apache.org/jira/browse/TIKA-3792
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>             Fix For: 2.4.1
>
>
> We added a ContentHandlerDecoratorFactory to the AutoDetectParser.  It decorates the ContentHandler on every parse.  We need to add a check so that a handler isn't decorated again when embedded files are parsed.
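
To illustrate the problem described in the issue, here is a minimal sketch of how re-decorating a handler on an embedded parse compounds the decorator's effect, and how a guard avoids it. It uses only the JDK's SAX classes. Every name in it (DecoratorFactory, Decorated, DoublingHandler, decorateOnce, parse, run) is hypothetical and chosen for illustration; this is not Tika's API and not necessarily how the commit above implements the check.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class DecorateOnceSketch {

    /** Hypothetical stand-in for a handler decorator factory. */
    interface DecoratorFactory {
        ContentHandler decorate(ContentHandler handler);
    }

    /** Hypothetical marker: the handler has already been decorated. */
    interface Decorated {
    }

    /** Collects character data so the effect of decoration is visible. */
    static class CollectingHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Forwards every character run twice, so double decoration shows up as 4x output. */
    static class DoublingHandler extends DefaultHandler implements Decorated {
        private final ContentHandler delegate;

        DoublingHandler(ContentHandler delegate) {
            this.delegate = delegate;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            delegate.characters(ch, start, length);
            delegate.characters(ch, start, length);
        }
    }

    /** Assumed guard: decorate only if the handler does not carry the marker yet. */
    static ContentHandler decorateOnce(ContentHandler handler, DecoratorFactory factory) {
        return (handler instanceof Decorated) ? handler : factory.decorate(handler);
    }

    /** Simulates a parse that writes "a", then parses one embedded document that writes "b". */
    static void parse(ContentHandler handler, DecoratorFactory factory, boolean guard, int depth)
            throws SAXException {
        ContentHandler h = guard ? decorateOnce(handler, factory) : factory.decorate(handler);
        char[] chars = (depth == 0 ? "a" : "b").toCharArray();
        h.characters(chars, 0, chars.length);
        if (depth == 0) {
            // The embedded parse receives the container's (already decorated) handler.
            parse(h, factory, guard, 1);
        }
    }

    /** Runs the simulated parse and returns everything that reached the underlying handler. */
    static String run(boolean guard) {
        CollectingHandler sink = new CollectingHandler();
        try {
            parse(sink, DoublingHandler::new, guard, 0);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return sink.text.toString();
    }

    public static void main(String[] args) {
        System.out.println("without guard: " + run(false)); // aabbbb
        System.out.println("with guard:    " + run(true));  // aabb
    }
}
```

Without the guard, the embedded document's text passes through two layers of the decorator; with it, container and embedded text are each decorated exactly once. A marker interface is only one way to detect an already-decorated handler; a flag carried in the parse context would serve the same purpose.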



--
This message was sent by Atlassian Jira
(v8.20.7#820007)