Posted to dev@tika.apache.org by "Hudson (Jira)" <ji...@apache.org> on 2022/11/16 00:52:00 UTC

[jira] [Commented] (TIKA-3928) Need to apply metadata filters after we extract parse exceptions in PipesServer

    [ https://issues.apache.org/jira/browse/TIKA-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17634584#comment-17634584 ] 

Hudson commented on TIKA-3928:
------------------------------

FAILURE: Integrated in Jenkins build Tika » tika-main-jdk8 #917 (See [https://ci-builds.apache.org/job/Tika/job/tika-main-jdk8/917/])
TIKA-3928 -- add workaround to extract container parse exception even if metadatafilter renames field or removes it. (tallison: [https://github.com/apache/tika/commit/6a14cd785233a6dfa29f98f73b18f3f3a074f093])
* (edit) tika-core/src/main/java/org/apache/tika/pipes/PipesClient.java
* (edit) tika-core/src/main/java/org/apache/tika/pipes/PipesServer.java
* (edit) tika-core/src/main/java/org/apache/tika/pipes/emitter/EmitData.java
TIKA-3928 -- add workaround to extract container parse exception even if metadatafilter renames field or removes it. (tallison: [https://github.com/apache/tika/commit/055a3ceb69bdda2c002f38f6dadf2e4d9d29f616])
* (edit) tika-integration-tests/tika-pipes-opensearch-integration-tests/src/test/java/org/apache/tika/pipes/xsearch/tests/TikaPipesXSearchBase.java


> Need to apply metadata filters after we extract parse exceptions in PipesServer
> -------------------------------------------------------------------------------
>
>                 Key: TIKA-3928
>                 URL: https://issues.apache.org/jira/browse/TIKA-3928
>             Project: Tika
>          Issue Type: Task
>          Components: tika-pipes
>            Reporter: Tim Allison
>            Priority: Major
>
> Our current code runs the MetadataFilter during the parse in the RecursiveParserWrapper.  The problem is that we later rely on the container exception in the metadata to return the proper status in the pipes parser (e.g. PARSE_EXCEPTION).  If we use a no-op filter in the RecursiveParserWrapper, extract the container stacktrace and then apply the filter in the PipesServer, we're good to go.
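
The ordering problem described above can be illustrated with a minimal, self-contained sketch. It does not use the Tika API: metadata is modeled as a plain Map, the filter as a UnaryOperator, and the class, method, and filter names are all hypothetical. The metadata key string is assumed for illustration only. The sketch only shows why the container exception has to be read before the filter runs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch; none of these names come from the Tika code base.
class FilterOrderSketch {

    // Assumed metadata key for the container stack trace (illustrative only).
    static final String CONTAINER_EXCEPTION = "X-TIKA:EXCEPTION:container_exception";

    enum Status { OK, PARSE_EXCEPTION }

    // Hypothetical result holder: the status plus the metadata that would be emitted.
    static final class Result {
        final Status status;
        final Map<String, String> metadata;

        Result(Status status, Map<String, String> metadata) {
            this.status = status;
            this.metadata = metadata;
        }
    }

    // Stand-in for a metadata filter that keeps only "title",
    // which means it also drops the container exception key.
    static final UnaryOperator<Map<String, String>> KEEP_TITLE_ONLY = in -> {
        Map<String, String> out = new HashMap<>();
        if (in.containsKey("title")) {
            out.put("title", in.get("title"));
        }
        return out;
    };

    // Derives the status from whether a container exception is present.
    static Status statusOf(Map<String, String> metadata) {
        return metadata.containsKey(CONTAINER_EXCEPTION) ? Status.PARSE_EXCEPTION : Status.OK;
    }

    // Problematic order: the filter runs first, so the exception key may already be gone.
    static Result filterThenRead(Map<String, String> raw,
                                 UnaryOperator<Map<String, String>> filter) {
        Map<String, String> filtered = filter.apply(raw);
        return new Result(statusOf(filtered), filtered);
    }

    // Proposed order: read the exception from unfiltered metadata, then apply the filter.
    static Result readThenFilter(Map<String, String> raw,
                                 UnaryOperator<Map<String, String>> filter) {
        Status status = statusOf(raw);
        return new Result(status, filter.apply(raw));
    }

    public static void main(String[] args) {
        Map<String, String> raw = new HashMap<>();
        raw.put("title", "example");
        raw.put(CONTAINER_EXCEPTION, "java.io.IOException: simulated failure");

        System.out.println("filter then read: " + filterThenRead(raw, KEEP_TITLE_ONLY).status);
        System.out.println("read then filter: " + readThenFilter(raw, KEEP_TITLE_ONLY).status);
    }
}
```

In both orders the emitted metadata is filtered the same way; only the second order still reports PARSE_EXCEPTION when the filter removes or renames the exception field.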



--
This message was sent by Atlassian Jira
(v8.20.10#820010)