Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2022/04/06 11:36:00 UTC

[jira] [Resolved] (TIKA-3716) Add metadata element for all parsers that processed a file

     [ https://issues.apache.org/jira/browse/TIKA-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison resolved TIKA-3716.
-------------------------------
    Fix Version/s: 2.4.0
       Resolution: Fixed

> Add metadata element for all parsers that processed a file
> ----------------------------------------------------------
>
>                 Key: TIKA-3716
>                 URL: https://issues.apache.org/jira/browse/TIKA-3716
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Minor
>             Fix For: 2.4.0
>
>
> We currently have a "parsed by" data element in the metadata, but it only records the parser for the initial container file.  It would be useful to record all parsers that touched a file and its embedded files.  This information is already recorded by the RecursiveParserWrapper (the /rmeta endpoint and the -J option), but it would also be useful for the legacy endpoints, e.g. {{/tika}}.
> We recognize that this information will be added to the container file's metadata only after the full parse, so it will not appear in the XHTML markup because of the way the XHTMLHandler works.  However, it will appear in the JSON output of the {{/tika}} endpoint and will be available to those calling Tika programmatically.
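
As a rough illustration of the behaviour described above, here is a minimal, self-contained sketch in plain Java. It does not use the Tika API: the Doc class, the metadata keys "parsed-by" and "parsed-by-full-set", and the parser names are all hypothetical stand-ins chosen for this example. It only models the idea that the full set of parsers is known once the recursive parse has finished, and is then written to the container's metadata.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ParsedByFullSetSketch {

    // Hypothetical stand-in for a parsed document: the name of the parser
    // that handled it plus any embedded documents. Not a Tika class.
    static final class Doc {
        final String parserName;
        final List<Doc> embedded = new ArrayList<>();

        Doc(String parserName) {
            this.parserName = parserName;
        }

        Doc embed(Doc child) {
            embedded.add(child);
            return this;
        }
    }

    // Hypothetical metadata keys, chosen for illustration only.
    static final String PARSED_BY = "parsed-by";
    static final String PARSED_BY_FULL_SET = "parsed-by-full-set";

    // Depth-first walk that records every parser seen, without duplicates,
    // in first-seen order.
    static void collect(Doc doc, Set<String> seen) {
        seen.add(doc.parserName);
        for (Doc child : doc.embedded) {
            collect(child, seen);
        }
    }

    // Builds the container's metadata. The full set can only be written
    // after the whole tree has been walked, which mirrors why it cannot
    // appear in markup that is emitted while parsing is still in progress.
    static Map<String, List<String>> parse(Doc container) {
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        metadata.put(PARSED_BY, List.of(container.parserName));

        Set<String> seen = new LinkedHashSet<>();
        collect(container, seen);
        metadata.put(PARSED_BY_FULL_SET, new ArrayList<>(seen));
        return metadata;
    }

    public static void main(String[] args) {
        // A container with an image and a PDF that itself embeds an image.
        Doc container = new Doc("OOXMLParser")
                .embed(new Doc("JpegParser"))
                .embed(new Doc("PDFParser").embed(new Doc("JpegParser")));

        Map<String, List<String>> metadata = parse(container);
        System.out.println(metadata.get(PARSED_BY));          // [OOXMLParser]
        System.out.println(metadata.get(PARSED_BY_FULL_SET)); // [OOXMLParser, JpegParser, PDFParser]
    }
}
```

In this toy model the existing "parsed by" value only ever names the container's parser, while the full set also covers parsers used on embedded files, each listed once.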



--
This message was sent by Atlassian Jira
(v8.20.1#820001)