Posted to dev@tika.apache.org by "Stephen H (Jira)" <ji...@apache.org> on 2022/04/26 14:59:00 UTC

[jira] [Created] (TIKA-3738) ForkParser missing metadata for some document formats

Stephen H created TIKA-3738:
-------------------------------

             Summary: ForkParser missing metadata for some document formats
                 Key: TIKA-3738
                 URL: https://issues.apache.org/jira/browse/TIKA-3738
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 2.3.0
         Environment: Java 11.0.14.
            Reporter: Stephen H
         Attachments: ForkParserIntegrationTest.java.diff, testVideoMetadataMp4.mp4

When parsing with ForkParser, metadata produced by some parsers is returned neither in the Metadata object nor in the head of the returned XML. The affected formats include OpenDocument Presentation (ODP), OpenDocument Spreadsheet (ODS), Microsoft Word 2006 XML, MP4 Audio (M4A), and MP4 Video (MP4).

A patch for ForkParserIntegrationTest that demonstrates the issue for these file types is attached, along with an MP4 video file containing metadata, as there does not appear to be one in the current test set.
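
One way to pinpoint which entries are lost is to parse the same file once in-process and once through ForkParser, then compare the metadata names from the two runs. The sketch below shows only the comparison step. It assumes the two maps were filled from each run's Metadata object (for example via Metadata.names() and Metadata.get(name)); the MetadataDiff class, the missingKeys method, and the sample values are hypothetical and are not part of Tika or of the attached patch.

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of Tika: reports metadata names that are
// present after an in-process parse but absent from the ForkParser result.
class MetadataDiff {

    // Returns the names found in inProcess but not in forked, sorted.
    static SortedSet<String> missingKeys(Map<String, String> inProcess,
                                         Map<String, String> forked) {
        SortedSet<String> missing = new TreeSet<>(inProcess.keySet());
        missing.removeAll(forked.keySet());
        return missing;
    }

    public static void main(String[] args) {
        // Made-up values for illustration only; not actual Tika output.
        Map<String, String> inProcess = Map.of(
                "Content-Type", "video/mp4",
                "dc:title", "Example title");
        Map<String, String> forked = Map.of(
                "Content-Type", "video/mp4");

        // Prints the names that the forked parse failed to return.
        System.out.println(missingKeys(inProcess, forked));
    }
}
```

An empty result for a given file would suggest that ForkParser returned every metadata name the in-process parse did; any names listed are candidates for the behaviour described above.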


