Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2011/05/03 07:17:03 UTC
[jira] [Created] (TIKA-652) Custom metadata from more formats
Custom metadata from more formats
---------------------------------
Key: TIKA-652
URL: https://issues.apache.org/jira/browse/TIKA-652
Project: Tika
Issue Type: Improvement
Components: parser
Affects Versions: 0.9
Reporter: Nick Burch
Assignee: Nick Burch
Currently, Tika handles custom metadata from Open Document files. Each custom entry is returned with a custom: prefix (see OpenOfficeParserTest#testOO2Metadata for an example).
The parsers for the Microsoft file formats don't return custom metadata, and neither does the PDF parser.
Assuming we're happy to include custom metadata from documents in the parsing step, with the custom: prefix, I'll go ahead and add it to the Microsoft (OLE2 and OOXML) and PDF parsers.
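
To illustrate the custom: prefix convention described above, here is a minimal sketch in plain Java. It is not Tika code: a java.util.Map stands in for the metadata a parser would return, and the class name CustomMetadataSketch and its methods customEntries and customKey are hypothetical names chosen for this example only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Tika. A plain Map stands in for
// the metadata object that a parser would populate.
final class CustomMetadataSketch {
    // Prefix that the issue says is put on custom metadata entries.
    static final String CUSTOM_PREFIX = "custom:";

    // Builds the key a parser following this convention would emit
    // for a user-defined document property.
    static String customKey(String propertyName) {
        return CUSTOM_PREFIX + propertyName;
    }

    // Returns only the custom entries, with the prefix stripped from
    // each key. Insertion order of the input is preserved.
    static Map<String, String> customEntries(Map<String, String> metadata) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(CUSTOM_PREFIX)) {
                result.put(key.substring(CUSTOM_PREFIX.length()), entry.getValue());
            }
        }
        return result;
    }
}
```

With entries such as title, custom:Reviewer and custom:Department in the input map, customEntries would return just Reviewer and Department. The point of a shared prefix is that calling code can separate user-defined properties from standard ones in the same way regardless of which parser produced them.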
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (TIKA-652) Custom metadata from more formats
Posted by "Nick Burch (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Burch resolved TIKA-652.
-----------------------------
Resolution: Fixed
Fix Version/s: 1.0
Fixed in r1100100. Both the Microsoft Office and Open Document parsers now handle custom metadata in the same way, with the custom: prefix on the entries.