Posted to dev@tika.apache.org by "Jörg Ehrlich (JIRA)" <ji...@apache.org> on 2012/06/12 14:44:42 UTC
[jira] [Updated] (TIKA-929) Consistent, namespaced definitions for office file related metadata
[ https://issues.apache.org/jira/browse/TIKA-929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jörg Ehrlich updated TIKA-929:
------------------------------
Attachment: tika_OOXMLOffice_namespaces.patch
This patch should help resolve this issue.
The patch contains the following:
* Definitions of the OOXML namespace properties in tika-core, except for properties that already have equivalent definitions in the Office namespace interface.
* Deprecation of the old properties in the MSOffice interface.
* Adjustment of the related parsers so that they additionally map to the new OOXML properties.
* Adjustment of the related tests.
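To illustrate the approach described in the list above, here is a minimal, self-contained sketch of the pattern: a new prefixed key is defined, the old un-namespaced key is marked deprecated, and a parser writes the same value under both keys during the transition. This is not code from the attached patch. The class name, the method name, the prefix and both key strings are assumptions chosen for illustration, and a plain Map stands in for Tika's Metadata class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NamespacedMetadataSketch {

    // Hypothetical namespace prefix and delimiter, assumed for this illustration.
    static final String PREFIX = "extended-properties";
    static final String DELIMITER = ":";

    // New, namespaced key (illustrative name, not taken from the patch).
    static final String APPLICATION = PREFIX + DELIMITER + "Application";

    // Old, un-namespaced key, kept so existing callers continue to work.
    @Deprecated
    static final String LEGACY_APPLICATION_NAME = "Application-Name";

    // A parser in the transition period writes the value under both keys.
    static Map<String, String> mapApplication(String value) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put(LEGACY_APPLICATION_NAME, value); // deprecated key
        metadata.put(APPLICATION, value);             // namespaced key
        return metadata;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = mapApplication("Microsoft Office Word");
        metadata.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Writing both keys keeps existing consumers of the old key working while new code moves to the prefixed one; the deprecated key can be dropped in a later release.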
> Consistent, namespaced definitions for office file related metadata
> -------------------------------------------------------------------
>
> Key: TIKA-929
> URL: https://issues.apache.org/jira/browse/TIKA-929
> Project: Tika
> Issue Type: Improvement
> Reporter: Nick Burch
> Attachments: tika_OOXMLOffice_namespaces.patch
>
>
> Currently, we have the MSOffice metadata definitions, which are a mixture of Properties and Strings, none of them namespaced. Despite the name, the keys apply to a wide range of Office Documents (not just MS ones), and the keys are taken from a mixture of sources.
> Similar to TIKA-925 / TIKA-928, we should replace these with prefixed versions drawn from a few well-known, externally defined namespaces, then deprecate the old ones.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira