Posted to dev@tika.apache.org by "Ray Gauss II (JIRA)" <ji...@apache.org> on 2014/05/13 04:04:14 UTC

[jira] [Commented] (TIKA-1295) Make some Dublin Core items multi-valued

    [ https://issues.apache.org/jira/browse/TIKA-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995945#comment-13995945 ] 

Ray Gauss II commented on TIKA-1295:
------------------------------------

+1 for the data model more accurately reflecting the standard and for multilingual fields, but with a simple text bag how would you know which value corresponds to which language?

I think this is another example that highlights the need for a more structured underlying metadata store as mentioned in section IV of the [metadata roadmap|http://wiki.apache.org/tika/MetadataRoadmap].
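
To make the question above concrete, here is a minimal sketch in plain Java. It does not use Tika's API: the class LangAltSketch and its methods are hypothetical names invented for illustration. It assumes language alternatives arrive as (language tag, value) pairs, with "x-default" as the fallback tag used by XMP language alternatives. Flattening to a bag keeps the values but discards the tags, while a keyed structure can still answer "which value is the German one?".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Tika.
final class LangAltSketch {

    // Sample dc:title with three language alternatives (made-up values).
    static Map<String, String> sampleTitle() {
        Map<String, String> title = new LinkedHashMap<>();
        title.put("x-default", "A Title");
        title.put("en", "A Title");
        title.put("de", "Ein Titel");
        return title;
    }

    // Stand-in for a simple text bag: the values survive, the language tags do not.
    static List<String> asTextBag(Map<String, String> langAlt) {
        return new ArrayList<>(langAlt.values());
    }

    // Structured lookup: exact language tag first, then "x-default", else null.
    static String valueFor(Map<String, String> langAlt, String lang) {
        String exact = langAlt.get(lang);
        return exact != null ? exact : langAlt.get("x-default");
    }

    public static void main(String[] args) {
        Map<String, String> title = sampleTitle();

        // The bag holds three strings with no way to tell their languages apart.
        System.out.println(asTextBag(title));

        // The keyed form can still resolve a language, or fall back to the default.
        System.out.println(valueFor(title, "de"));
        System.out.println(valueFor(title, "fr"));
    }
}
```

In this sketch the bag for the sample title contains "A Title" twice and "Ein Titel" once, and nothing in it says which is which. That is the information a more structured store would need to preserve.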

> Make some Dublin Core items multi-valued
> ----------------------------------------
>
>                 Key: TIKA-1295
>                 URL: https://issues.apache.org/jira/browse/TIKA-1295
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Minor
>             Fix For: 1.6
>
>
> According to http://www.pdfa.org/2011/08/pdfa-metadata-xmp-rdf-dublin-core, dc:title, dc:description, and dc:rights should allow multiple values because of language alternatives. Unless anyone objects in the next few days, I'll switch those from Property.toInternalText() to Property.toInternalTextBag(). I'll also modify PDFParser to extract dc:rights.


