Posted to dev@nutch.apache.org by "kiran (JIRA)" <ji...@apache.org> on 2013/02/28 17:01:12 UTC

[jira] [Commented] (NUTCH-1537) Legacy metadata package needs to take advantage of Apache Tika metadata package more.

    [ https://issues.apache.org/jira/browse/NUTCH-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589620#comment-13589620 ] 

kiran commented on NUTCH-1537:
------------------------------

Hi Lewis,

Do you mean that we should define more Tika properties, or that the other plugins should use the data parsed by Tika? For now, we have the parse-tika plugin, which parses the tags and content.

I am interested in this, but I am not sure what you mean by "taking advantage".
                
> Legacy metadata package needs to take advantage of Apache Tika metadata package more.
> -------------------------------------------------------------------------------------
>
>                 Key: NUTCH-1537
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1537
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.6, 2.1
>            Reporter: Lewis John McGibbney
>            Priority: Minor
>             Fix For: 1.7, 2.2
>
>
> In Nutch, classes from the metadata package are used in quite a number of places. The package does not currently reflect the work going on in Apache Tika, and we need to better leverage the vocabularies available to us through the dependency on Apache Tika.
> The introduction of TikaCoreProperties in Tika 1.2 is not currently leveraged in Nutch. This is just one example of an improved way for us to add metadata to Nutch documents.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira