Posted to dev@nutch.apache.org by "Julien Nioche (JIRA)" <ji...@apache.org> on 2014/06/23 12:13:24 UTC

[jira] [Commented] (NUTCH-1561) improve usability of parse-metatags and index-metadata

    [ https://issues.apache.org/jira/browse/NUTCH-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040595#comment-14040595 ] 

Julien Nioche commented on NUTCH-1561:
--------------------------------------

+1 to commit but please update [http://wiki.apache.org/nutch/IndexMetatags] accordingly

> improve usability of parse-metatags and index-metadata
> ------------------------------------------------------
>
>                 Key: NUTCH-1561
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1561
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.6
>            Reporter: Sebastian Nagel
>            Priority: Minor
>             Fix For: 1.9
>
>         Attachments: NUTCH-1561-v1.patch
>
>
> Usually, the plugins parse-metatags and index-metadata are used in combination: the former extracts meta tags, the latter adds the extracted tags as fields to the index.
> The configuration of the two plugins differs, which causes pitfalls and reduces usability (see the example config):
> * the property "metatags.names" of parse-metatags uses ';' as separator, whereas index-metadata uses ','
> * meta tag names have to be lowercased in index-metadata
> {code}
> <property>
>   <name>metatags.names</name>
>   <value>DC.creator;DCTERMS.bibliographicCitation</value>
> </property>
> <property>
>   <name>index.parse.md</name>
>   <value>metatag.dc.creator,metatag.dcterms.bibliographiccitation</value>
> </property>
> {code}
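
The relationship between the two values in the example config can be sketched as a small helper. This is only an illustration of the mismatch described in the issue, not part of Nutch: the function name is hypothetical, and the "metatag." prefix is inferred solely from the example config above.

```python
def index_md_from_metatags(metatags_names: str) -> str:
    """Derive an index.parse.md-style value from a metatags.names-style value.

    Assumptions (taken only from the example config in this issue):
    - metatags.names separates entries with ';'
    - index.parse.md separates entries with ',' and expects lowercased
      names carrying a 'metatag.' prefix
    """
    # Split on ';' and drop empty entries and surrounding whitespace.
    names = [n.strip() for n in metatags_names.split(";") if n.strip()]
    # Lowercase each name, add the prefix, and join with ','.
    return ",".join("metatag." + n.lower() for n in names)


if __name__ == "__main__":
    print(index_md_from_metatags("DC.creator;DCTERMS.bibliographicCitation"))
    # prints: metatag.dc.creator,metatag.dcterms.bibliographiccitation
```

Having to apply this mapping by hand in two places is the usability pitfall the issue is about.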


