Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2014/07/30 11:00:46 UTC

[Nutch Wiki] Update of "IndexMetatags" by JulienNioche

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "IndexMetatags" page has been changed by JulienNioche:
https://wiki.apache.org/nutch/IndexMetatags?action=diff&rev1=4&rev2=5

Comment:
https://issues.apache.org/jira/browse/NUTCH-1561

  <value>protocol-http|urlfilter-regex|parse-(html|tika|metatags)|index-(basic|anchor|metadata)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  </property>
  }}}
-  1. In the file `conf/nutch-site.xml`, specify which metatags should be indexed. Either list the specific metatags you want to index, or index all of them. To index all, provide a '*' for the value of the property "metatags.names"; otherwise provide the list of names separated by ';'. For example, to index only the metatags 'description' and 'keywords', add the following configuration to conf/nutch-site.xml:
+  1. In the file `conf/nutch-site.xml`, specify which metatags should be indexed. Either list the specific metatags you want to index, or index all of them. To index all, provide a '*' for the value of the property "metatags.names"; otherwise provide the list of names separated by ','. For example, to index only the metatags 'description' and 'keywords', add the following configuration to conf/nutch-site.xml:
   {{{
  <!-- Used only if plugin parse-metatags is enabled. -->
  <property>
  <name>metatags.names</name>
- <value>description;keywords</value>
+ <value>description,keywords</value>
- <description> Names of the metatags to extract, separated by;.
+ <description> Names of the metatags to extract, separated by ','.
    Use '*' to extract all metatags. Prefixes the names with 'metatag.'
    in the parse-metadata. For instance to index description and keywords,
    you need to activate the plugin index-metadata and set the value of the
-   parameter 'index.parse.md' to 'metatag.description;metatag.keywords'.
+   parameter 'index.parse.md' to 'metatag.description,metatag.keywords'.
  </description>
  </property>
  }}}
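
The description text above refers to a companion setting for the index-metadata plugin. A minimal sketch of what that entry in conf/nutch-site.xml might look like, assuming index-metadata is enabled in plugin.includes as in the earlier step and that the value uses the comma-separated form quoted in the description (the comment line is illustrative and not taken from the wiki page):

{{{
<!-- Assumed companion entry; relevant only if plugin index-metadata is enabled. -->
<property>
<name>index.parse.md</name>
<value>metatag.description,metatag.keywords</value>
</property>
}}}

Each key carries the 'metatag.' prefix because, per the description, parse-metatags stores the extracted names under that prefix in the parse metadata.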