Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2013/03/21 23:13:07 UTC

[Nutch Wiki] Update of "IndexMetatags" by kiranchitturi

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "IndexMetatags" page has been changed by kiranchitturi:
http://wiki.apache.org/nutch/IndexMetatags?action=diff&rev1=3&rev2=4

  = Nutch - Parse Metatags =
  '''Summary:''' When crawling HTML pages, it may be necessary to retrieve information stored in HTML meta tags. This tutorial shows how to install the plugin and configure Nutch to parse meta tags into separate fields in the Solr index. Because Nutch pushes the information to Solr, the tutorial also covers the changes required on the Solr side. This article relates to the `parse-metatags` plugin, provided in JIRA: https://issues.apache.org/jira/browse/NUTCH-809
  
+ 
+ {{{
+ The current version of the plugin in the 1.x series cannot parse multiValued metatags. Please check https://issues.apache.org/jira/browse/NUTCH-1467 for a patch.
+ 
+ This plugin is not included in the 2.x series. Please check https://issues.apache.org/jira/browse/NUTCH-1478 for a patch.
+ }}}
  == Plugin Information ==
  This plugin has been committed to the trunk in revision 1303371 and will be available in Nutch 1.5. It parses specified meta tags and relies on the `index-metadata` plugin.