Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2014/11/19 22:36:20 UTC

[Nutch Wiki] Update of "PluginCentral" by JorgeLuis

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "PluginCentral" page has been changed by JorgeLuis:
https://wiki.apache.org/nutch/PluginCentral?action=diff&rev1=84&rev2=85

Comment:
Adding a couple of links to two more plugins

   * [[http://issues.apache.org/jira/browse/NUTCH-422|index-extra]] - Adds user-configurable fields to the index.
   * [[http://issues.apache.org/jira/browse/NUTCH-427|protocol-smb]] - Allows Nutch to crawl MS Windows shared folders.
   * [[IndexMetatags|Index HTML Metatags]]: parses HTML metatags and stores them in separate index fields
+  * [[https://github.com/jorgelbg/mimetype-filter|mimetype-filter]] - Allows Nutch to filter crawled documents before indexing.
+  * [[https://github.com/jorgelbg/links-extractor|links-extractor]] - Allows Nutch to index the inlinks and outlinks of any Web page.