Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2010/06/24 14:16:57 UTC

[Nutch Wiki] Update of "AboutPlugins" by AlexMc

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "AboutPlugins" page has been changed by AlexMc.
The comment on this change is: removing references to lucene and adding in an extra link.
http://wiki.apache.org/nutch/AboutPlugins?action=diff&rev1=5&rev2=6

--------------------------------------------------

- Nutch's plugin system is based on the one used in Eclipse 2.x.  Plugins are central to how nutch works.  All of the parsing, indexing and searching that nutch does is actually accomplished by various plugins.
+ Nutch's plugin system is based on the one used in [[http://www.eclipse.org/articles/Article-Plug-in-architecture/plugin_architecture.html|Eclipse 2.x]].  Plugins are central to how nutch works.  All of the parsing, indexing and searching that nutch does is actually accomplished by various plugins.
  
- In writing a plugin, you're actually providing one or more ''extensions'' of the existing ''extension-points'' . The core Nutch ''extension-points'' are themselves defined in a plugin, the [[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/plugin/ExtensionPoint.html|NutchExtensionPoints]] plugin (they are listed in the !NutchExtensionPoints [[http://svn.apache.org/viewcvs.cgi/lucene/nutch/trunk/src/plugin/nutch-extensionpoints/plugin.xml?view=markup|plugin.xml]] file). Each ''extension-point'' defines an interface that must be implemented by the ''extension''. The core extension points are:
+ In writing a plugin, you're actually providing one or more ''extensions'' of the existing ''extension-points'' . The core Nutch ''extension-points'' are themselves defined in a plugin, the [[http://nutch.apache.org/apidocs/org/apache/nutch/plugin/ExtensionPoint.html|NutchExtensionPoints]] plugin (they are listed in the !NutchExtensionPoints [[http://svn.apache.org/viewcvs.cgi/nutch/trunk/src/plugin/nutch-extensionpoints/plugin.xml?view=markup|plugin.xml]] file). Each ''extension-point'' defines an interface that must be implemented by the ''extension''. The core extension points are:
  
-  * [[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/clustering/OnlineClusterer.html|OnlineClusterer]] -- An extension point interface for online search results clustering algorithms (from javadoc).
+  * [[http://nutch.apache.org/apidocs/org/apache/nutch/clustering/OnlineClusterer.html|OnlineClusterer]] -- An extension point interface for online search results clustering algorithms (from javadoc).
-  * [[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/indexer/IndexingFilter.html|IndexingFilter]] -- Permits one to add metadata to the indexed fields. All plugins found which implement this extension point are run sequentially on the parse (from javadoc).
+  * [[http://nutch.apache.org/apidocs/org/apache/nutch/indexer/IndexingFilter.html|IndexingFilter]] -- Permits one to add metadata to the indexed fields. All plugins found which implement this extension point are run sequentially on the parse (from javadoc).
-  * [[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/ontology/Ontology.html|Ontology]]
+  * [[http://nutch.apache.org/apidocs/org/apache/nutch/ontology/Ontology.html|Ontology]]
-  * [[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/parse/Parser.html|Parser]] -- Parser implementations read through fetched documents in order to extract data to be indexed.  This is what you need to implement if you want Nutch to be able to parse a new type of content, or extract more data from currently parseable content.
+  * [[http://nutch.apache.org/apidocs/org/apache/nutch/parse/Parser.html|Parser]] -- Parser implementations read through fetched documents in order to extract data to be indexed.  This is what you need to implement if you want Nutch to be able to parse a new type of content, or extract more data from currently parseable content.
-  * [[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/parse/HtmlParseFilter.html|HtmlParseFilter]] -- Permits one to add additional metadata to HTML parses (from javadoc).
+  * [[http://nutch.apache.org/apidocs/org/apache/nutch/parse/HtmlParseFilter.html|HtmlParseFilter]] -- Permits one to add additional metadata to HTML parses (from javadoc).
-  * [[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/protocol/Protocol.html|Protocol]] -- Protocol implementations allow nutch to use different protocols (ftp, http, etc.) to fetch documents.
+  * [[http://nutch.apache.org/apidocs/org/apache/nutch/protocol/Protocol.html|Protocol]] -- Protocol implementations allow nutch to use different protocols (ftp, http, etc.) to fetch documents.
-  * [[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/searcher/QueryFilter.html|QueryFilter]] -- Extension point for query translation. Permits one to add metadata to a query (from javadoc).
+  * [[http://nutch.apache.org/apidocs/org/apache/nutch/searcher/QueryFilter.html|QueryFilter]] -- Extension point for query translation. Permits one to add metadata to a query (from javadoc).
-  * [[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/net/URLFilter.html|URLFilter]] -- URLFilter implementations limit the URLs that nutch attempts to fetch.  The [[http://lucene.apache.org/nutch/apidocs/org/apache/nutch/net/RegexURLFilter.html|RegexURLFilter]] distributed with Nutch provides a great deal of control over what URLs Nutch crawls, however if you have very complicated rules about what URLs you want to crawl, you can write your own implementation.
+  * [[http://nutch.apache.org/apidocs/org/apache/nutch/net/URLFilter.html|URLFilter]] -- URLFilter implementations limit the URLs that nutch attempts to fetch.  The [[http://nutch.apache.org/apidocs/org/apache/nutch/net/RegexURLFilter.html|RegexURLFilter]] distributed with Nutch provides a great deal of control over what URLs Nutch crawls, however if you have very complicated rules about what URLs you want to crawl, you can write your own implementation.
-  * [[http://svn.apache.org/viewcvs.cgi/lucene/nutch/trunk/src/java/org/apache/nutch/analysis/NutchAnalyzer.java?view=markup|NutchAnalyzer]] -- An extension point that provides some language specific analyzers (see MultiLingualSupport proposal). ''Since it is in development stage, it is not in released javadoc''.
+  * [[http://svn.apache.org/viewcvs.cgi/nutch/trunk/src/java/org/apache/nutch/analysis/NutchAnalyzer.java?view=markup|NutchAnalyzer]] -- An extension point that provides some language specific analyzers (see MultiLingualSupport proposal). ''Since it is in development stage, it is not in released javadoc''.
+ 
+ 
+ (Note: as of this edit on 24 June 2010, the apidocs are not available on the website.)
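
To illustrate the idea that an ''extension'' implements the interface defined by an ''extension-point'', here is a minimal, self-contained sketch modelled on the URLFilter case described above. It does not use the real Nutch classes: `SimpleUrlFilter` is a hypothetical stand-in for the URLFilter interface, `HostSuffixUrlFilter` is an invented example extension, and the convention of returning the URL to accept it or `null` to reject it is an assumption made for this sketch. A real plugin would also need to be declared in a plugin.xml file, which is not shown here.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

/**
 * Hypothetical stand-in for the URLFilter extension point.
 * This is NOT the real Nutch interface; it only mirrors the idea
 * that an extension point is an interface an extension implements.
 */
interface SimpleUrlFilter {
    /** Returns the URL unchanged to accept it, or null to reject it (assumed convention). */
    String filter(String urlString);
}

/**
 * Invented example extension: accepts only URLs whose host is the
 * given domain or one of its subdomains.
 */
class HostSuffixUrlFilter implements SimpleUrlFilter {
    private final String domain;

    HostSuffixUrlFilter(String domain) {
        this.domain = domain.toLowerCase(Locale.ROOT);
    }

    @Override
    public String filter(String urlString) {
        try {
            String host = new URI(urlString).getHost();
            if (host == null) {
                return null; // no host component (e.g. a relative URL): reject
            }
            host = host.toLowerCase(Locale.ROOT);
            boolean accepted = host.equals(domain) || host.endsWith("." + domain);
            return accepted ? urlString : null;
        } catch (URISyntaxException e) {
            return null; // malformed URL: reject
        }
    }
}
```

With this sketch, `new HostSuffixUrlFilter("apache.org").filter("http://nutch.apache.org/")` returns the URL, while a URL on any other host returns `null`. Rules of this kind can often be expressed with the RegexURLFilter instead; a custom implementation is only worthwhile when the rules are too complicated for that.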
  
  == Source Files ==