Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2011/07/03 06:07:01 UTC

[Nutch Wiki] Update of "bin/nutch plugin" by LewisJohnMcgibbney

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "bin/nutch plugin" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/bin/nutch%20plugin

Comment:
Update to reflect Nutch 1.3 API

New page:
Plugin is an alias for org.apache.nutch.plugin.PluginRepository

This command can be used to load a plugin from the repository and execute the main() method of one of its classes. The plugin repository is a registry of all plugins. At system boot-up the plugin repository is built by parsing the manifest files of all plugins. Plugins that require other plugins which do not exist are not registered. For each plugin a plugin descriptor instance is created. The descriptor represents all meta information about a plugin, so a plugin instance can be created later, when it is required. This allows so-called ''lazy'' plugin loading.
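
The descriptor-based lazy loading described above can be illustrated with a minimal, self-contained sketch. All class and method names here (LazyPluginSketch, Descriptor, Repository, "example-plugin") are hypothetical and chosen for illustration only; they are not Nutch's actual PluginRepository API. The point is simply that registering a descriptor is cheap and the plugin instance is only created on first request.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical names throughout; this is not Nutch's actual PluginRepository API.
class LazyPluginSketch {

    /** Holds a plugin's metadata and creates the instance only on first use. */
    static final class Descriptor {
        final String pluginId;
        private final Supplier<Object> factory;
        private Object instance; // stays null until the plugin is first requested

        Descriptor(String pluginId, Supplier<Object> factory) {
            this.pluginId = pluginId;
            this.factory = factory;
        }

        /** Creates the plugin instance on the first call and reuses it afterwards. */
        synchronized Object getInstance() {
            if (instance == null) {
                instance = factory.get();
            }
            return instance;
        }

        synchronized boolean isLoaded() {
            return instance != null;
        }
    }

    /** Registry of descriptors keyed by plugin id. */
    static final class Repository {
        private final Map<String, Descriptor> descriptors = new LinkedHashMap<>();

        void register(Descriptor d) {
            descriptors.put(d.pluginId, d);
        }

        Descriptor get(String pluginId) {
            return descriptors.get(pluginId);
        }
    }

    public static void main(String[] args) {
        Repository repo = new Repository();
        // Registering the descriptor does not create the plugin instance.
        repo.register(new Descriptor("example-plugin", () -> new StringBuilder("plugin instance")));

        Descriptor d = repo.get("example-plugin");
        System.out.println("loaded before use: " + d.isLoaded());
        d.getInstance(); // first request triggers instantiation
        System.out.println("loaded after use: " + d.isLoaded());
    }
}
```

In this sketch the repository only ever stores descriptors, so the cost of building it at start-up stays small no matter how expensive an individual plugin is to construct.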

When loading plugins and building them into a working Nutch distribution, we need to be aware of various configuration files:
 * hadoop-default.xml
 * hadoop-site.xml
 * nutch-default.xml
 * nutch-site.xml
/!\ This needs to be clearer. It would help if the property values within nutch-site.xml were described in detail. /!\

Usage:

{{{
bin/nutch org.apache.nutch.plugin.PluginRepository <pluginId> <className> [args ...]
}}}

'''<pluginId>''': The id of the plugin you wish to execute. e.g. the COMMAND

'''<className>''': The class with the main() method you wish to execute.

'''[args ...]''': Zero or more arguments to pass to the plugin. These are sparsely documented because the arguments are specific to, and dependent on, each plugin.
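
To make the argument layout above concrete, here is a rough sketch of how a launcher could split `<pluginId> <className> [args ...]` and call the named class's main() by reflection. This is an assumption about the general technique, not a copy of Nutch's code: PluginMainSketch, EchoPlugin and "example-plugin" are hypothetical, and where the real command would presumably look the class up through the plugin identified by `<pluginId>`, this sketch simply uses the current class loader.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// A sketch only: resolving <pluginId> to a plugin-specific class loader is
// assumed to happen in the real command and is skipped here.
class PluginMainSketch {

    /** Hypothetical stand-in for a plugin class that has a main() method. */
    static class EchoPlugin {
        static String[] lastArgs;

        public static void main(String[] args) {
            lastArgs = args;
            System.out.println("EchoPlugin received " + Arrays.toString(args));
        }
    }

    /** Splits "<pluginId> <className> [args ...]" and invokes className.main(args). */
    static void run(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException("Usage: <pluginId> <className> [args ...]");
        }
        String pluginId = args[0];  // not used further in this sketch
        String className = args[1];
        // Everything after the first two entries is handed to the plugin unchanged.
        String[] pluginArgs = Arrays.copyOfRange(args, 2, args.length);

        try {
            Class<?> clazz = Class.forName(className);
            Method main = clazz.getMethod("main", String[].class);
            // The cast passes the array as one argument instead of spreading it as varargs.
            main.invoke(null, (Object) pluginArgs);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "Could not run " + className + " for plugin " + pluginId, e);
        }
    }

    public static void main(String[] args) {
        run(new String[] {"example-plugin", EchoPlugin.class.getName(), "a", "b"});
    }
}
```

Because the remaining arguments are forwarded untouched, their meaning is decided entirely by the target class, which is why they can only be documented per plugin.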


CommandLineOptions