Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2013/03/20 19:06:00 UTC
[Nutch Wiki] Update of "CommandLineOptions" by kiranchitturi

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "CommandLineOptions" page has been changed by kiranchitturi:
http://wiki.apache.org/nutch/CommandLineOptions?action=diff&rev1=46&rev2=47

  
  = Nutch Command Line Options of bin/nutch =
  
- Th following is a '''complete''' list of Nutch command line options. That is to say that some or all of the options may not be available in the particular version of Nutch you are using. For version-specific options please see the relevant check box. Once you know that such a command exists for your particular Nutch distribution, you can navigate to the relevant wiki entry for a detailed description of the tool. 
+ The following is a '''complete''' list of Nutch command line options. That is to say that some or all of the options may not be available in the particular version of Nutch you are using. For version-specific options please see the relevant check box. Once you know that such a command exists for your particular Nutch distribution, you can navigate to the relevant wiki entry for a detailed description of the tool. 
  
- The script bin/nutch is a helper which picks different java classes to "run".
+ The script bin/nutch is a helper which picks different java classes to "run". See also the new script bin/crawl [NUTCH-1087].
  
  '''Note''': Most commands print help when invoked w/o parameters.
  
@@ -14, +14 @@

  
  ||'''command'''||'''function'''||'''version'''||
  || || || '''1.x''' || '''2.x''' ||
- ||[[bin/nutch_crawl]]||One-step crawler for intranets|| X ||X||
+ ||[[bin/nutch crawl]]||One-step crawler for intranets|| X ||X||
- ||[[bin/nutch_readdb]]||Read / dump crawl db|| X ||X||
+ ||[[bin/nutch readdb]]||Read / dump crawl db|| X ||X||
  ||[[bin/nutch mergedb]]||Merge crawldb-s, with optional filtering|| X ||||
  ||[[bin/nutch readlinkdb]]||Read / dump link db|| X ||||
- ||[[bin/nutch_inject]]||Inject new urls into the database|| X ||X||
+ ||[[bin/nutch inject]]||Inject new urls into the database|| X ||X||
- ||[[bin/nutch_hostinject]]||Inject new urls into the hostdatabase||  ||X||
+ ||[[bin/nutch hostinject]]||Inject new urls into the hostdatabase||  ||X||
- ||[[bin/nutch_generate]]||Generate new segments to fetch from crawldb|| X ||X||
+ ||[[bin/nutch generate]]||Generate new segments to fetch from crawldb|| X ||X||
- ||[[bin/nutch_freegen]]||Generate new segments to fetch from text files|| X ||||
+ ||[[bin/nutch freegen]]||Generate new segments to fetch from text files|| X ||||
- ||[[bin/nutch_fetch]]||Fetch a segment's pages|| X ||X||
+ ||[[bin/nutch fetch]]||Fetch a segment's pages|| X ||X||
- ||[[bin/nutch_parse]]||Parse a segment's pages|| X ||X||
+ ||[[bin/nutch parse]]||Parse a segment's pages|| X ||X||
- ||[[bin/nutch_readseg]]||Read / dump segment data|| X ||||
+ ||[[bin/nutch readseg]]||Read / dump segment data|| X ||||
- ||[[bin/nutch_mergesegs]]||Merges multiple segments, with optional filtering and slicing|| X ||||
+ ||[[bin/nutch mergesegs]]||Merges multiple segments, with optional filtering and slicing|| X ||||
- ||[[bin/nutch_updatedb]]||Update crawldb (from segments if in 1.x) after fetching|| X ||X||
+ ||[[bin/nutch updatedb]]||Update crawldb (from segments if in 1.x) after fetching|| X ||X||
- ||[[bin/nutch_updatehostdb]]||Update hostdb after fetching|| ||X||
+ ||[[bin/nutch updatehostdb]]||Update hostdb after fetching|| ||X||
- ||[[bin/nutch_invertlinks]]||Create a linkdb from parsed segments|| X ||||
+ ||[[bin/nutch invertlinks]]||Create a linkdb from parsed segments|| X ||||
- ||[[bin/nutch_mergelinkdb]]||Merges linkdb-s, with optional filtering|| X ||||
+ ||[[bin/nutch mergelinkdb]]||Merges linkdb-s, with optional filtering|| X ||||
  ||[[bin/nutch elasticindex]]||Run the elastic search indexer on parsed batches|| ||X||
  ||[[bin/nutch solrindex]]||Run the solr indexer on parsed segments and linkdb|| X ||X||
  ||[[bin/nutch solrdedup]]||Removes duplicate documents from solr|| X ||X||