Posted to commits@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2006/03/07 01:08:58 UTC

[Nutch Wiki] Trivial Update of "nutch-0.8-dev/bin/nutch dedup" by JeffRitchie

The following page has been changed by JeffRitchie:
http://wiki.apache.org/nutch/nutch-0%2e8-dev/bin/nutch_dedup

The comment on the change is:
new page added

New page:
= "dedup" is an alias for "org.apache.nutch.indexer.DeleteDuplicates" =

== Removes duplicate pages from a set of segment indexes. ==

=== Usage ===
 nutch-0.8-dev/bin/nutch org.apache.nutch.indexer.!DeleteDuplicates <indexes> ...

  '''<indexes>:''' Paths to one or more directories containing indexes.[[BR]]
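
As a rough illustration of the usage line above, the sketch below builds both forms of the command: the short "dedup" alias and the full class name. The install directory and the index path (crawl/indexes) are assumptions made for the example, not values taken from this page. The script only prints the commands, so it can be checked safely before anything is run.

```shell
#!/bin/sh
# Assumed locations; adjust both to your own installation and crawl layout.
NUTCH_HOME="nutch-0.8-dev"   # hypothetical install directory
INDEXES="crawl/indexes"      # hypothetical directory of segment indexes

# Short form, using the "dedup" alias.
ALIAS_CMD="$NUTCH_HOME/bin/nutch dedup $INDEXES"

# Long form, naming the class directly.
CLASS_CMD="$NUTCH_HOME/bin/nutch org.apache.nutch.indexer.DeleteDuplicates $INDEXES"

# Print the commands for review; run one of them once the paths are right.
echo "$ALIAS_CMD"
echo "$CLASS_CMD"
```

Both forms should be equivalent, since "dedup" is an alias for the class. Further index directories can be appended after the first one, separated by spaces.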

=== Configuration Files ===
 hadoop-default.xml[[BR]]
 hadoop-site.xml[[BR]]
 nutch-default.xml[[BR]]
 nutch-site.xml[[BR]]

=== Other Files ===
 None.

=== Caveats and Notes ===
 None.

DevelopmentCommandLineOptions