Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2015/02/24 23:49:20 UTC

[Nutch Wiki] Trivial Update of "bin/nutch parse" by LewisJohnMcgibbney

The "bin/nutch parse" page has been changed by LewisJohnMcgibbney:
https://wiki.apache.org/nutch/bin/nutch%20parse?action=diff&rev1=2&rev2=3

  Check Fetcher.java and FetcherOutput.java for further details.
  
  {{{
- Usage: bin/nutch parse <segmentdir>
+ Usage: bin/nutch parse <segment> [-noFilter] [-noNormalize]
+      <segment>    - path to segment you wish to parse
+      -noFilter    - optional flag to NOT filter URLs
+      -noNormalize - optional flag for NOT normalizing URLs
  }}}
  
  '''<segmentdir>''': This should be the path to the segment directory containing our data for parsing.
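
As a rough illustration of the updated usage line, the shell sketch below assembles a parse command and prints it rather than running it. The helper name build_parse_cmd and the segment path are hypothetical placeholders; only the two flags listed in the usage text are accepted, and no other Nutch options are assumed.

```shell
#!/bin/sh
# Sketch only: build a `bin/nutch parse` command line from the usage text.
# build_parse_cmd is a hypothetical helper, not part of Nutch.

build_parse_cmd() {
    if [ "$#" -lt 1 ]; then
        echo "Usage: build_parse_cmd <segment> [-noFilter] [-noNormalize]" >&2
        return 1
    fi
    segment="$1"
    shift
    cmd="bin/nutch parse $segment"
    for flag in "$@"; do
        case "$flag" in
            # Accept only the two optional flags named in the usage text.
            -noFilter|-noNormalize) cmd="$cmd $flag" ;;
            *) echo "unknown flag: $flag" >&2; return 1 ;;
        esac
    done
    echo "$cmd"
}

# Dry run with a hypothetical segment path: print the commands, do not execute.
build_parse_cmd crawl/segments/20150224234920
build_parse_cmd crawl/segments/20150224234920 -noFilter -noNormalize
```

To actually run the parse, the printed line would be executed from the Nutch installation directory against a real segment.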