Posted to commits@nutch.apache.org by sn...@apache.org on 2020/04/28 08:48:37 UTC

[nutch] branch master updated: NUTCH-2501 allow to set Java heap size when using crawl script in distributed mode - fix examples of `-D property=value` in bin/crawl : there must be a blank after `-D` because these arguments are first parsed by bin/crawl

This is an automated email from the ASF dual-hosted git repository.

snagel pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git


The following commit(s) were added to refs/heads/master by this push:
     new a455eb5  NUTCH-2501 allow to set Java heap size when using crawl script in distributed mode - fix examples of `-D property=value` in bin/crawl : there must be a blank after `-D` because these arguments are first parsed by bin/crawl
a455eb5 is described below

commit a455eb52ef560338d90a412bef9f6adfa45fc424
Author: Sebastian Nagel <sn...@apache.org>
AuthorDate: Tue Apr 28 10:46:12 2020 +0200

    NUTCH-2501 allow to set Java heap size when using crawl script in distributed mode
    - fix examples of `-D property=value` in bin/crawl : there must be a blank
      after `-D` because these arguments are first parsed by bin/crawl
---
 src/bin/crawl | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
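
Note on why the blank matters: the commit message says the `-D` arguments are first parsed by bin/crawl. The following is a minimal sketch of that effect, assuming the script matches `-D` as a standalone argument in a `case` statement and takes the property from the next argument. The function `parse_args` and its output format are hypothetical and are not taken from bin/crawl.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the actual bin/crawl code: a while/case loop
# that recognizes -D only when it is passed as a separate argument.
parse_args() {
  local props=() rest=()
  while [[ $# -gt 0 ]]; do
    case "$1" in
      -D)
        # The property is expected in the following argument.
        shift
        if [[ $# -gt 0 ]]; then
          props+=("$1")
          shift
        fi
        ;;
      *)
        # Anything else, including "-Dname=value", is not seen as a property here.
        rest+=("$1")
        shift
        ;;
    esac
  done
  echo "properties: ${props[*]}"
  echo "other args: ${rest[*]}"
}

# With a blank after -D, the property is picked up.
parse_args -D http.content.limit=2097152 urls/ crawl/ 2

# Without the blank, the whole token falls through to the other arguments.
parse_args -Dhttp.content.limit=2097152 urls/ crawl/ 2
```

In this sketch the joined form `-Dhttp.content.limit=2097152` never matches the `-D` pattern, which is the kind of mismatch the corrected usage examples avoid.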

diff --git a/src/bin/crawl b/src/bin/crawl
index 8690929..331ee65 100755
--- a/src/bin/crawl
+++ b/src/bin/crawl
@@ -26,7 +26,7 @@
 #   -D <propery>=<value>                  A Nutch or Hadoop property to pass to Nutch calls overwriting
 #                                         properties defined in configuration files, e.g.
 #                                           increase content limit to 2MB:
-#                                             -Dhttp.content.limit=2097152
+#                                             -D http.content.limit=2097152
 #                                         (in distributed mode) configure memory of map and reduce tasks:
 #                                           -D mapreduce.map.memory.mb=4608    -D mapreduce.map.java.opts=-Xmx4096m
 #                                           -D mapreduce.reduce.memory.mb=4608 -D mapreduce.reduce.java.opts=-Xmx4096m
@@ -83,10 +83,10 @@ function __print_usage {
   echo -e "  -D\t\t\t\t\tA Nutch or Hadoop property to pass to Nutch calls overwriting"
   echo -e "  \t\t\t\t\tproperties defined in configuration files, e.g."
   echo -e "  \t\t\t\t\tincrease content limit to 2MB:"
-  echo -e "  \t\t\t\t\t  -Dhttp.content.limit=2097152"
+  echo -e "  \t\t\t\t\t  -D http.content.limit=2097152"
   echo -e "  \t\t\t\t\t(distributed mode only) configure memory of map and reduce tasks:"
-  echo -e "  \t\t\t\t\t  -Dmapreduce.map.memory.mb=4608    -Dmapreduce.map.java.opts=-Xmx4096m"
-  echo -e "  \t\t\t\t\t  -Dmapreduce.reduce.memory.mb=4608 -Dmapreduce.reduce.java.opts=-Xmx4096m"
+  echo -e "  \t\t\t\t\t  -D mapreduce.map.memory.mb=4608    -D mapreduce.map.java.opts=-Xmx4096m"
+  echo -e "  \t\t\t\t\t  -D mapreduce.reduce.memory.mb=4608 -D mapreduce.reduce.java.opts=-Xmx4096m"
   echo -e "  -w|--wait <NUMBER[SUFFIX]>\t\tTime to wait before generating a new segment when no URLs"
   echo -e "  \t\t\t\t\tare scheduled for fetching. Suffix can be: s for second,"
   echo -e "  \t\t\t\t\tm for minute, h for hour and d for day. If no suffix is"