Posted to commits@nutch.apache.org by sn...@apache.org on 2020/04/28 08:48:37 UTC
[nutch] branch master updated: NUTCH-2501 allow to set Java heap size when using crawl script in distributed mode - fix examples of `-D property=value` in bin/crawl : there must be a blank after `-D` because these arguments are first parsed by bin/crawl
This is an automated email from the ASF dual-hosted git repository.
snagel pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git
The following commit(s) were added to refs/heads/master by this push:
new a455eb5 NUTCH-2501 allow to set Java heap size when using crawl script in distributed mode - fix examples of `-D property=value` in bin/crawl : there must be a blank after `-D` because these arguments are first parsed by bin/crawl
a455eb5 is described below
commit a455eb52ef560338d90a412bef9f6adfa45fc424
Author: Sebastian Nagel <sn...@apache.org>
AuthorDate: Tue Apr 28 10:46:12 2020 +0200
NUTCH-2501 allow to set Java heap size when using crawl script in distributed mode
- fix examples of `-D property=value` in bin/crawl : there must be a blank
after `-D` because these arguments are first parsed by bin/crawl
---
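Note: the reason a blank is required after `-D` can be illustrated with a minimal sketch of a `case`-based option loop. This is not the actual bin/crawl code; the function `parse_args`, the variables `props` and `positional`, and the sample arguments are assumptions made up for illustration. It only shows the general effect: a pattern that matches the word `-D` exactly does not match `-Dname=value`, so the fused form falls through to the catch-all branch.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real bin/crawl parser: an option loop in
# which "-D" is only recognized when it is a word of its own.

parse_args() {
  props=""       # property=value pairs seen after a standalone -D
  positional=""  # every word the loop did not recognize as an option
  while [ $# -gt 0 ]; do
    case "$1" in
      -D)
        # "-D" matched exactly, so the property must be the next word
        if [ $# -lt 2 ]; then
          echo "missing property after -D" >&2
          return 1
        fi
        props="$props $2"
        shift 2
        ;;
      *)
        # "-Dname=value" (no blank) is not equal to "-D" and ends up here
        positional="$positional $1"
        shift
        ;;
    esac
  done
}

# Sample call with one separated and one fused property (illustrative values)
parse_args -D http.content.limit=2097152 -Dhttp.timeout=30000 crawl/ 2
echo "properties:$props"
echo "positional:$positional"
```

In this sketch only `http.content.limit=2097152` is collected as a property, while `-Dhttp.timeout=30000` is treated like an ordinary positional argument. A wrapper script that parses its own options in this style would therefore mishandle the fused form even though Hadoop's own option parsing accepts it.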
src/bin/crawl | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/crawl b/src/bin/crawl
index 8690929..331ee65 100755
--- a/src/bin/crawl
+++ b/src/bin/crawl
@@ -26,7 +26,7 @@
# -D <propery>=<value> A Nutch or Hadoop property to pass to Nutch calls overwriting
# properties defined in configuration files, e.g.
# increase content limit to 2MB:
-# -Dhttp.content.limit=2097152
+# -D http.content.limit=2097152
# (in distributed mode) configure memory of map and reduce tasks:
# -D mapreduce.map.memory.mb=4608 -D mapreduce.map.java.opts=-Xmx4096m
# -D mapreduce.reduce.memory.mb=4608 -D mapreduce.reduce.java.opts=-Xmx4096m
@@ -83,10 +83,10 @@ function __print_usage {
echo -e " -D\t\t\t\t\tA Nutch or Hadoop property to pass to Nutch calls overwriting"
echo -e " \t\t\t\t\tproperties defined in configuration files, e.g."
echo -e " \t\t\t\t\tincrease content limit to 2MB:"
- echo -e " \t\t\t\t\t -Dhttp.content.limit=2097152"
+ echo -e " \t\t\t\t\t -D http.content.limit=2097152"
echo -e " \t\t\t\t\t(distributed mode only) configure memory of map and reduce tasks:"
- echo -e " \t\t\t\t\t -Dmapreduce.map.memory.mb=4608 -Dmapreduce.map.java.opts=-Xmx4096m"
- echo -e " \t\t\t\t\t -Dmapreduce.reduce.memory.mb=4608 -Dmapreduce.reduce.java.opts=-Xmx4096m"
+ echo -e " \t\t\t\t\t -D mapreduce.map.memory.mb=4608 -D mapreduce.map.java.opts=-Xmx4096m"
+ echo -e " \t\t\t\t\t -D mapreduce.reduce.memory.mb=4608 -D mapreduce.reduce.java.opts=-Xmx4096m"
echo -e " -w|--wait <NUMBER[SUFFIX]>\t\tTime to wait before generating a new segment when no URLs"
echo -e " \t\t\t\t\tare scheduled for fetching. Suffix can be: s for second,"
echo -e " \t\t\t\t\tm for minute, h for hour and d for day. If no suffix is"