Posted to commits@tinkerpop.apache.org by sp...@apache.org on 2021/11/11 11:41:13 UTC

[tinkerpop] branch master updated: Added docs on hadoop setup for docs generation CTR

This is an automated email from the ASF dual-hosted git repository.

spmallette pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/tinkerpop.git


The following commit(s) were added to refs/heads/master by this push:
     new db34324  Added docs on hadoop setup for docs generation CTR
db34324 is described below

commit db3432471bd02ae7039c1eab85c0c38a2ad7405e
Author: Stephen Mallette <st...@amazon.com>
AuthorDate: Thu Nov 11 06:23:08 2021 -0500

    Added docs on hadoop setup for docs generation CTR
---
 .../dev/developer/development-environment.asciidoc | 72 ++++++++++++++++++----
 1 file changed, 60 insertions(+), 12 deletions(-)

diff --git a/docs/src/dev/developer/development-environment.asciidoc b/docs/src/dev/developer/development-environment.asciidoc
index 32cf040..4f3847b 100644
--- a/docs/src/dev/developer/development-environment.asciidoc
+++ b/docs/src/dev/developer/development-environment.asciidoc
@@ -123,18 +123,62 @@ referenced above:
 [source,xml]
 ----
 <configuration>
-    <property>
-        <name>yarn.nodemanager.aux-services</name>
-        <value>mapreduce_shuffle</value>
-    </property>
-    <property>
-        <name>yarn.nodemanager.vmem-check-enabled</name>
-        <value>false</value>
-    </property>
-    <property>
-        <name>yarn.nodemanager.vmem-pmem-ratio</name>
-        <value>4</value>
-    </property>
+  <property>
+    <name>yarn.nodemanager.aux-services</name>
+    <value>mapreduce_shuffle</value>
+  </property>
+  <property>
+    <name>yarn.nodemanager.vmem-check-enabled</name>
+    <value>false</value>
+  </property>
+  <property>
+    <name>yarn.nodemanager.vmem-pmem-ratio</name>
+    <value>4</value>
+  </property>
+</configuration>
+----
+
+The following configuration is recommended for the `/etc/hadoop/mapred-site.xml` file:
+
+[source,xml]
+----
+<configuration>
+  <property>
+    <name>mapreduce.framework.name</name>
+    <value>yarn</value>
+  </property>
+  <property>
+    <name>mapred.map.tasks</name>
+    <value>4</value>
+  </property>
+  <property>
+    <name>mapred.reduce.tasks</name>
+    <value>4</value>
+  </property>
+  <property>
+    <name>mapreduce.job.counters.limit</name>
+    <value>1000</value>
+  </property>
+  <property>
+    <name>mapreduce.jobtracker.address</name>
+    <value>localhost:9001</value>
+  </property>
+  <property>
+    <name>mapreduce.map.memory.mb</name>
+    <value>2048</value>
+  </property>
+  <property>
+    <name>mapreduce.reduce.memory.mb</name>
+    <value>4096</value>
+  </property>
+  <property>
+    <name>mapreduce.map.java.opts</name>
+    <value>-Xmx2048m</value>
+  </property>
+  <property>
+    <name>mapreduce.reduce.java.opts</name>
+    <value>-Xmx4096m</value>
+  </property>
 </configuration>
 ----
 
@@ -142,6 +186,9 @@ Also note that link:http://www.grymoire.com/Unix/Awk.html[awk] version `4.0.1` i
 The link:https://tinkerpop.apache.org/docs/current/recipes/#olap-spark-yarn[YARN recipe] also uses the `zip` program to
 create an archive so that needs to be installed, too, if you don't have it already.
 
+The Hadoop 3.3.x installation instructions call for installing `pdsh`, but doing so seems to cause permission
+problems when executing `sbin/start-dfs.sh`. Skipping that prerequisite seems to avoid the problem.
+
 Documentation can be generated locally with:
 
 [source,text]
@@ -268,6 +315,7 @@ If confronted with "Permission denied" errors on Linux, it may be necessary to d
 sudo groupadd docker
 sudo usermod -aG docker $USER
 newgrp docker
+sudo chmod 666 /var/run/docker.sock
 ----
 
 [[release-environment]]
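
The `mapred-site.xml` added in this commit sets a container size (`mapreduce.*.memory.mb`) and a JVM heap (`-Xmx` in `mapreduce.*.java.opts`) for both map and reduce tasks. The sketch below is not part of the commit. It is one possible way to read those pairs back out of such a file for a quick sanity check. The helper names (`load_properties`, `heap_mb`, `memory_report`) are hypothetical, and the XML is an inline copy of the relevant properties rather than a file read from `/etc/hadoop`.

```python
import re
import xml.etree.ElementTree as ET

# Inline copy of the memory-related properties from the diff above (assumption:
# a real check would read /etc/hadoop/mapred-site.xml instead).
MAPRED_SITE = """
<configuration>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx2048m</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx4096m</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style <property> elements."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}


def heap_mb(java_opts):
    """Extract the -Xmx heap size in MB from a java.opts string, or None if absent."""
    match = re.search(r"-Xmx(\d+)([mMgG])", java_opts)
    if match is None:
        return None
    size = int(match.group(1))
    return size * 1024 if match.group(2).lower() == "g" else size


def memory_report(props):
    """Return (task, container_mb, heap_mb) tuples for map and reduce tasks."""
    rows = []
    for task in ("map", "reduce"):
        container = int(props[f"mapreduce.{task}.memory.mb"])
        heap = heap_mb(props[f"mapreduce.{task}.java.opts"])
        rows.append((task, container, heap))
    return rows


if __name__ == "__main__":
    for task, container, heap in memory_report(load_properties(MAPRED_SITE)):
        print(f"{task}: container={container} MB, heap={heap} MB")
```

With the values from this commit, the heap equals the container size for both task types. Whether that leaves enough headroom for non-heap JVM memory on a given machine is something to verify locally.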