Posted to commits@impala.apache.org by tm...@apache.org on 2019/02/13 03:38:57 UTC
[impala] 01/04: IMPALA-7214: [DOCS] More on decoupling impala and DataNodes
This is an automated email from the ASF dual-hosted git repository.
tmarshall pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git
commit 5b32a0d60110be7c21184819c2dffbb7cbff750f
Author: Alex Rodoni <ar...@cloudera.com>
AuthorDate: Tue Feb 12 12:40:42 2019 -0800
IMPALA-7214: [DOCS] More on decoupling impala and DataNodes
Change-Id: I4b6f1c704c1e328af9f0beec73f8b6b61fba992e
Reviewed-on: http://gerrit.cloudera.org:8080/12457
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
---
docs/topics/impala_processes.xml | 10 +++------
docs/topics/impala_troubleshooting.xml | 39 +++++++++++++++++-----------------
2 files changed, 23 insertions(+), 26 deletions(-)
diff --git a/docs/topics/impala_processes.xml b/docs/topics/impala_processes.xml
index 71986d3..70366dd 100644
--- a/docs/topics/impala_processes.xml
+++ b/docs/topics/impala_processes.xml
@@ -55,10 +55,7 @@ under the License.
Start one instance of the Impala catalog service.
</li>
- <li>
- Start the main Impala service on one or more DataNodes, ideally on all DataNodes to maximize local
- processing and avoid network traffic due to remote reads.
- </li>
+ <li> Start the main Impala daemon services. </li>
</ol>
<p>
@@ -101,9 +98,8 @@ under the License.
<codeblock rev="1.2">$ sudo service impala-catalog start</codeblock>
- <p>
- Start the Impala service on each DataNode using a command similar to the following:
- </p>
+ <p> Start the Impala daemon services using a command similar to the
+ following: </p>
<p>
<codeblock>$ sudo service impala-server start</codeblock>
diff --git a/docs/topics/impala_troubleshooting.xml b/docs/topics/impala_troubleshooting.xml
index 250c899..80b7363 100644
--- a/docs/topics/impala_troubleshooting.xml
+++ b/docs/topics/impala_troubleshooting.xml
@@ -123,17 +123,17 @@ terminate called after throwing an instance of 'boost::exception_detail::clone_i
<concept id="trouble_io" rev="">
<title>Troubleshooting I/O Capacity Problems</title>
<conbody>
- <p>
- Impala queries are typically I/O-intensive. If there is an I/O problem with storage devices,
- or with HDFS itself, Impala queries could show slow response times with no obvious cause
- on the Impala side. Slow I/O on even a single DataNode could result in an overall slowdown, because
- queries involving clauses such as <codeph>ORDER BY</codeph>, <codeph>GROUP BY</codeph>, or <codeph>JOIN</codeph>
- do not start returning results until all DataNodes have finished their work.
- </p>
- <p>
- To test whether the Linux I/O system itself is performing as expected, run Linux commands like
- the following on each DataNode:
- </p>
+ <p> Impala queries are typically I/O-intensive. If there is an I/O problem
+ with storage devices, or with HDFS itself, Impala queries could show
+ slow response times with no obvious cause on the Impala side. Slow I/O
+ on even a single Impala daemon could result in an overall slowdown,
+ because queries involving clauses such as <codeph>ORDER BY</codeph>,
+ <codeph>GROUP BY</codeph>, or <codeph>JOIN</codeph> do not start
+ returning results until all executor Impala daemons have finished their
+ work. </p>
+ <p> To test whether the Linux I/O system itself is performing as expected,
+ run Linux commands like the following on each host Impala daemon is
+ running: </p>
<codeblock>
$ sudo sysctl -w vm.drop_caches=3 vm.drop_caches=0
vm.drop_caches = 3
@@ -265,14 +265,15 @@ $ sudo dd if=/dev/sdd bs=1M of=/dev/null count=1k
</p>
<p>
- <note>
- Replace <varname>hostname</varname> and <varname>port</varname> with the hostname and port of
- your Impala state store host machine and web server port. The default port is 25010.
- </note>
- The number of <codeph>impalad</codeph> instances listed should match the expected number of
- <codeph>impalad</codeph> instances installed in the cluster. There should also be one
- <codeph>impalad</codeph> instance installed on each DataNode
- </p>
+ <note> Replace <varname>hostname</varname> and
+ <varname>port</varname> with the hostname and port of your
+ Impala state store host machine and web server port. The
+ default port is 25010. </note> The number of
+ <codeph>impalad</codeph> instances listed should match the
+ expected number of <codeph>impalad</codeph> instances
+ installed in the cluster. There should also be one
+ <codeph>impalad</codeph> instance installed on each
+ DataNode.</p>
</entry>
<entry>
<p>