Posted to commits@impala.apache.org by ta...@apache.org on 2018/04/11 00:06:46 UTC

[1/4] impala git commit: Remove Yarn from minicluster by default.

Repository: impala
Updated Branches:
  refs/heads/master 4c1538ab1 -> 830e3346f


Remove Yarn from minicluster by default.

Turns out that we start Yarn as part of the minicluster, but we never use it.
(HiveServer2 is configured to run MR jobs "locally" in process.) Likely, this
Yarn integration is a vestige of Yarn/Llama integration.  We can save memory by
not starting it by default.

There are some less-common tools like tests/comparison/cluster.py which use
Yarn (and Hadoop Streaming). In deference to those tools, I've left a mechanism
to start Yarn rather than excising it altogether. After running
buildall the regular way, add Yarn to the cluster by running:
  testdata/cluster/admin -y start_cluster

I tested by running core tests. I did not test the kerberized minicluster.

Change-Id: I5504cc40b89e3c6d53fac0b7aa4b395fa63e8d79


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/942781d8
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/942781d8
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/942781d8

Branch: refs/heads/master
Commit: 942781d80f243d36c557edb67ad828e8be5ff0d5
Parents: 4c1538a
Author: Philip Zeyliger <ph...@cloudera.com>
Authored: Mon Apr 9 15:10:43 2018 -0700
Committer: Philip Zeyliger <ph...@cloudera.com>
Committed: Tue Apr 10 09:17:28 2018 -0700

----------------------------------------------------------------------
 testdata/cluster/admin | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/942781d8/testdata/cluster/admin
----------------------------------------------------------------------
diff --git a/testdata/cluster/admin b/testdata/cluster/admin
index cb33e21..74b5a9c 100755
--- a/testdata/cluster/admin
+++ b/testdata/cluster/admin
@@ -18,7 +18,7 @@
 # specific language governing permissions and limitations
 # under the License.
 
-# This will create/control/destroy a local hdfs+yarn cluster.
+# This will create/control/destroy a local hdfs/yarn/kms/kudu cluster.
 #
 # All roles run on 127.0.0.1, just like the standard mini cluster included with hadoop.
 # The difference is with this cluster, each role runs in its own process and has its own
@@ -31,12 +31,14 @@ set -euo pipefail
 trap 'echo Error in $0 at line $LINENO: $(awk "NR == $LINENO" $0)' ERR
 
 : ${IMPALA_KERBERIZE=}
+: ${INCLUDE_YARN=}
 
-while getopts vk OPT; do
+while getopts vky OPT; do
   case $OPT in
     v) set -x;;
     k) export IMPALA_KERBERIZE=1;;
-    ?) echo "Usage: $0 [-v (verbose) -k (kerberize)] ACTION (see source...)"; exit 1;;
+    y) export INCLUDE_YARN=1;;
+    ?) echo "Usage: $0 [-v (verbose) -k (kerberize) -y (yarn)] ACTION (see source...)"; exit 1;;
   esac
 done
 shift $(($OPTIND-1))
@@ -54,7 +56,10 @@ export KILL_CLUSTER_MARKER=IBelongToTheMiniCluster
 
 if [[ "$TARGET_FILESYSTEM" == "hdfs" ]]; then
   # The check above indicates that the regular mini-cluster is in use.
-  SUPPORTED_SERVICES=(hdfs yarn kms)
+  SUPPORTED_SERVICES=(hdfs kms)
+  if [ -n "${INCLUDE_YARN}" ]; then
+    SUPPORTED_SERVICES+=(yarn)
+  fi
 else
   # Either a remote distributed file system or a local non-distributed file system is
   # in use. Currently the only service that is expected to work is Kudu, though in theory


[4/4] impala git commit: IMPALA-6805: Show current database in Impala shell prompt

Posted by ta...@apache.org.
IMPALA-6805: Show current database in Impala shell prompt

Prompt format:
[host:port] db_name>
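
As a rough illustration of the prompt format above, here is a minimal standalone sketch. Only the PROMPT_FORMAT string and the backtick/whitespace stripping mirror the diff in this commit; the helper names normalize_db and build_prompt and the localhost:21000 values are assumptions made for the example, not part of impala_shell.py.

```python
# Standalone sketch; not the actual ImpalaShell class.
# PROMPT_FORMAT matches the constant added in the diff.
PROMPT_FORMAT = "[{host}:{port}] {db}> "


def normalize_db(name):
    """Hypothetical helper: drop surrounding backticks, then whitespace.

    Mirrors the args.strip('`').strip() expression used in the diff.
    """
    return name.strip('`').strip()


def build_prompt(host, port, db):
    """Hypothetical helper: render the prompt for a connected shell."""
    return PROMPT_FORMAT.format(host=host, port=port, db=normalize_db(db))


if __name__ == "__main__":
    print(build_prompt("localhost", 21000, "default"))
    print(build_prompt("localhost", 21000, "` tpch `"))
```

Under these assumptions, a database name passed as ` tpch ` (with backticks and padding) is shown in the prompt as plain tpch, which is the behavior the new shell tests check for.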

Testing:
- Added new shell tests
- Ran end-to-end shell tests

Change-Id: Ifb0ae58507321e426e5f0f16518671420974a3fc
Reviewed-on: http://gerrit.cloudera.org:8080/9927
Reviewed-by: Fredy Wijaya <fw...@cloudera.com>
Reviewed-by: Michael Brown <mi...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/830e3346
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/830e3346
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/830e3346

Branch: refs/heads/master
Commit: 830e3346f186aebc879e4ef2927e08db97143100
Parents: 6dc13d9
Author: Fredy wijaya <fw...@cloudera.com>
Authored: Wed Apr 4 13:05:31 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Tue Apr 10 20:52:48 2018 +0000

----------------------------------------------------------------------
 shell/impala_shell.py                 | 15 +++++++++--
 tests/shell/test_shell_interactive.py | 42 ++++++++++++++++++++++++++----
 2 files changed, 50 insertions(+), 7 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/830e3346/shell/impala_shell.py
----------------------------------------------------------------------
diff --git a/shell/impala_shell.py b/shell/impala_shell.py
index f13899d..93bdafb 100755
--- a/shell/impala_shell.py
+++ b/shell/impala_shell.py
@@ -105,6 +105,7 @@ class ImpalaShell(object, cmd.Cmd):
 
   # If not connected to an impalad, the server version is unknown.
   UNKNOWN_SERVER_VERSION = "Not Connected"
+  PROMPT_FORMAT = "[{host}:{port}] {db}> "
   DISCONNECTED_PROMPT = "[Not connected] > "
   UNKNOWN_WEBSERVER = "0.0.0.0"
   # Message to display in shell when cancelling a query
@@ -774,7 +775,8 @@ class ImpalaShell(object, cmd.Cmd):
     if self.imp_client.connected:
       self._print_if_verbose('Connected to %s:%s' % self.impalad)
       self._print_if_verbose('Server version: %s' % self.server_version)
-      self.prompt = "[%s:%s] > " % self.impalad
+      self.prompt = ImpalaShell.PROMPT_FORMAT.format(
+        host=self.impalad[0], port=self.impalad[1], db=ImpalaShell.DEFAULT_DB)
       self._validate_database()
     try:
       self.imp_client.build_default_query_options_dict()
@@ -1141,7 +1143,16 @@ class ImpalaShell(object, cmd.Cmd):
     """Executes a USE... query"""
     query = self._create_beeswax_query(args)
     if self._execute_stmt(query) is CmdStatus.SUCCESS:
-      self.current_db = args
+      self.current_db = args.strip('`').strip()
+      self.prompt = ImpalaShell.PROMPT_FORMAT.format(host=self.impalad[0],
+                                                     port=self.impalad[1],
+                                                     db=self.current_db)
+    elif args.strip('`') == self.current_db:
+      # args == current_db means -d option was passed but the "use [db]" operation failed.
+      # We need to set the current_db to None so that it does not show a database, which
+      # may not exist.
+      self.current_db = None
+      return CmdStatus.ERROR
     else:
       return CmdStatus.ERROR
 

http://git-wip-us.apache.org/repos/asf/impala/blob/830e3346/tests/shell/test_shell_interactive.py
----------------------------------------------------------------------
diff --git a/tests/shell/test_shell_interactive.py b/tests/shell/test_shell_interactive.py
index e9049fa..4065f9a 100755
--- a/tests/shell/test_shell_interactive.py
+++ b/tests/shell/test_shell_interactive.py
@@ -58,11 +58,11 @@ class TestImpalaShellInteractive(object):
   def teardown_class(cls):
     restore_shell_history(cls.tempfile_name)
 
-  def _expect_with_cmd(self, proc, cmd, expectations=()):
+  def _expect_with_cmd(self, proc, cmd, expectations=(), db="default"):
     """Executes a command on the expect process instance and verifies a set of
     assertions defined by the expections."""
     proc.sendline(cmd + ";")
-    proc.expect(":21000] >")
+    proc.expect(":21000] {db}>".format(db=db))
     if not expectations: return
     for e in expectations:
       assert e in proc.before
@@ -71,7 +71,7 @@ class TestImpalaShellInteractive(object):
   def test_local_shell_options(self):
     """Test that setting the local shell options works"""
     proc = pexpect.spawn(SHELL_CMD)
-    proc.expect(":21000] >")
+    proc.expect(":21000] default>")
     self._expect_with_cmd(proc, "set", ("LIVE_PROGRESS: False", "LIVE_SUMMARY: False"))
     self._expect_with_cmd(proc, "set live_progress=true")
     self._expect_with_cmd(proc, "set", ("LIVE_PROGRESS: True", "LIVE_SUMMARY: False"))
@@ -179,11 +179,14 @@ class TestImpalaShellInteractive(object):
     assert get_num_open_sessions(initial_impala_service) == num_sessions_initial + 1, \
         "Not connected to %s:21000" % hostname
     p.send_cmd("connect %s:21001" % hostname)
+
     # Wait for a little while
     sleep(2)
     # The number of sessions on the target impalad should have been incremented.
     assert get_num_open_sessions(target_impala_service) == num_sessions_target + 1, \
         "Not connected to %s:21001" % hostname
+    assert "[%s:21001] default>" % hostname in p.get_result().stdout
+
     # The number of sessions on the initial impalad should have been decremented.
     assert get_num_open_sessions(initial_impala_service) == num_sessions_initial, \
         "Connection to %s:21000 should have been closed" % hostname
@@ -260,7 +263,7 @@ class TestImpalaShellInteractive(object):
       os.remove(SHELL_HISTORY_FILE)
     assert not os.path.exists(SHELL_HISTORY_FILE)
     child_proc = pexpect.spawn(SHELL_CMD)
-    child_proc.expect(":21000] >")
+    child_proc.expect(":21000] default>")
     self._expect_with_cmd(child_proc, "@1", ("Command index out of range"))
     self._expect_with_cmd(child_proc, "rerun -1", ("Command index out of range"))
     self._expect_with_cmd(child_proc, "select 'first_command'", ("first_command"))
@@ -268,7 +271,7 @@ class TestImpalaShellInteractive(object):
     self._expect_with_cmd(child_proc, "@ -1", ("first_command"))
     self._expect_with_cmd(child_proc, "select 'second_command'", ("second_command"))
     child_proc.sendline('history;')
-    child_proc.expect(":21000] >")
+    child_proc.expect(":21000] default>")
     assert '[1]: select \'first_command\';' in child_proc.before;
     assert '[2]: select \'second_command\';' in child_proc.before;
     assert '[3]: history;' in child_proc.before;
@@ -498,6 +501,35 @@ class TestImpalaShellInteractive(object):
     result = run_impala_shell_interactive(query)
     assert '| id   |' in result.stdout
 
+  @pytest.mark.execute_serially
+  def test_shell_prompt(self):
+    proc = pexpect.spawn(SHELL_CMD)
+    proc.expect(":21000] default>")
+    self._expect_with_cmd(proc, "use foo", (), 'default')
+    self._expect_with_cmd(proc, "use functional", (), 'functional')
+    self._expect_with_cmd(proc, "use foo", (), 'functional')
+    self._expect_with_cmd(proc, 'use `tpch`', (), 'tpch')
+    self._expect_with_cmd(proc, 'use ` tpch `', (), 'tpch')
+
+    proc = pexpect.spawn(SHELL_CMD, ['-d', 'functional'])
+    proc.expect(":21000] functional>")
+    self._expect_with_cmd(proc, "use foo", (), 'functional')
+    self._expect_with_cmd(proc, "use tpch", (), 'tpch')
+    self._expect_with_cmd(proc, "use foo", (), 'tpch')
+
+    proc = pexpect.spawn(SHELL_CMD, ['-d', ' functional '])
+    proc.expect(":21000] functional>")
+
+    proc = pexpect.spawn(SHELL_CMD, ['-d', '` functional `'])
+    proc.expect(":21000] functional>")
+
+    # Start an Impala shell with an invalid DB.
+    proc = pexpect.spawn(SHELL_CMD, ['-d', 'foo'])
+    proc.expect(":21000] default>")
+    self._expect_with_cmd(proc, "use foo", (), 'default')
+    self._expect_with_cmd(proc, "use functional", (), 'functional')
+    self._expect_with_cmd(proc, "use foo", (), 'functional')
+
 def run_impala_shell_interactive(input_lines, shell_args=None):
   """Runs a command in the Impala shell interactively."""
   # if argument "input_lines" is a string, makes it into a list


[2/4] impala git commit: Revert "Remove Yarn from minicluster by default."

Posted by ta...@apache.org.
Revert "Remove Yarn from minicluster by default."

This reverts commit c05df104570fa2cb7067599bbe3b87740ca9f09e.

Change-Id: I00151795581d22a9852cceaca1d21325d68dbe59
Reviewed-on: http://gerrit.cloudera.org:8080/9969
Reviewed-by: Philip Zeyliger <ph...@cloudera.com>
Tested-by: Philip Zeyliger <ph...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/01b6995a
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/01b6995a
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/01b6995a

Branch: refs/heads/master
Commit: 01b6995abf54954e824c6b2e06f6dc4007e28713
Parents: 942781d
Author: Philip Zeyliger <ph...@cloudera.com>
Authored: Tue Apr 10 16:19:52 2018 +0000
Committer: Philip Zeyliger <ph...@cloudera.com>
Committed: Tue Apr 10 16:21:09 2018 +0000

----------------------------------------------------------------------
 testdata/cluster/admin | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/01b6995a/testdata/cluster/admin
----------------------------------------------------------------------
diff --git a/testdata/cluster/admin b/testdata/cluster/admin
index 74b5a9c..cb33e21 100755
--- a/testdata/cluster/admin
+++ b/testdata/cluster/admin
@@ -18,7 +18,7 @@
 # specific language governing permissions and limitations
 # under the License.
 
-# This will create/control/destroy a local hdfs/yarn/kms/kudu cluster.
+# This will create/control/destroy a local hdfs+yarn cluster.
 #
 # All roles run on 127.0.0.1, just like the standard mini cluster included with hadoop.
 # The difference is with this cluster, each role runs in its own process and has its own
@@ -31,14 +31,12 @@ set -euo pipefail
 trap 'echo Error in $0 at line $LINENO: $(awk "NR == $LINENO" $0)' ERR
 
 : ${IMPALA_KERBERIZE=}
-: ${INCLUDE_YARN=}
 
-while getopts vky OPT; do
+while getopts vk OPT; do
   case $OPT in
     v) set -x;;
     k) export IMPALA_KERBERIZE=1;;
-    y) export INCLUDE_YARN=1;;
-    ?) echo "Usage: $0 [-v (verbose) -k (kerberize) -y (yarn)] ACTION (see source...)"; exit 1;;
+    ?) echo "Usage: $0 [-v (verbose) -k (kerberize)] ACTION (see source...)"; exit 1;;
   esac
 done
 shift $(($OPTIND-1))
@@ -56,10 +54,7 @@ export KILL_CLUSTER_MARKER=IBelongToTheMiniCluster
 
 if [[ "$TARGET_FILESYSTEM" == "hdfs" ]]; then
   # The check above indicates that the regular mini-cluster is in use.
-  SUPPORTED_SERVICES=(hdfs kms)
-  if [ -n "${INCLUDE_YARN}" ]; then
-    SUPPORTED_SERVICES+=(yarn)
-  fi
+  SUPPORTED_SERVICES=(hdfs yarn kms)
 else
   # Either a remote distributed file system or a local non-distributed file system is
   # in use. Currently the only service that is expected to work is Kudu, though in theory


[3/4] impala git commit: Remove Yarn from minicluster by default. (2nd try)

Posted by ta...@apache.org.
Remove Yarn from minicluster by default. (2nd try)

Remove Yarn from minicluster by default.

Turns out that we start Yarn as part of the minicluster, but we never use it.
(HiveServer2 is configured to run MR jobs "locally" in process.) Likely, this
Yarn integration is a vestige of Yarn/Llama integration.  We can save memory by
not starting it by default.

There are some less-common tools like tests/comparison/cluster.py which use
Yarn (and Hadoop Streaming). In deference to those tools, I've left a mechanism
to start Yarn rather than excising it altogether. After running
buildall the regular way, add Yarn to the cluster by running:
  testdata/cluster/admin -y start_cluster

I tested by running core tests. I did not test the kerberized minicluster.

[Due to a git mishap, a version of this was previously checked in and reverted.]

Change-Id: I97053a44bbe32048e6c35cc28680d1c7696af13f
Reviewed-on: http://gerrit.cloudera.org:8080/9970
Reviewed-by: Michael Brown <mi...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/6dc13d93
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/6dc13d93
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/6dc13d93

Branch: refs/heads/master
Commit: 6dc13d933b5ea9a41e584d83e95db72b9e8e19b3
Parents: 01b6995
Author: Philip Zeyliger <ph...@cloudera.com>
Authored: Tue Apr 10 09:21:20 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Tue Apr 10 20:15:30 2018 +0000

----------------------------------------------------------------------
 testdata/cluster/admin | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/6dc13d93/testdata/cluster/admin
----------------------------------------------------------------------
diff --git a/testdata/cluster/admin b/testdata/cluster/admin
index cb33e21..74b5a9c 100755
--- a/testdata/cluster/admin
+++ b/testdata/cluster/admin
@@ -18,7 +18,7 @@
 # specific language governing permissions and limitations
 # under the License.
 
-# This will create/control/destroy a local hdfs+yarn cluster.
+# This will create/control/destroy a local hdfs/yarn/kms/kudu cluster.
 #
 # All roles run on 127.0.0.1, just like the standard mini cluster included with hadoop.
 # The difference is with this cluster, each role runs in its own process and has its own
@@ -31,12 +31,14 @@ set -euo pipefail
 trap 'echo Error in $0 at line $LINENO: $(awk "NR == $LINENO" $0)' ERR
 
 : ${IMPALA_KERBERIZE=}
+: ${INCLUDE_YARN=}
 
-while getopts vk OPT; do
+while getopts vky OPT; do
   case $OPT in
     v) set -x;;
     k) export IMPALA_KERBERIZE=1;;
-    ?) echo "Usage: $0 [-v (verbose) -k (kerberize)] ACTION (see source...)"; exit 1;;
+    y) export INCLUDE_YARN=1;;
+    ?) echo "Usage: $0 [-v (verbose) -k (kerberize) -y (yarn)] ACTION (see source...)"; exit 1;;
   esac
 done
 shift $(($OPTIND-1))
@@ -54,7 +56,10 @@ export KILL_CLUSTER_MARKER=IBelongToTheMiniCluster
 
 if [[ "$TARGET_FILESYSTEM" == "hdfs" ]]; then
   # The check above indicates that the regular mini-cluster is in use.
-  SUPPORTED_SERVICES=(hdfs yarn kms)
+  SUPPORTED_SERVICES=(hdfs kms)
+  if [ -n "${INCLUDE_YARN}" ]; then
+    SUPPORTED_SERVICES+=(yarn)
+  fi
 else
   # Either a remote distributed file system or a local non-distributed file system is
   # in use. Currently the only service that is expected to work is Kudu, though in theory