You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by bo...@apache.org on 2020/04/20 13:34:43 UTC
[impala] 01/02: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

This is an automated email from the ASF dual-hosted git repository.

boroknagyz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit bc9d7e063dd71f9af75967f8c6af9ac1a6f49149
Author: David Knupp <dk...@cloudera.com>
AuthorDate: Thu Dec 19 12:58:58 2019 -0800

    IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
    
    This is the main patch for making the the impala-shell cross-compatible with
    python 2 and python 3. The goal is wind up with a version of the shell that will
    pass python e2e tests irrepsective of the version of python used to launch the
    shell, under the assumption that the test framework itself will continue to run
    with python 2.7.x for the time being.
    
    Notable changes for reviewers to consider:
    
    - With regard to validating the patch, my assumption is that simply passing
      the existing set of e2e shell tests is sufficient to confirm that the shell
      is functioning properly. No new tests were added.
    
    - A new pytest command line option was added in conftest.py to enable a user
      to specify a path to an alternate impala-shell executable to test. It's
      possible to use this to point to an instance of the impala-shell that was
      installed as a standalone python package in a separate virtualenv.
    
      Example usage:
      USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=/<path to virtualenv>/bin/impala-shell -sv shell/test_shell_commandline.py
    
      The target virtualenv may be based on either python3 or python2. However,
      this has no effect on the version of python used to run the test framework,
      which remains tied to python 2.7.x for the foreseeable future.
    
    - The $IMPALA_HOME/bin/impala-shell.sh now sets up the impala-shell python
      environment independenty from bin/set-pythonpath.sh. The default version
      of thrift is thrift-0.11.0 (See IMPALA-9489).
    
    - The wording of the header changed a bit to include the python version
      used to run the shell.
    
        Starting Impala Shell with no authentication using Python 3.7.5
        Opened TCP connection to localhost:21000
        ...
    
        OR
    
        Starting Impala Shell with LDAP-based authentication using Python 2.7.12
        Opened TCP connection to localhost:21000
        ...
    
    - By far, the biggest hassle has been juggling str versus unicode versus
      bytes data types. Python 2.x was fairly loose and inconsistent in
      how it dealt with strings. As a quick demo of what I mean:
    
      Python 2.7.12 (default, Nov 12 2018, 14:36:49)
      [GCC 5.4.0 20160609] on linux2
      Type "help", "copyright", "credits" or "license" for more information.
      >>> d = 'like a duck'
      >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8')
      True
    
      ...and yet there are weird unexpected gotchas.
    
      >>> d.decode('utf-8') == d.encode('utf-8')
      True
      >>> d.encode('utf-8') == bytearray(d, 'utf-8')
      True
      >>> d.decode('utf-8') == bytearray(d, 'utf-8')   # fails the eq property?
      False
    
      As a result, this was inconsistency was reflected in the way we handled
      strings in the impala-shell code, but things still just worked.
    
      In python3, there's a much clearer distinction between strings and bytes, and
      as such, much tighter type consistency is expected by standard libs like
      subprocess, re, sqlparse, prettytable, etc., which are used throughout the
      shell. Even simple calls that worked in python 2.x:
    
      >>> import re
      >>> re.findall('foo', b'foobar')
      ['foo']
    
      ...can throw exceptions in python 3.x:
    
      >>> import re
      >>> re.findall('foo', b'foobar')
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall
          return _compile(pattern, flags).findall(string)
      TypeError: cannot use a string pattern on a bytes-like object
    
      Exceptions like this resulted in a many, if not most shell tests failing
      under python 3.
    
      What ultimately seemed like a better approach was to try to weed out as many
      existing spurious str.encode() and str.decode() calls as I could, and try to
      implement what is has colloquially been called a "unicode sandwich" -- namely,
      "bytes on the outside, unicode on the inside, encode/decode at the edges."
    
      The primary spot in the shell where we call decode() now is when sanitising
      input...
    
      args = self.sanitise_input(args.decode('utf-8'))
    
      ...and also whenever a library like re required it. Similarly, str.encode()
      is primarily used where a library like readline or csv requires is.
    
    - PYTHONIOENCODING needs to be set to utf-8 to override the default setting for
      python 2. Without this, piping or redirecting stdout results in unicode errors.
    
    - from __future__ import unicode_literals was added throughout
    
    Testing:
    
      To test the changes, I ran the e2e shell tests the way we always do (against
      the normal build tarball), and then I set up a python 3 virtual env with the
      shell installed as a package, and manually ran the tests against that.
    
      No effort has been made at this point to come up with a way to integrate
      testing of the shell in a python3 environment into our automated test
      processes.
    
    Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615
    Reviewed-on: http://gerrit.cloudera.org:8080/15524
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 bin/impala-shell.sh                    |  33 +++++++-
 bin/set-pythonpath.sh                  |   3 +-
 shell/TSSLSocketWithWildcardSAN.py     |   1 +
 shell/compatibility.py                 |  11 ++-
 shell/impala-shell                     |   4 +-
 shell/impala_client.py                 |  20 ++---
 shell/impala_shell.py                  | 135 +++++++++++++++++++--------------
 shell/impala_shell_config_defaults.py  |   2 +
 shell/make_shell_tarball.sh            |  19 +++--
 shell/option_parser.py                 |   3 +-
 shell/packaging/make_python_package.sh |   7 +-
 shell/packaging/requirements.txt       |   5 +-
 shell/packaging/setup.py               |   2 +-
 shell/pkg_resources.py                 |   2 +
 shell/shell_output.py                  |  66 +++++++++-------
 tests/conftest.py                      |  10 +++
 tests/shell/test_shell_commandline.py  |  52 +++++++++----
 tests/shell/test_shell_interactive.py  |  49 ++++++++----
 tests/shell/util.py                    |  77 ++++++++++++++++---
 19 files changed, 352 insertions(+), 149 deletions(-)

diff --git a/bin/impala-shell.sh b/bin/impala-shell.sh
index 1206f27..a9000443 100755
--- a/bin/impala-shell.sh
+++ b/bin/impala-shell.sh
@@ -17,7 +17,38 @@
 # specific language governing permissions and limitations
 # under the License.
 
+set -euo pipefail
 
 # This script runs the impala shell from a dev environment.
+PYTHONPATH=${PYTHONPATH:-}
 SHELL_HOME=${IMPALA_SHELL_HOME:-${IMPALA_HOME}/shell}
-exec impala-python ${SHELL_HOME}/impala_shell.py "$@"
+
+# ${IMPALA_HOME}/bin has bootstrap_toolchain.py, required by bootstrap_virtualenv.py
+PYTHONPATH=${PYTHONPATH}:${IMPALA_HOME}/bin
+
+# Default version of thrift for the impala-shell is thrift >= 0.11.0.
+PYTHONPATH=${PYTHONPATH}:${SHELL_HOME}/build/thrift-11-gen/gen-py
+IMPALA_THRIFT_PY_VERSION=${IMPALA_THRIFT11_VERSION}
+
+THRIFT_PY_ROOT="${IMPALA_TOOLCHAIN}/thrift-${IMPALA_THRIFT_PY_VERSION}"
+
+LD_LIBRARY_PATH+=":$(PYTHONPATH=${PYTHONPATH} \
+  python "$IMPALA_HOME/infra/python/bootstrap_virtualenv.py" \
+  --print-ld-library-path)"
+
+IMPALA_PY_DIR="$(dirname "$0")/../infra/python"
+IMPALA_PYTHON_EXECUTABLE="${IMPALA_PY_DIR}/env/bin/python"
+
+for PYTHON_LIB_DIR in ${THRIFT_PY_ROOT}/python/lib{64,}; do
+  [[ -d ${PYTHON_LIB_DIR} ]] || continue
+  for PKG_DIR in ${PYTHON_LIB_DIR}/python*/site-packages; do
+    PYTHONPATH=${PYTHONPATH}:${PKG_DIR}/
+  done
+done
+
+# Note that this uses the external system python executable
+PYTHONPATH=${PYTHONPATH} python "${IMPALA_PY_DIR}/bootstrap_virtualenv.py"
+
+# This uses the python executable in the impala python env
+PYTHONIOENCODING='utf-8' PYTHONPATH=${PYTHONPATH} \
+  exec "${IMPALA_PYTHON_EXECUTABLE}" ${SHELL_HOME}/impala_shell.py "$@"
diff --git a/bin/set-pythonpath.sh b/bin/set-pythonpath.sh
index 6b19b20..9e1cf8b 100755
--- a/bin/set-pythonpath.sh
+++ b/bin/set-pythonpath.sh
@@ -27,8 +27,9 @@
 export PYTHONPATH=${IMPALA_HOME}:${IMPALA_HOME}/bin
 
 # Generated Thrift files are used by tests and other scripts.
-if [ -n "${USE_THRIFT11_GEN_PY:-}" ]; then
+if [ "${USE_THRIFT11_GEN_PY:-}" == "true" ]; then
   PYTHONPATH=${PYTHONPATH}:${IMPALA_HOME}/shell/build/thrift-11-gen/gen-py
+  THRIFT_HOME="${IMPALA_TOOLCHAIN}/thrift-${IMPALA_THRIFT11_VERSION}"
 else
   PYTHONPATH=${PYTHONPATH}:${IMPALA_HOME}/shell/gen-py
 fi
diff --git a/shell/TSSLSocketWithWildcardSAN.py b/shell/TSSLSocketWithWildcardSAN.py
index c6ff1ca..8f5b53e 100755
--- a/shell/TSSLSocketWithWildcardSAN.py
+++ b/shell/TSSLSocketWithWildcardSAN.py
@@ -16,6 +16,7 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+from __future__ import print_function, unicode_literals
 
 import re
 import ssl
diff --git a/shell/compatibility.py b/shell/compatibility.py
index 635f827..829c204 100644
--- a/shell/compatibility.py
+++ b/shell/compatibility.py
@@ -19,11 +19,20 @@
 # under the License.
 from __future__ import print_function, unicode_literals
 
-
 """
 A module where we can aggregate python2 -> 3 code contortions.
 """
 
+import os
+import sys
+
+
+if sys.version_info.major == 2:
+  # default is typically ASCII, but unicode_literals dictates UTF-8
+  # See also https://stackoverflow.com/questions/492483/setting-the-correct-encoding-when-piping-stdout-in-python  # noqa
+  os.environ['PYTHONIOENCODING'] = 'utf-8'
+
+
 try:
   _xrange = xrange
 except NameError:
diff --git a/shell/impala-shell b/shell/impala-shell
index 48dcf60..408a447 100755
--- a/shell/impala-shell
+++ b/shell/impala-shell
@@ -18,7 +18,7 @@
 # under the License.
 
 
-# This script runs the Impala shell. Python is required. 
+# This script runs the Impala shell. Python is required.
 #
 # This script assumes that the supporting library files for the Impala shell are
 # rooted in either the same directory that this script is in, or in a directory
@@ -51,4 +51,4 @@ for EGG in $(ls ${SHELL_HOME}/ext-py/*.egg); do
 done
 
 PYTHONPATH="${EGG_PATH}${SHELL_HOME}/gen-py:${SHELL_HOME}/lib:${PYTHONPATH}" \
-  exec python ${SHELL_HOME}/impala_shell.py "$@"
+  PYTHONIOENCODING='utf-8' exec python ${SHELL_HOME}/impala_shell.py "$@"
diff --git a/shell/impala_client.py b/shell/impala_client.py
index 1cf3995..9e5704b 100755
--- a/shell/impala_client.py
+++ b/shell/impala_client.py
@@ -1,4 +1,5 @@
 #!/usr/bin/env python
+# -*- coding: utf-8 -*-
 #
 # Licensed to the Apache Software Foundation (ASF) under one
 # or more contributor license agreements.  See the NOTICE file
@@ -16,7 +17,7 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-from __future__ import print_function
+from __future__ import print_function, unicode_literals
 from compatibility import _xrange as xrange
 
 from bitarray import bitarray
@@ -118,7 +119,7 @@ class ImpalaClient(object):
                ldap_password=None, use_ldap=False, client_connect_timeout_ms=60000,
                verbose=True, use_http_base_transport=False, http_path=None):
     self.connected = False
-    self.impalad_host = impalad[0].encode('ascii', 'ignore')
+    self.impalad_host = impalad[0]
     self.impalad_port = int(impalad[1])
     self.kerberos_host_fqdn = kerberos_host_fqdn
     self.imp_service = None
@@ -378,8 +379,8 @@ class ImpalaClient(object):
 
     if self.use_ldap:
       # Set the BASIC auth header
-      auth = base64.encodestring(
-          "{0}:{1}".format(self.user, self.ldap_password)).strip('\n')
+      user_passwd = "{0}:{1}".format(self.user, self.ldap_password)
+      auth = base64.encodestring(user_passwd.encode()).decode().strip('\n')
       transport.setCustomHeaders({"Authorization": "Basic {0}".format(auth)})
 
     transport.open()
@@ -1005,12 +1006,12 @@ class ImpalaBeeswaxClient(ImpalaClient):
     return ImpalaService.Client(protocol)
 
   def _options_to_string_list(self, set_query_options):
-      if sys.version_info.major < 3:
-        key_value_pairs = set_query_options.iteritems()
-      else:
-        key_value_pairs = set_query_options.items()
+    if sys.version_info.major < 3:
+      key_value_pairs = set_query_options.iteritems()
+    else:
+      key_value_pairs = set_query_options.items()
 
-      return ["%s=%s" % (k, v) for (k, v) in key_value_pairs]
+    return ["%s=%s" % (k, v) for (k, v) in key_value_pairs]
 
   def _open_session(self):
     # Beeswax doesn't have a "session" concept independent of connections, so
@@ -1241,4 +1242,3 @@ class ImpalaBeeswaxClient(ImpalaClient):
           raise RPCException("ERROR: %s" % e.message)
         if "QueryNotFoundException" in str(e):
           raise QueryStateException('Error: Stale query handle')
-
diff --git a/shell/impala_shell.py b/shell/impala_shell.py
index bab3897..dd1d461 100755
--- a/shell/impala_shell.py
+++ b/shell/impala_shell.py
@@ -1,4 +1,5 @@
 #!/usr/bin/env python
+# -*- coding: utf-8 -*-
 #
 # Licensed to the Apache Software Foundation (ASF) under one
 # or more contributor license agreements.  See the NOTICE file
@@ -18,7 +19,7 @@
 # under the License.
 #
 # Impala's shell
-from __future__ import print_function
+from __future__ import print_function, unicode_literals
 from compatibility import _xrange as xrange
 
 import cmd
@@ -41,8 +42,8 @@ import traceback
 from impala_client import ImpalaHS2Client, ImpalaBeeswaxClient, QueryOptionLevels
 from impala_shell_config_defaults import impala_shell_defaults
 from option_parser import get_option_parser, get_config_from_file
-from shell_output import DelimitedOutputFormatter, OutputStream, PrettyOutputFormatter
-from shell_output import OverwritingStdErrOutputStream
+from shell_output import (DelimitedOutputFormatter, OutputStream, PrettyOutputFormatter,
+                          OverwritingStdErrOutputStream)
 from subprocess import call
 from shell_exceptions import (RPCException, DisconnectedException, QueryStateException,
     QueryCancelledByShellException, MissingThriftMethodException)
@@ -66,6 +67,7 @@ DEFAULT_BEESWAX_PORT = 21000
 DEFAULT_HS2_PORT = 21050
 DEFAULT_HS2_HTTP_PORT = 28000
 
+
 def strip_comments(sql):
   """sqlparse default implementation of strip comments has a bad performance when parsing
   very large SQL due to the grouping. This is because the default implementation tries to
@@ -78,6 +80,7 @@ def strip_comments(sql):
   stack.postprocess.append(sqlparse.filters.SerializerUnicode())
   return ''.join(stack.run(sql, 'utf-8')).strip()
 
+
 class CmdStatus:
   """Values indicate the execution status of a command to the cmd shell driver module
   SUCCESS and ERROR continue running the shell and ABORT exits the shell
@@ -94,23 +97,14 @@ class FatalShellException(Exception):
   error should be logged to stderr before raising this exception."""
   pass
 
-class ImpalaPrettyTable(prettytable.PrettyTable):
-  """Patched version of PrettyTable with different unicode handling - instead of throwing
-  exceptions when a character can't be converted to unicode, it is replaced with a
-  placeholder character."""
-  def _unicode(self, value):
-    if not isinstance(value, basestring):
-      value = str(value)
-    if not isinstance(value, unicode):
-      # If a value cannot be encoded, replace it with a placeholder.
-      value = unicode(value, self.encoding, "replace")
-    return value
 
 class QueryOptionDisplayModes:
   REGULAR_OPTIONS_ONLY = 1
   ALL_OPTIONS = 2
 
-class ImpalaShell(object, cmd.Cmd):
+
+# Py3 method resolution order requires 'object' to be last w/ multiple inheritance
+class ImpalaShell(cmd.Cmd, object):
   """ Simple Impala Shell.
 
   Implements the context manager interface to ensure client connections and sessions
@@ -437,7 +431,7 @@ class ImpalaShell(object, cmd.Cmd):
     # Strip any comments to make a statement such as the following be considered as
     # ending with a delimiter:
     # select 1 + 1; -- this is a comment
-    line = strip_comments(line).encode('utf-8').rstrip()
+    line = strip_comments(line).rstrip()
     if line.endswith(ImpalaShell.CMD_DELIM):
       try:
         # Look for an open quotation in the entire command, and not just the
@@ -502,10 +496,10 @@ class ImpalaShell(object, cmd.Cmd):
 
       # partial_cmd is already populated, add the current input after a newline.
       if self.partial_cmd and cmd:
-        self.partial_cmd = "%s\n%s" % (self.partial_cmd, cmd)
+        self.partial_cmd = "{0}\n{1}".format(self.partial_cmd, cmd)
       else:
         # If the input string is empty or partial_cmd is empty.
-        self.partial_cmd = "%s%s" % (self.partial_cmd, cmd)
+        self.partial_cmd = "{0}{1}".format(self.partial_cmd, cmd)
       # Remove the most recent item from history if:
       #   -- The current state of user input in incomplete.
       #   -- The most recent user input is not an empty string
@@ -515,16 +509,17 @@ class ImpalaShell(object, cmd.Cmd):
       return str()
     elif self.partial_cmd:  # input ends with a delimiter and partial_cmd is not empty
       if cmd != ImpalaShell.CMD_DELIM:
-        completed_cmd = "%s\n%s" % (self.partial_cmd, cmd)
+        completed_cmd = "{0}\n{1}".format(self.partial_cmd, cmd)
       else:
-        completed_cmd = "%s%s" % (self.partial_cmd, cmd)
+        completed_cmd = "{0}{1}".format(self.partial_cmd, cmd)
       # Reset partial_cmd to an empty string
       self.partial_cmd = str()
       # Replace the most recent history item with the completed command.
       completed_cmd = sqlparse.format(completed_cmd)
       if self.readline and current_history_len > 0:
-        self.readline.replace_history_item(current_history_len - 1,
-            completed_cmd.encode('utf-8'))
+        if sys.version_info.major == 2:
+          completed_cmd = completed_cmd.encode('utf-8')
+        self.readline.replace_history_item(current_history_len - 1, completed_cmd)
       # Revert the prompt to its earlier state
       self.prompt = self.cached_prompt
     else:  # Input has a delimiter and partial_cmd is empty
@@ -603,7 +598,10 @@ class ImpalaShell(object, cmd.Cmd):
         host=self.impalad[0], port=self.impalad[1], db=db)
 
   def precmd(self, args):
-    args = self.sanitise_input(args)
+    if sys.version_info.major == 2:
+      args = self.sanitise_input(args.decode('utf-8'))  # python2
+    else:
+      args = self.sanitise_input(args)  # python3
     if not args: return args
     # Split args using sqlparse. If there are multiple queries present in user input,
     # the length of the returned query list will be greater than one.
@@ -620,7 +618,7 @@ class ImpalaShell(object, cmd.Cmd):
       print("Connection lost, reconnecting...", file=sys.stderr)
       self._connect()
       self._validate_database(immediately=True)
-    return args.encode('utf-8')
+    return args
 
   def onecmd(self, line):
     """Overridden to ensure the variable replacement is processed in interactive
@@ -1033,6 +1031,7 @@ class ImpalaShell(object, cmd.Cmd):
     Impala shell cannot get child query handle so it cannot
     query live progress for COMPUTE STATS query. Disable live
     progress/summary callback for COMPUTE STATS query."""
+
     query = self._build_query_string(self.last_leading_comment, self.orig_cmd, args)
     (prev_live_progress, prev_live_summary) = self.live_progress, self.live_summary
     (self.live_progress, self.live_summary) = False, False
@@ -1234,10 +1233,9 @@ class ImpalaShell(object, cmd.Cmd):
     Should be called after the query has finished and before data is fetched.
     All data is left aligned.
     """
-    table = ImpalaPrettyTable()
+    table = prettytable.PrettyTable()
     for column in column_names:
-      # Column names may be encoded as utf-8
-      table.add_column(column.decode('utf-8', 'ignore'), [])
+      table.add_column(column, [])
     table.align = "l"
     return table
 
@@ -1251,15 +1249,15 @@ class ImpalaShell(object, cmd.Cmd):
     query = self._build_query_string(self.last_leading_comment, self.orig_cmd, args)
     # Use shlex to deal with escape quotes in string literals.
     # Set posix=False to preserve the quotes.
-    tokens = shlex.split(strip_comments(query.lstrip()).encode('utf-8'),
-                         posix=False)
+    tokens = shlex.split(strip_comments(query.lstrip()), posix=False)
     try:
       # Because the WITH clause may precede DML or SELECT queries,
       # just checking the first token is insufficient.
       is_dml = False
-      if filter(self.DML_REGEX.match, tokens): is_dml = True
+      if any(self.DML_REGEX.match(t) for t in tokens):
+        is_dml = True
       return self._execute_stmt(query, is_dml=is_dml, print_web_link=True)
-    except ValueError as e:
+    except ValueError:
       return self._execute_stmt(query, print_web_link=True)
 
   def do_use(self, args):
@@ -1436,11 +1434,9 @@ class ImpalaShell(object, cmd.Cmd):
     """
     if ImpalaShell._has_leading_comment(line):
       leading_comment, line = ImpalaShell.strip_leading_comment(line.strip())
-      line = line.encode('utf-8')
-      if leading_comment:
-        leading_comment = leading_comment.encode('utf-8')
     else:
       leading_comment, line = None, line.strip()
+
     if line and line[0] == '@':
       line = 'rerun ' + line[1:]
     return super(ImpalaShell, self).parseline(line) + (leading_comment,)
@@ -1517,6 +1513,9 @@ class ImpalaShell(object, cmd.Cmd):
     history_len = self.readline.get_current_history_length()
     # load the history and replace the shell's delimiter with EOL
     history_items = map(self.readline.get_history_item, xrange(1, history_len + 1))
+    if sys.version_info.major == 2:
+      src_delim = src_delim.encode('utf-8')
+      tgt_delim = tgt_delim.encode('utf-8')
     history_items = [item.replace(src_delim, tgt_delim) for item in history_items]
     # Clear the original history and replace it with the mutated history.
     self.readline.clear_history()
@@ -1560,7 +1559,7 @@ class ImpalaShell(object, cmd.Cmd):
     return True
 
 
-TIPS=[
+TIPS = [
   "Press TAB twice to see a list of available commands.",
   "After running a query, type SUMMARY to see a summary of where time was spent.",
   "The SET command shows the current value of all shell and query options.",
@@ -1586,16 +1585,18 @@ command."
 EXPLAIN command. You can change the level of detail in the EXPLAIN output by setting the \
 EXPLAIN_LEVEL query option.",
   "When you set a query option it lasts for the duration of the Impala shell session."
-  ]
+]
 
 HEADER_DIVIDER =\
   "***********************************************************************************"
 
+
 def _format_tip(tip):
   """Takes a tip string and splits it on word boundaries so that it fits neatly inside the
   shell header."""
   return '\n'.join([l for l in textwrap.wrap(tip, len(HEADER_DIVIDER))])
 
+
 WELCOME_STRING = """\
 ***********************************************************************************
 Welcome to the Impala shell.
@@ -1607,9 +1608,9 @@ Welcome to the Impala shell.
   % (VERSION_STRING, _format_tip(random.choice(TIPS)))
 
 
-def parse_query_text(query_text, utf8_encode_policy='strict'):
-  """Parse query file text to extract queries and encode into utf-8"""
-  query_list = [q.encode('utf-8', utf8_encode_policy) for q in sqlparse.split(query_text)]
+def parse_query_text(query_text):
+  """Parse query file text to extract queries"""
+  query_list = sqlparse.split(query_text)
   # Remove trailing comments in the input, if any. We do this because sqlparse splits the
   # input at query boundaries and associates the query only with preceding comments
   # (following comments are associated with the next query). This is a problem with
@@ -1627,6 +1628,7 @@ def parse_query_text(query_text, utf8_encode_policy='strict'):
     query_list.pop()
   return query_list
 
+
 def parse_variables(keyvals):
   """Parse variable assignments passed as arguments in the command line"""
   kv_pattern = r'(%s)=(.*)$' % (ImpalaShell.VALID_VAR_NAME_PATTERN,)
@@ -1645,9 +1647,17 @@ def parse_variables(keyvals):
 
 
 def replace_variables(set_variables, input_string):
-  """Replaces variable within the string with their corresponding values using the
-     given set_variables."""
+  """
+  Replaces variable within the input_string with their corresponding values
+  from the given set_variables.
+  """
   errors = False
+
+  # In the case of a byte sequence, re.findall() will choke under python 3
+  if sys.version_info.major > 2:
+    if not isinstance(input_string, str):
+      input_string = input_string.decode('utf-8')
+
   matches = set([v.upper() for v in re.findall(r'(?<!\\)\${([^}]+)}', input_string)])
   for name in matches:
     value = None
@@ -1687,21 +1697,21 @@ def get_var_name(name):
       return var_name
   return None
 
+
 def execute_queries_non_interactive_mode(options, query_options):
   """Run queries in non-interactive mode. Return True on success. Logs the
   error and returns False otherwise."""
   if options.query_file:
-    try:
-      # "-" here signifies input from STDIN
-      if options.query_file == "-":
-        query_file_handle = sys.stdin
-      else:
-        query_file_handle = open(options.query_file, 'r')
-    except Exception, e:
-      print("Could not open file '%s': %s" % (options.query_file, e), file=sys.stderr)
-      return False
-
-    query_text = query_file_handle.read()
+    # "-" here signifies input from STDIN
+    if options.query_file == "-":
+      query_text = sys.stdin.read()
+    else:
+      try:
+        with open(options.query_file, 'r') as query_file_handle:
+          query_text = query_file_handle.read()
+      except Exception as e:
+        print("Could not open file '%s': %s" % (options.query_file, e), file=sys.stderr)
+        return False
   elif options.query:
     query_text = options.query
   else:
@@ -1712,6 +1722,7 @@ def execute_queries_non_interactive_mode(options, query_options):
     return (shell.execute_query_list(shell.cmdqueue) and
             shell.execute_query_list(queries))
 
+
 def get_intro(options):
   """Get introduction message for start-up. The last character should not be a return."""
   if not options.verbose:
@@ -1801,7 +1812,11 @@ def impala_shell_main():
     return
 
   if options.write_delimited:
-    delim = options.output_delimiter.decode('string-escape')
+    if isinstance(options.output_delimiter, str):
+      delim_sequence = bytearray(options.output_delimiter, 'utf-8')
+    else:
+      delim_sequence = options.output_delimiter
+    delim = delim_sequence.decode('unicode_escape')
     if len(delim) != 1:
       print("Illegal delimiter %s, the delimiter "
             "must be a 1-character string." % delim, file=sys.stderr)
@@ -1823,9 +1838,15 @@ def impala_shell_main():
           "mechanism (-l)", file=sys.stderr)
     raise FatalShellException()
 
+  start_msg = "Starting Impala"
+
+  py_version_msg = "using Python {0}.{1}.{2}".format(
+    sys.version_info.major, sys.version_info.minor, sys.version_info.micro)
+
   if options.use_kerberos:
     if options.verbose:
-      print("Starting Impala Shell using Kerberos authentication", file=sys.stderr)
+      kerb_msg = "with Kerberos authentication"
+      print("{0} {1} {2}".format(start_msg, kerb_msg, py_version_msg), file=sys.stderr)
       print("Using service name '%s'" % options.kerberos_service_name, file=sys.stderr)
     # Check if the user has a ticket in the credentials cache
     try:
@@ -1838,10 +1859,12 @@ def impala_shell_main():
       raise FatalShellException()
   elif options.use_ldap:
     if options.verbose:
-      print("Starting Impala Shell using LDAP-based authentication", file=sys.stderr)
+      ldap_msg = "with LDAP-based authentication"
+      print("{0} {1} {2}".format(start_msg, ldap_msg, py_version_msg), file=sys.stderr)
   else:
     if options.verbose:
-      print("Starting Impala Shell without Kerberos authentication", file=sys.stderr)
+      no_auth_msg = "with no authentication"
+      print("{0} {1} {2}".format(start_msg, no_auth_msg, py_version_msg), file=sys.stderr)
 
   options.ldap_password = None
   if options.use_ldap and options.ldap_password_cmd:
diff --git a/shell/impala_shell_config_defaults.py b/shell/impala_shell_config_defaults.py
index 73bfbcb..113e20c 100644
--- a/shell/impala_shell_config_defaults.py
+++ b/shell/impala_shell_config_defaults.py
@@ -1,4 +1,5 @@
 #!/usr/bin/env python
+# -*- coding: utf-8 -*-
 #
 # Licensed to the Apache Software Foundation (ASF) under one
 # or more contributor license agreements.  See the NOTICE file
@@ -18,6 +19,7 @@
 # under the License.
 
 # default options used by the Impala shell stored in a dict
+from __future__ import print_function, unicode_literals
 
 import getpass
 import os
diff --git a/shell/make_shell_tarball.sh b/shell/make_shell_tarball.sh
index 1f027f0..d5c0c2c 100755
--- a/shell/make_shell_tarball.sh
+++ b/shell/make_shell_tarball.sh
@@ -49,6 +49,9 @@ SHELL_HOME=${IMPALA_HOME}/shell
 BUILD_DIR=${SHELL_HOME}/build
 TARBALL_ROOT=${BUILD_DIR}/impala-shell-${VERSION}
 
+IMPALA_THRIFT_PY_VERSION="${IMPALA_THRIFT11_VERSION}"
+THRIFT_GEN_PY_DIR="${SHELL_HOME}/build/thrift-11-gen/gen-py"
+
 echo "Deleting all files in ${TARBALL_ROOT}/{gen-py,lib,ext-py}"
 rm -rf ${TARBALL_ROOT}/lib/* 2>&1 > /dev/null
 rm -rf ${TARBALL_ROOT}/gen-py/* 2>&1 > /dev/null
@@ -56,8 +59,8 @@ rm -rf ${TARBALL_ROOT}/ext-py/* 2>&1 > /dev/null
 mkdir -p ${TARBALL_ROOT}/lib
 mkdir -p ${TARBALL_ROOT}/ext-py
 
-rm -f ${SHELL_HOME}/gen-py/impala_build_version.py
-cat > ${SHELL_HOME}/gen-py/impala_build_version.py <<EOF
+rm -f ${THRIFT_GEN_PY_DIR}/impala_build_version.py
+cat > ${THRIFT_GEN_PY_DIR}/impala_build_version.py <<EOF
 #
 # Licensed to the Apache Software Foundation (ASF) under one
 # or more contributor license agreements.  See the NOTICE file
@@ -106,14 +109,18 @@ done
 
 # Copy all the shell files into the build dir
 # The location of python libs for thrift is different in rhel/centos/sles
-if [ -d ${THRIFT_HOME}/python/lib/python*/site-packages/thrift ]; then
-  cp -r ${THRIFT_HOME}/python/lib/python*/site-packages/thrift\
+
+THRIFT_PY_ROOT="${IMPALA_TOOLCHAIN}/thrift-${IMPALA_THRIFT_PY_VERSION}"
+
+if [ -d ${THRIFT_PY_ROOT}/python/lib/python*/site-packages/thrift ]; then
+  cp -r ${THRIFT_PY_ROOT}/python/lib/python*/site-packages/thrift\
         ${TARBALL_ROOT}/lib
 else
-  cp -r ${THRIFT_HOME}/python/lib64/python*/site-packages/thrift\
+  cp -r ${THRIFT_PY_ROOT}/python/lib64/python*/site-packages/thrift\
         ${TARBALL_ROOT}/lib
 fi
-cp -r ${SHELL_HOME}/gen-py ${TARBALL_ROOT}
+
+cp -r ${THRIFT_GEN_PY_DIR} ${TARBALL_ROOT}
 cp ${SHELL_HOME}/option_parser.py ${TARBALL_ROOT}/lib
 cp ${SHELL_HOME}/impala_shell_config_defaults.py ${TARBALL_ROOT}/lib
 cp ${SHELL_HOME}/impala_client.py ${TARBALL_ROOT}/lib
diff --git a/shell/option_parser.py b/shell/option_parser.py
index 4f4d79b..3a4f090 100755
--- a/shell/option_parser.py
+++ b/shell/option_parser.py
@@ -1,4 +1,5 @@
 #!/usr/bin/env python
+# -*- coding: utf-8 -*-
 #
 # Licensed to the Apache Software Foundation (ASF) under one
 # or more contributor license agreements.  See the NOTICE file
@@ -26,7 +27,7 @@
 # [impala.query_options]
 # EXPLAIN_LEVEL=2
 # MT_DOP=2
-from __future__ import print_function
+from __future__ import print_function, unicode_literals
 
 import sys
 
diff --git a/shell/packaging/make_python_package.sh b/shell/packaging/make_python_package.sh
index ee62572..4e86b6f 100755
--- a/shell/packaging/make_python_package.sh
+++ b/shell/packaging/make_python_package.sh
@@ -42,11 +42,14 @@ PACKAGE_DIR="${STAGING_DIR}"/impala_shell_package
 MODULE_LIB_DIR="${PACKAGE_DIR}"/impala_shell
 NO_CLEAN_DIST="${NO_CLEAN_DIST:-}"
 
+THRIFT_GEN_PY_DIR=${SHELL_HOME}/build/thrift-11-gen/gen-py
+THRIFT_PY_ROOT="${IMPALA_TOOLCHAIN}/thrift-${IMPALA_THRIFT11_VERSION}"
+
 assemble_package_files() {
   mkdir -p "${MODULE_LIB_DIR}"
 
-  cp -r "${SHELL_HOME}/gen-py"/* "${MODULE_LIB_DIR}"
-  cp -r "${THRIFT_HOME}/python/lib/python2.7/site-packages/thrift" "${MODULE_LIB_DIR}"
+  cp -r "${THRIFT_GEN_PY_DIR}"/* "${MODULE_LIB_DIR}"
+  cp -r "${THRIFT_PY_ROOT}/python/lib/python2.7/site-packages/thrift" "${MODULE_LIB_DIR}"
 
   cp "${WORKING_DIR}/__init__.py" "${MODULE_LIB_DIR}"
   cp "${SHELL_HOME}/compatibility.py" "${MODULE_LIB_DIR}"
diff --git a/shell/packaging/requirements.txt b/shell/packaging/requirements.txt
index 30af6b1..84bfe80 100644
--- a/shell/packaging/requirements.txt
+++ b/shell/packaging/requirements.txt
@@ -1,8 +1,9 @@
 bitarray==1.0.1
-prettytable==0.7.1
+configparser==4.0.2
+prettytable==0.7.2
 sasl==0.2.1
 setuptools>=36.8.0
 six==1.14.0
 sqlparse==0.3.1
-thrift==0.9.3
+thrift==0.11.0
 thrift_sasl==0.4.2
diff --git a/shell/packaging/setup.py b/shell/packaging/setup.py
index 173e0d8..410c727 100644
--- a/shell/packaging/setup.py
+++ b/shell/packaging/setup.py
@@ -135,7 +135,7 @@ def get_version():
 
 setup(
   name='impala_shell',
-  python_requires='>2.6, <3.0.0',
+  python_requires='>2.6',
   version=get_version(),
   description='Impala Shell',
   long_description_content_type='text/markdown',
diff --git a/shell/pkg_resources.py b/shell/pkg_resources.py
index 977de19..70ecc44 100644
--- a/shell/pkg_resources.py
+++ b/shell/pkg_resources.py
@@ -1,3 +1,5 @@
+from __future__ import print_function, unicode_literals
+
 """
   This file is redistributed under the Python Software Foundation License:
   http://docs.python.org/2/license.html
diff --git a/shell/shell_output.py b/shell/shell_output.py
index d9d3625..c6c2e99 100644
--- a/shell/shell_output.py
+++ b/shell/shell_output.py
@@ -1,4 +1,5 @@
 #!/usr/bin/env python
+# -*- coding: utf-8 -*-
 #
 # Licensed to the Apache Software Foundation (ASF) under one
 # or more contributor license agreements.  See the NOTICE file
@@ -16,7 +17,7 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-from __future__ import print_function
+from __future__ import print_function, unicode_literals
 
 import csv
 import re
@@ -33,16 +34,13 @@ class PrettyOutputFormatter(object):
     self.prettytable = prettytable
 
   def format(self, rows):
-    """Returns string containing UTF-8-encoded representation of the table data."""
+    """Returns string containing representation of the table data."""
     # Clear rows that already exist in the table.
     self.prettytable.clear_rows()
     try:
-      map(self.prettytable.add_row, rows)
-      # PrettyTable.get_string() converts UTF-8-encoded strs added via add_row() into
-      # Python unicode strings. We need to convert it back to a UTF-8-encoded str for
-      # output, since Python won't do the encoding automatically when outputting to a
-      # non-terminal (see IMPALA-2717).
-      return self.prettytable.get_string().encode('utf-8')
+      for row in rows:
+        self.prettytable.add_row(row)
+      return self.prettytable.get_string()
     except Exception as e:
       # beeswax returns each row as a tab separated string. If a string column
       # value in a row has tabs, it will break the row split. Default to displaying
@@ -57,19 +55,31 @@ class PrettyOutputFormatter(object):
 class DelimitedOutputFormatter(object):
   def __init__(self, field_delim="\t"):
     if field_delim:
-      self.field_delim = field_delim.decode('string-escape')
+      if sys.version_info.major > 2:
+        # strings do not have a 'decode' method in python 3
+        field_delim_bytes = bytearray(field_delim, 'utf-8')
+        self.field_delim = field_delim_bytes.decode('unicode_escape')
+      else:
+        self.field_delim = field_delim.decode('unicode_escape')
       # IMPALA-8652, the delimiter should be a 1-character string and verified already
       assert len(self.field_delim) == 1
 
   def format(self, rows):
     """Returns string containing UTF-8-encoded representation of the table data."""
     # csv.writer expects a file handle to the input.
-    # cStringIO is used as the temporary buffer.
     temp_buffer = StringIO()
-    writer = csv.writer(temp_buffer, delimiter=self.field_delim,
+    if sys.version_info.major == 2:
+      # csv.writer in python2 requires an ascii string delimiter
+      delim = self.field_delim.encode('ascii', 'ignore')
+    else:
+      delim = self.field_delim
+    writer = csv.writer(temp_buffer, delimiter=delim,
                         lineterminator='\n', quoting=csv.QUOTE_MINIMAL)
-    writer.writerows(rows)
-    rows = temp_buffer.getvalue().rstrip('\n')
+    for row in rows:
+      if sys.version_info.major == 2:
+        row = [val.encode('utf-8') for val in row]
+      writer.writerow(row)
+    rows = temp_buffer.getvalue().rstrip()
     temp_buffer.close()
     return rows
 
@@ -82,25 +92,23 @@ class OutputStream(object):
     `data` is a list of lists.
     """
     self.formatter = formatter
-    self.handle = sys.stdout
     self.filename = filename
-    if self.filename:
-      try:
-        self.handle = open(self.filename, 'ab')
-      except IOError as err:
-        print("Error opening file %s: %s" % (self.filename, str(err)),
-              file=self.handle)
-        print(sys.stderr, "Writing to stdout", file=self.handle)
 
   def write(self, data):
-    print(self.formatter.format(data), file=self.handle)
-    self.handle.flush()
-
-  def __del__(self):
-    # If the output file cannot be opened, OutputStream defaults to sys.stdout.
-    # Don't close the file handle if it points to sys.stdout.
-    if self.filename and self.handle != sys.stdout:
-      self.handle.close()
+    formatted_data = self.formatter.format(data)
+    if self.filename is not None:
+      try:
+        with open(self.filename, 'ab') as out_file:
+          # Note that instances of this class do not persist, so it's fine to
+          # close the we close the file handle after each write.
+          out_file.write(formatted_data.encode('utf-8'))  # file opened in binary mode
+      except IOError as err:
+        file_err_msg = "Error opening file %s: %s" % (self.filename, str(err))
+        print('{0} (falling back to stderr)'.format(file_err_msg), file=sys.stderr)
+        print(formatted_data, file=sys.stderr)
+    else:
+      # If filename is None, then just print to stdout
+      print(formatted_data)
 
 
 class OverwritingStdErrOutputStream(object):
diff --git a/tests/conftest.py b/tests/conftest.py
index d84f8e3..b544c5e 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -148,6 +148,16 @@ def pytest_addoption(parser):
                    help="If set to N/M (e.g., 3/5), will split the tests into "
                    "M partitions and run the Nth partition. 1-indexed.")
 
+  parser.addoption("--shell_executable", default=None,
+                   help="Full path to the impala-shell executable. Useful for running"
+                   "shell/ e2e tests against a version of the impala-shell that has been "
+                   "installed as a python package in a separate virtualenv. The python "
+                   "version in the target virtualenv does not need to the match the "
+                   "version used to execute the tests, but if the tested environment is "
+                   "using python 3, also be sure to also set the environment variable "
+                   "USE_THRIFT11_GEN_PY=true in the test env. "
+                   "(See $IMPALA_HOME/bin/set-pythonpath.sh.)")
+
 
 def pytest_assertrepr_compare(op, left, right):
   """
diff --git a/tests/shell/test_shell_commandline.py b/tests/shell/test_shell_commandline.py
index 014094e..42d8560 100644
--- a/tests/shell/test_shell_commandline.py
+++ b/tests/shell/test_shell_commandline.py
@@ -1,4 +1,5 @@
-# encoding=utf-8
+#!/usr/bin/env impala-python
+# -*- coding: utf-8 -*-
 #
 # Licensed to the Apache Software Foundation (ASF) under one
 # or more contributor license agreements.  See the NOTICE file
@@ -35,7 +36,7 @@ from tests.common.skip import SkipIf
 from tests.common.test_dimensions import create_client_protocol_dimension
 from time import sleep, time
 from util import (get_impalad_host_port, assert_var_substitution, run_impala_shell_cmd,
-                  ImpalaShell, IMPALA_SHELL_EXECUTABLE)
+                  ImpalaShell, IMPALA_SHELL_EXECUTABLE, SHELL_IS_PYTHON_2)
 from contextlib import closing
 
 
@@ -155,7 +156,7 @@ class TestImpalaShell(ImpalaTestSuite):
 
   def test_unsecure_message(self, vector):
     results = run_impala_shell_cmd(vector, [], wait_until_connected=False)
-    assert "Starting Impala Shell without Kerberos authentication" in results.stderr
+    assert "with no authentication" in results.stderr
 
   def test_print_header(self, vector, populated_table):
     args = ['--print_header', '-B', '--output_delim=,', '-q',
@@ -184,7 +185,7 @@ class TestImpalaShell(ImpalaTestSuite):
     results = run_impala_shell_cmd(vector, args, expect_success=False)
     # Check that impala is using the right service name.
     assert "Using service name 'impala'" in results.stderr
-    assert "Starting Impala Shell using Kerberos authentication" in results.stderr
+    assert "with Kerberos authentication" in results.stderr
     # Check that Impala warns the user if klist does not exist on the system, or if
     # no kerberos tickets are initialized.
     try:
@@ -253,7 +254,7 @@ class TestImpalaShell(ImpalaTestSuite):
   def test_removed_query_option(self, vector):
     """Test that removed query options produce warning."""
     result = run_impala_shell_cmd(vector, ['-q', 'set disable_cached_reads=true'],
-        expect_success=True)
+                                  expect_success=True)
     assert "Ignoring removed query option: 'disable_cached_reads'" in result.stderr,\
         result.stderr
 
@@ -393,6 +394,7 @@ class TestImpalaShell(ImpalaTestSuite):
   def test_query_cancellation_during_wait_to_finish(self, vector):
     """IMPALA-1144: Test cancellation (CTRL+C) while the query is in the
     wait_to_finish state"""
+
     # A select where wait_to_finish takes several seconds
     stmt = "select * from tpch.customer c1, tpch.customer c2, " + \
            "tpch.customer c3 order by c1.c_name"
@@ -472,7 +474,9 @@ class TestImpalaShell(ImpalaTestSuite):
   def test_international_characters_prettyprint_tabs(self, vector):
     """IMPALA-2717: ensure we can handle international characters in pretty-printed
     output when pretty-printing falls back to delimited output."""
+
     args = ['-q', "select '{0}\\t'".format(RUSSIAN_CHARS.encode('utf-8'))]
+
     result = run_impala_shell_cmd(vector, args)
     protocol = vector.get_value('protocol')
     if protocol == 'beeswax':
@@ -528,6 +532,7 @@ class TestImpalaShell(ImpalaTestSuite):
     assert 'WARNING:' not in result.stderr, \
       "A valid config file should not trigger any warning: {0}".format(result.stderr)
     assert 'Query: select 1' in result.stderr
+
     # override option in config file through command line
     args = ['--config_file={0}/good_impalarc'.format(QUERY_FILE_PATH), '--query=select 2']
     result = run_impala_shell_cmd(vector, args)
@@ -555,6 +560,7 @@ class TestImpalaShell(ImpalaTestSuite):
     # specified config file does not exist
     args = ['--config_file=%s/does_not_exist' % QUERY_FILE_PATH]
     run_impala_shell_cmd(vector, args, expect_success=False)
+
     # bad formatting of config file
     args = ['--config_file=%s/bad_impalarc' % QUERY_FILE_PATH]
     run_impala_shell_cmd(vector, args, expect_success=False)
@@ -631,14 +637,21 @@ class TestImpalaShell(ImpalaTestSuite):
 
   def test_ldap_password_from_shell(self, vector):
     args = ['-l', '--auth_creds_ok_in_clear']
-    result = run_impala_shell_cmd(vector, args + ['--ldap_password_cmd=cmddoesntexist'],
-                                  expect_success=False)
-    assert ("Error retrieving LDAP password (command was: 'cmddoesntexist', exception "
-            "was: '[Errno 2] No such file or directory')") in result.stderr
-    result = run_impala_shell_cmd(
-        vector, args + ['--ldap_password_cmd=cat filedoesntexist'], expect_success=False)
-    assert ("Error retrieving LDAP password (command was 'cat filedoesntexist', error "
-            "was: 'cat: filedoesntexist: No such file or directory')") in result.stderr
+
+    result_1 = run_impala_shell_cmd(vector, args + ['--ldap_password_cmd=cmddoesntexist'],
+                                    expect_success=False)
+
+    assert "Error retrieving LDAP password" in result_1.stderr
+    assert "command was: 'cmddoesntexist'" in result_1.stderr
+    assert "No such file or directory" in result_1.stderr
+
+    result_2 = run_impala_shell_cmd(vector,
+                                    args + ['--ldap_password_cmd=cat filedoesntexist'],
+                                    expect_success=False)
+
+    assert "Error retrieving LDAP password" in result_2.stderr
+    assert "command was 'cat filedoesntexist'" in result_2.stderr
+    assert "No such file or directory" in result_2.stderr
 
     # TODO: Without an Impala daemon with LDAP authentication enabled, we can't test the
     # positive case where the password is correct.
@@ -989,5 +1002,16 @@ class TestImpalaShell(ImpalaTestSuite):
     assert "0\tNULL\tNULL" in result.stdout, result.stdout
     assert "1\t1\t10.1" in result.stdout, result.stdout
     assert "2\t2\t20.2" in result.stdout, result.stdout
-    assert "3\t3\t30.3" in result.stdout, result.stdout
+
+    if (vector.get_value("protocol") in ('hs2', 'hs2-http')) and not SHELL_IS_PYTHON_2:
+      # The HS2 client returns binary values for float/double types, and these must
+      # be converted to strings for display. However, due to differences between the
+      # way that python2 and python3 represent floating point values, the output
+      # from the shell will differ with regard to which version of python the
+      # shell is running under.
+      assert "3\t3\t30.299999999999997" in result.stdout, result.stdout
+    else:
+      # python 2, or python 3 with beeswax protocol
+      assert "3\t3\t30.3" in result.stdout, result.stdout
+
     assert "4\t4\t40.4" in result.stdout, result.stdout
diff --git a/tests/shell/test_shell_interactive.py b/tests/shell/test_shell_interactive.py
index cac263b..ceee1fd 100755
--- a/tests/shell/test_shell_interactive.py
+++ b/tests/shell/test_shell_interactive.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env impala-python
-# encoding=utf-8
+# -*- coding: utf-8 -*-
 #
 # Licensed to the Apache Software Foundation (ASF) under one
 # or more contributor license agreements.  See the NOTICE file
@@ -55,6 +55,7 @@ PROMPT_REGEX = r'\[[^:]+:2(1|8)0[0-9][0-9]\]'
 
 LOG = logging.getLogger('test_shell_interactive')
 
+
 @pytest.fixture
 def tmp_history_file(request):
   """
@@ -116,7 +117,7 @@ class TestImpalaShellInteractive(ImpalaTestSuite):
     try:
       actual = impala_service.wait_for_metric_value(metric_name, expected)
     except AssertionError:
-      LOG.exception("Error: " % err)
+      LOG.exception("Error: %s" % err)
       raise
     assert actual == expected, err
 
@@ -553,9 +554,9 @@ class TestImpalaShellInteractive(ImpalaTestSuite):
     # Positive tests
     # set live_summary and live_progress as True with config file
     rcfile_path = os.path.join(QUERY_FILE_PATH, 'good_impalarc3')
-    args = ['--config_file=%s' % rcfile_path]
+    cmd_line_args = ['--config_file=%s' % rcfile_path]
     cmds = "set all;"
-    result = run_impala_shell_interactive(vector, cmds, shell_args=args)
+    result = run_impala_shell_interactive(vector, cmds, shell_args=cmd_line_args)
     assert 'WARNING:' not in result.stderr, \
       "A valid config file should not trigger any warning: {0}".format(result.stderr)
     assert "\tLIVE_SUMMARY: True" in result.stdout
@@ -697,6 +698,7 @@ class TestImpalaShellInteractive(ImpalaTestSuite):
       result = run_impala_shell_interactive(vector,
         "sOuRcE shell_case_sensitive.cmds; SeLeCt 'second command'")
       print result.stderr
+
       assert "Query: uSe FUNCTIONAL" in result.stderr
       assert "Query: ShOw TABLES" in result.stderr
       assert "alltypes" in result.stdout
@@ -726,32 +728,25 @@ class TestImpalaShellInteractive(ImpalaTestSuite):
     result = run_impala_shell_interactive(vector, '/* comment */\n'
                                           'select * from {0};'.format(table))
     assert 'Fetched 1 row(s)' in result.stderr
-
     result = run_impala_shell_interactive(vector, '/* comment1 */\n'
                                           '-- comment2\n'
                                           'select * from {0};'.format(table))
     assert 'Fetched 1 row(s)' in result.stderr
-
     result = run_impala_shell_interactive(vector, '/* comment1\n'
                                           'comment2 */ select * from {0};'.format(table))
     assert 'Fetched 1 row(s)' in result.stderr
-
     result = run_impala_shell_interactive(vector, '/* select * from {0} */ '
                                           'select * from {0};'.format(table))
     assert 'Fetched 1 row(s)' in result.stderr
-
     result = run_impala_shell_interactive(vector, '/* comment */ help use')
     assert 'Executes a USE... query' in result.stdout
-
     result = run_impala_shell_interactive(vector, '-- comment\n'
                                           ' help use;')
     assert 'Executes a USE... query' in result.stdout
-
     result = run_impala_shell_interactive(vector, '/* comment1 */\n'
                                           '-- comment2\n'
                                           'desc {0};'.format(table))
     assert 'Fetched 1 row(s)' in result.stderr
-
     result = run_impala_shell_interactive(vector, '/* comment1 */\n'
                                           '-- comment2\n'
                                           'help use;')
@@ -797,6 +792,31 @@ class TestImpalaShellInteractive(ImpalaTestSuite):
 
   def test_fix_infinite_loop(self, vector):
     # IMPALA-6337: Fix infinite loop.
+
+    # In case of TL;DR:
+    # - see IMPALA-9362 for details
+    # - see tests/shell/util.py for explanation of IMPALA_SHELL_EXECUTABLE
+    if os.getenv("IMPALA_HOME") not in IMPALA_SHELL_EXECUTABLE:
+      # The fix for IMPALA-6337 involved patching our internal verison of
+      # sqlparse 0.1.19 in ${IMPALA_HOME}/shell/ext-py. However, when we
+      # create the the stand-alone python package of the impala-shell for PyPI,
+      # we don't include the bundled 3rd party libs -- we expect users to
+      # install 3rd upstream libraries from PyPI.
+      #
+      # We could try to bundle sqlparse with the PyPI package, but there we
+      # run into the issue that the our bundled version is not python 3
+      # compatible. The real fix for this would be to upgrade to sqlparse 0.3.0,
+      # but that's not without complications. See IMPALA-9362 for details.
+      #
+      # For the time being, what this means is that IMPALA-6337 is fixed for
+      # people who are running the shell locally from any host/node that's part
+      # of a cluster where Impala is installed, but if they are running a
+      # standalone version of the shell on a client outside of a cluster, then
+      # they will still be relying on the upstream version of sqlparse 0.1.19,
+      # and so they may still be affected by the IMPALA-6337.
+      #
+      pytest.skip("Test will fail if shell is not part of dev environment.")
+
     result = run_impala_shell_interactive(vector, "select 1 + 1; \"\n;\";")
     assert '| 2     |' in result.stdout
     result = run_impala_shell_interactive(vector, "select '1234'\";\n;\n\";")
@@ -933,9 +953,10 @@ class TestImpalaShellInteractive(ImpalaTestSuite):
     for keyword in ["insert", "upsert", "update", "delete", "\\'insert\\'",
                     "\\'upsert\\'", "\\'update\\'", "\\'delete\\'"]:
       p = ImpalaShell(vector)
-      p.send_cmd("with foo as "
-                 "(select * from functional.alltypestiny where string_col='%s') "
-                 "select * from foo limit 1" % keyword)
+      cmd = ("with foo as "
+             "(select * from functional.alltypestiny where string_col='%s') "
+             "select * from foo limit 1" % keyword)
+      p.send_cmd(cmd)
       result = p.get_result()
       assert "Fetched 0 row" in result.stderr
 
diff --git a/tests/shell/util.py b/tests/shell/util.py
index 64a2925..f0217e5 100755
--- a/tests/shell/util.py
+++ b/tests/shell/util.py
@@ -1,5 +1,5 @@
 #!/usr/bin/env impala-python
-# encoding=utf-8
+# -*- coding: utf-8 -*-
 #
 # Licensed to the Apache Software Foundation (ASF) under one
 # or more contributor license agreements.  See the NOTICE file
@@ -18,10 +18,12 @@
 # specific language governing permissions and limitations
 # under the License.
 
+import logging
 import os
 import pytest
 import re
 import shlex
+import sys
 import time
 from subprocess import Popen, PIPE
 
@@ -30,18 +32,75 @@ from tests.common.environ import (IMPALA_LOCAL_BUILD_VERSION,
 from tests.common.impala_test_suite import (IMPALAD_BEESWAX_HOST_PORT,
     IMPALAD_HS2_HOST_PORT, IMPALAD_HS2_HTTP_HOST_PORT)
 
+logging.basicConfig()
+LOG = logging.getLogger('tests/shell/util.py')
+LOG.addHandler(logging.StreamHandler())
 
 SHELL_HISTORY_FILE = os.path.expanduser("~/.impalahistory")
 IMPALA_HOME = os.environ['IMPALA_HOME']
 
-if ImpalaTestClusterProperties.get_instance().is_remote_cluster():
-  # With remote cluster testing, we cannot assume that the shell was built locally.
-  IMPALA_SHELL_EXECUTABLE = os.path.join(IMPALA_HOME, "bin/impala-shell.sh")
-else:
-  # Test the locally built shell distribution.
-  IMPALA_SHELL_EXECUTABLE = os.path.join(
-      IMPALA_HOME, "shell/build", "impala-shell-" + IMPALA_LOCAL_BUILD_VERSION,
-      "impala-shell")
+# Note that pytest.config.getoption is deprecated usage. We use this
+# in a couple of other places. Ultimately, it needs to be addressed if
+# we ever want to get off of pytest 2.9.2.
+IMPALA_SHELL_EXECUTABLE = pytest.config.getoption('shell_executable')
+
+if IMPALA_SHELL_EXECUTABLE is None:
+  if ImpalaTestClusterProperties.get_instance().is_remote_cluster():
+    # With remote cluster testing, we cannot assume that the shell was built locally.
+    IMPALA_SHELL_EXECUTABLE = os.path.join(IMPALA_HOME, "bin/impala-shell.sh")
+  else:
+    # Test the locally built shell distribution.
+    IMPALA_SHELL_EXECUTABLE = os.path.join(
+        IMPALA_HOME, "shell/build", "impala-shell-" + IMPALA_LOCAL_BUILD_VERSION,
+        "impala-shell")
+
+
+def get_python_version_for_shell_env():
+  """
+  Return the version of python belonging to the tested IMPALA_SHELL_EXECUTABLE.
+
+  We need this because some tests behave differently based on the version of
+  python being used to execute the impala-shell. However, since the test
+  framework itself is still being run with python2.7.x, sys.version_info
+  alone can't help us to determine the python version for the environment of
+  the shell executable. Instead, we have to invoke the shell, and then parse
+  the python version from the output. This information is present even in the
+  case of a fatal shell exception, e.g., not being unable to establish a
+  connection to an impalad.
+
+  Moreover, if the python version for the shell being executed is 3 or higher,
+  and USE_THRIFT11_GEN_PY is not True, the test run should exit. Using older
+  versions of thrift with python 3 will cause hard-to-triage failures because
+  older thrift gen-py files are not py3 compatible.
+  """
+  version_check = Popen([IMPALA_SHELL_EXECUTABLE, '-q', 'version()'],
+                        stdout=PIPE, stderr=PIPE)
+  stdout, stderr = version_check.communicate()
+
+  if "No module named \'ttypes\'" in stderr:
+    # If USE_THRIFT11_GEN_PY is not true, and the shell executable is from a
+    # python 3 (or higher) environment, this is the specific error message that
+    # we'll encounter, related to python3 not allowing implicit relative imports.
+    if os.getenv("USE_THRIFT11_GEN_PY") != "true":
+      sys.exit("If tested shell environment has python 3 or higher, "
+               "the USE_THRIFT11_GEN_PY env variable must be 'true'")
+
+  # e.g. Starting Impala with Kerberos authentication using Python 3.7.6
+  start_msg_line = stderr.split('\n')[0]
+  py_version = start_msg_line.split()[-1]   # e.g. 3.7.6
+  try:
+    major_version, minor_version, micro_version = py_version.split('.')
+    ret_val = int(major_version)
+  except (ValueError, UnboundLocalError) as e:
+    LOG.error(stderr)
+    sys.exit("Could not determine python version in shell env: {}".format(str(e)))
+
+  return ret_val
+
+
+# Since both test_shell_commandline and test_shell_interactive import from
+# this file, this check will be forced before any tests are run.
+SHELL_IS_PYTHON_2 = True if (get_python_version_for_shell_env() == 2) else False
 
 
 def assert_var_substitution(result):