Posted to commits@spark.apache.org by gu...@apache.org on 2017/11/07 10:45:38 UTC
spark git commit: [SPARK-22376][TESTS] Makes dev/run-tests.py script compatible with Python 3

Repository: spark
Updated Branches:
  refs/heads/master ed1478cfe -> 160a54061


[SPARK-22376][TESTS] Makes dev/run-tests.py script compatible with Python 3

## What changes were proposed in this pull request?

This PR proposes to fix the `dev/run-tests.py` script so that it supports Python 3.

Here is some background, to the best of my knowledge.

In Python 2,
- `unicode` is NOT `str` (`type("foo") != type(u"foo")`).
- `str` has an alias, `bytes` (`type("foo") == type(b"foo")`).

In Python 3,
- `unicode` was (roughly) replaced by `str` (`type("foo") == type(u"foo")`).
- `str` is NOT `bytes` (`type("foo") != type(b"foo")`).
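
The type relationships listed above can be checked with a short script. This is only an illustrative sketch and not part of the patch; the variable names are made up for the example, and it assumes no `unicode_literals` future import is in effect.

```python
import sys

# Illustrative names only; none of these appear in dev/run-tests.py.
plain, text, binary = "foo", u"foo", b"foo"

str_is_unicode = type(plain) == type(text)
str_is_bytes = type(plain) == type(binary)

if sys.version_info[0] >= 3:
    # Python 3: a plain literal is text, and bytes is a separate type.
    assert str_is_unicode and not str_is_bytes
else:
    # Python 2: a plain literal is a byte string, and unicode is separate.
    assert not str_is_unicode and str_is_bytes
```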

So, this PR fixes:

  1. Use `b''` instead of `''` so that both `str` in Python 2 and `bytes` in Python 3 can be handled. `sbt_proc.stdout.readline()` returns `str` (which has an alias, `bytes`) in Python 2 and `bytes` in Python 3.

  2. Similarly, use `b''` instead of `''` for the regular expression pattern so that both `str` in Python 2 and `bytes` in Python 3 can be handled. `re.compile` with a `str` pattern does not seem to support matching `bytes` in Python 3.
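
Both fixes can be reproduced without running sbt. The sketch below is an assumption-laden stand-in, not the patched code: `fake_stdout` is a hypothetical `io.BytesIO` replacing `sbt_proc.stdout`, and the pattern is a simplified version of the real filter. In Python 3, `b'' != ''`, so a `''` sentinel would never stop the loop at end of stream.

```python
import io
import re

# Hypothetical stand-in for sbt_proc.stdout: a binary stream, like the pipe
# created by subprocess.Popen(..., stdout=subprocess.PIPE).
fake_stdout = io.BytesIO(b"[info] Resolving foo\n[info] Compiling bar\n")

# Simplified filter; a bytes pattern is needed to match bytes lines.
output_filter = re.compile(b"^.*Resolving")

kept = []
# readline() returns b"" at end of stream, so b"" is the sentinel that
# stops iter() in both Python 2 and Python 3.
for line in iter(fake_stdout.readline, b""):
    if not output_filter.match(line):
        kept.append(line)

# In Python 3, applying a str pattern to a bytes line raises TypeError.
try:
    re.compile("^.*Resolving").match(b"[info] Resolving foo\n")
    str_pattern_failed = False
except TypeError:
    str_pattern_failed = True
```

Here `kept` ends up holding only the line that the filter does not match, and `str_pattern_failed` is true under Python 3 but false under Python 2, where `str` and `bytes` are the same type.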

As far as I know, this change is in fact recommended by https://docs.python.org/3/howto/pyporting.html#text-versus-binary-data:

> Mark all binary literals with a b prefix, textual literals with a u prefix

## How was this patch tested?

I manually tested this with Python 3, with a few additional changes to reduce the elapsed time.

Author: hyukjinkwon <gu...@gmail.com>

Closes #19665 from HyukjinKwon/SPARK-22376.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/160a5406
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/160a5406
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/160a5406

Branch: refs/heads/master
Commit: 160a540610051ee3233ea24102533b08b69f03fc
Parents: ed1478c
Author: hyukjinkwon <gu...@gmail.com>
Authored: Tue Nov 7 19:45:34 2017 +0900
Committer: hyukjinkwon <gu...@gmail.com>
Committed: Tue Nov 7 19:45:34 2017 +0900

----------------------------------------------------------------------
 dev/run-tests.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/160a5406/dev/run-tests.py
----------------------------------------------------------------------
diff --git a/dev/run-tests.py b/dev/run-tests.py
index 72d148d..ef0e788 100755
--- a/dev/run-tests.py
+++ b/dev/run-tests.py
@@ -276,9 +276,9 @@ def exec_sbt(sbt_args=()):
 
     sbt_cmd = [os.path.join(SPARK_HOME, "build", "sbt")] + sbt_args
 
-    sbt_output_filter = re.compile("^.*[info].*Resolving" + "|" +
-                                   "^.*[warn].*Merging" + "|" +
-                                   "^.*[info].*Including")
+    sbt_output_filter = re.compile(b"^.*[info].*Resolving" + b"|" +
+                                   b"^.*[warn].*Merging" + b"|" +
+                                   b"^.*[info].*Including")
 
     # NOTE: echo "q" is needed because sbt on encountering a build file
     # with failure (either resolution or compilation) prompts the user for
@@ -289,7 +289,7 @@ def exec_sbt(sbt_args=()):
                                 stdin=echo_proc.stdout,
                                 stdout=subprocess.PIPE)
     echo_proc.wait()
-    for line in iter(sbt_proc.stdout.readline, ''):
+    for line in iter(sbt_proc.stdout.readline, b''):
         if not sbt_output_filter.match(line):
             print(line, end='')
     retcode = sbt_proc.wait()

