Posted to dev@spark.apache.org by Marcelo Vanzin <va...@cloudera.com.INVALID> on 2019/11/08 17:30:53 UTC

dev/merge_spark_pr.py broken on python 2

Hey all,

Something broke that script when running with python 2.

I know we want to deprecate python 2, but in that case, scripts should
at least be changed to use "python3" in the shebang line...
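
For illustration only, a minimal sketch of what that could look like: a
"python3" shebang plus a fail-fast version check. The helper name
require_python3 is hypothetical and is not taken from merge_spark_pr.py.

```python
#!/usr/bin/env python3
# Sketch only: the shebang above asks for a Python 3 interpreter, and the
# check below gives a clear message if the script is still launched
# explicitly with an older one (e.g. "python2 script.py").
import sys


def require_python3(version_info=sys.version_info):
    """Return an error message if running on Python < 3, else None.

    `require_python3` is a hypothetical helper used for illustration.
    """
    if version_info[0] < 3:
        return "This script requires Python 3; found %d.%d" % (
            version_info[0], version_info[1])
    return None


if __name__ == "__main__":
    error = require_python3()
    if error:
        sys.exit(error)
```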

-- 
Marcelo

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org


Re: dev/merge_spark_pr.py broken on python 2

Posted by Hyukjin Kwon <gu...@gmail.com>.
Yeah, let's stick to Python 3 in general.
I plan to drop Python 2 completely right after the Spark 3.0 release.

As for the exception you hit: it seems run_cmd now produces unicode instead
of bytes in Python 2 with the merge script. Later, that unicode seems to be
implicitly converted to bytes by %-formatting. IIRC the implicit conversion
uses the default encoding, which is ascii in Python 2.
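
A rough sketch of that failure mode, under some assumptions: the author
string below is made up, and the encode step is written out explicitly
(Python 2 would do it implicitly), so the snippet also runs on Python 3.
This is not the code from merge_spark_pr.py.

```python
# Hypothetical author string containing u'\xf8', the character from the
# traceback. The name and address are invented for illustration.
author = u"J\xf8rgen Example <jorgen@example.com>"


def to_ascii_bytes(text):
    # Mirrors what Python 2 does implicitly when unicode has to become
    # bytes: encode with the default codec, which is ascii there.
    return text.encode("ascii")


def to_utf8_bytes(text):
    # Encoding explicitly with UTF-8 handles non-ASCII characters.
    return text.encode("utf-8")


try:
    to_ascii_bytes(author)
except UnicodeEncodeError as e:
    print("ascii failed at position %d: %s" % (e.start, e.reason))

print(repr(to_utf8_bytes(author)))
```

Encoding explicitly with UTF-8 at the boundary (or simply running under
Python 3, where str is already unicode) avoids the ascii fallback.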


On Sat, 9 Nov 2019, 03:32 Marcelo Vanzin, <va...@cloudera.com.invalid> wrote:

> I remember merging PRs with non-ascii chars in the past...
>
> Anyway, for these scripts, might be easier to just use python3 for
> everything, instead of trying to keep them working on two different
> versions.
>
> On Fri, Nov 8, 2019 at 10:28 AM Sean Owen <sr...@gmail.com> wrote:
> >
> > Ah OK. I think it's the same type of issue that the last change
> > actually was trying to fix for Python 2. Here it seems like the author
> > name might have non-ASCII chars?
> > I don't immediately know enough to know how to resolve that for Python
> > 2. Something with how raw_input works, I take it. You could 'fix' the
> > author name if that's the case, or just use python 3.
> >
> > On Fri, Nov 8, 2019 at 12:20 PM Marcelo Vanzin <va...@cloudera.com> wrote:
> > >
> > > Something related to non-ASCII characters. Worked fine with python 3.
> > >
> > > git branch -D PR_TOOL_MERGE_PR_26426_MASTER
> > > Traceback (most recent call last):
> > >   File "./dev/merge_spark_pr.py", line 577, in <module>
> > >     main()
> > >   File "./dev/merge_spark_pr.py", line 552, in main
> > >     merge_hash = merge_pr(pr_num, target_ref, title, body, pr_repo_desc)
> > >   File "./dev/merge_spark_pr.py", line 147, in merge_pr
> > >     distinct_authors[0])
> > > UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in
> > > position 65: ordinal not in range(128)
> > > M       docs/running-on-kubernetes.md
> > > Already on 'master'
> > > Your branch is up to date with 'apache-github/master'.
> > > error: cannot pull with rebase: Your index contains uncommitted changes.
> > > error: please commit or stash them.
> > >
> > > On Fri, Nov 8, 2019 at 10:17 AM Sean Owen <sr...@gmail.com> wrote:
> > > >
> > > > Hm, the last change was on Oct 1, and should have actually helped it
> > > > still work with Python 2:
> > > >
> > > > https://github.com/apache/spark/commit/2ec3265ae76fc1e136e44c240c476ce572b679df#diff-c321b6c82ebb21d8fd225abea9b7b74c
> > > >
> > > > Hasn't otherwise changed in a while. What's the error?
> > > >
> > > > On Fri, Nov 8, 2019 at 11:37 AM Marcelo Vanzin
> > > > <va...@cloudera.com.invalid> wrote:
> > > > >
> > > > > Hey all,
> > > > >
> > > > > Something broke that script when running with python 2.
> > > > >
> > > > > I know we want to deprecate python 2, but in that case, scripts should
> > > > > at least be changed to use "python3" in the shebang line...
> > > > >
> > > > > --
> > > > > Marcelo
> > > > >
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
> > > > >
> > >
> > >
> > >
> > > --
> > > Marcelo
>
>
>
> --
> Marcelo
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>
>

Re: dev/merge_spark_pr.py broken on python 2

Posted by Marcelo Vanzin <va...@cloudera.com.INVALID>.
I remember merging PRs with non-ascii chars in the past...

Anyway, for these scripts, might be easier to just use python3 for
everything, instead of trying to keep them working on two different
versions.

On Fri, Nov 8, 2019 at 10:28 AM Sean Owen <sr...@gmail.com> wrote:
>
> Ah OK. I think it's the same type of issue that the last change
> actually was trying to fix for Python 2. Here it seems like the author
> name might have non-ASCII chars?
> I don't immediately know enough to know how to resolve that for Python
> 2. Something with how raw_input works, I take it. You could 'fix' the
> author name if that's the case, or just use python 3.
>
> On Fri, Nov 8, 2019 at 12:20 PM Marcelo Vanzin <va...@cloudera.com> wrote:
> >
> > Something related to non-ASCII characters. Worked fine with python 3.
> >
> > git branch -D PR_TOOL_MERGE_PR_26426_MASTER
> > Traceback (most recent call last):
> >   File "./dev/merge_spark_pr.py", line 577, in <module>
> >     main()
> >   File "./dev/merge_spark_pr.py", line 552, in main
> >     merge_hash = merge_pr(pr_num, target_ref, title, body, pr_repo_desc)
> >   File "./dev/merge_spark_pr.py", line 147, in merge_pr
> >     distinct_authors[0])
> > UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in
> > position 65: ordinal not in range(128)
> > M       docs/running-on-kubernetes.md
> > Already on 'master'
> > Your branch is up to date with 'apache-github/master'.
> > error: cannot pull with rebase: Your index contains uncommitted changes.
> > error: please commit or stash them.
> >
> > On Fri, Nov 8, 2019 at 10:17 AM Sean Owen <sr...@gmail.com> wrote:
> > >
> > > Hm, the last change was on Oct 1, and should have actually helped it
> > > still work with Python 2:
> > > https://github.com/apache/spark/commit/2ec3265ae76fc1e136e44c240c476ce572b679df#diff-c321b6c82ebb21d8fd225abea9b7b74c
> > >
> > > Hasn't otherwise changed in a while. What's the error?
> > >
> > > On Fri, Nov 8, 2019 at 11:37 AM Marcelo Vanzin
> > > <va...@cloudera.com.invalid> wrote:
> > > >
> > > > Hey all,
> > > >
> > > > Something broke that script when running with python 2.
> > > >
> > > > I know we want to deprecate python 2, but in that case, scripts should
> > > > at least be changed to use "python3" in the shebang line...
> > > >
> > > > --
> > > > Marcelo
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
> > > >
> >
> >
> >
> > --
> > Marcelo



-- 
Marcelo



Re: dev/merge_spark_pr.py broken on python 2

Posted by Sean Owen <sr...@gmail.com>.
Ah OK. I think it's the same type of issue that the last change
actually was trying to fix for Python 2. Here it seems like the author
name might have non-ASCII chars?
I don't immediately know enough to know how to resolve that for Python
2. Something with how raw_input works, I take it. You could 'fix' the
author name if that's the case, or just use python 3.

On Fri, Nov 8, 2019 at 12:20 PM Marcelo Vanzin <va...@cloudera.com> wrote:
>
> Something related to non-ASCII characters. Worked fine with python 3.
>
> git branch -D PR_TOOL_MERGE_PR_26426_MASTER
> Traceback (most recent call last):
>   File "./dev/merge_spark_pr.py", line 577, in <module>
>     main()
>   File "./dev/merge_spark_pr.py", line 552, in main
>     merge_hash = merge_pr(pr_num, target_ref, title, body, pr_repo_desc)
>   File "./dev/merge_spark_pr.py", line 147, in merge_pr
>     distinct_authors[0])
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in
> position 65: ordinal not in range(128)
> M       docs/running-on-kubernetes.md
> Already on 'master'
> Your branch is up to date with 'apache-github/master'.
> error: cannot pull with rebase: Your index contains uncommitted changes.
> error: please commit or stash them.
>
> On Fri, Nov 8, 2019 at 10:17 AM Sean Owen <sr...@gmail.com> wrote:
> >
> > Hm, the last change was on Oct 1, and should have actually helped it
> > still work with Python 2:
> > https://github.com/apache/spark/commit/2ec3265ae76fc1e136e44c240c476ce572b679df#diff-c321b6c82ebb21d8fd225abea9b7b74c
> >
> > Hasn't otherwise changed in a while. What's the error?
> >
> > On Fri, Nov 8, 2019 at 11:37 AM Marcelo Vanzin
> > <va...@cloudera.com.invalid> wrote:
> > >
> > > Hey all,
> > >
> > > Something broke that script when running with python 2.
> > >
> > > I know we want to deprecate python 2, but in that case, scripts should
> > > at least be changed to use "python3" in the shebang line...
> > >
> > > --
> > > Marcelo
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
> > >
>
>
>
> --
> Marcelo



Re: dev/merge_spark_pr.py broken on python 2

Posted by Marcelo Vanzin <va...@cloudera.com.INVALID>.
Something related to non-ASCII characters. Worked fine with python 3.

git branch -D PR_TOOL_MERGE_PR_26426_MASTER
Traceback (most recent call last):
  File "./dev/merge_spark_pr.py", line 577, in <module>
    main()
  File "./dev/merge_spark_pr.py", line 552, in main
    merge_hash = merge_pr(pr_num, target_ref, title, body, pr_repo_desc)
  File "./dev/merge_spark_pr.py", line 147, in merge_pr
    distinct_authors[0])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in
position 65: ordinal not in range(128)
M       docs/running-on-kubernetes.md
Already on 'master'
Your branch is up to date with 'apache-github/master'.
error: cannot pull with rebase: Your index contains uncommitted changes.
error: please commit or stash them.

On Fri, Nov 8, 2019 at 10:17 AM Sean Owen <sr...@gmail.com> wrote:
>
> Hm, the last change was on Oct 1, and should have actually helped it
> still work with Python 2:
> https://github.com/apache/spark/commit/2ec3265ae76fc1e136e44c240c476ce572b679df#diff-c321b6c82ebb21d8fd225abea9b7b74c
>
> Hasn't otherwise changed in a while. What's the error?
>
> On Fri, Nov 8, 2019 at 11:37 AM Marcelo Vanzin
> <va...@cloudera.com.invalid> wrote:
> >
> > Hey all,
> >
> > Something broke that script when running with python 2.
> >
> > I know we want to deprecate python 2, but in that case, scripts should
> > at least be changed to use "python3" in the shebang line...
> >
> > --
> > Marcelo
> >
> > ---------------------------------------------------------------------
> > To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
> >



-- 
Marcelo



Re: dev/merge_spark_pr.py broken on python 2

Posted by Sean Owen <sr...@gmail.com>.
Hm, the last change was on Oct 1, and should have actually helped it
still work with Python 2:
https://github.com/apache/spark/commit/2ec3265ae76fc1e136e44c240c476ce572b679df#diff-c321b6c82ebb21d8fd225abea9b7b74c

Hasn't otherwise changed in a while. What's the error?

On Fri, Nov 8, 2019 at 11:37 AM Marcelo Vanzin
<va...@cloudera.com.invalid> wrote:
>
> Hey all,
>
> Something broke that script when running with python 2.
>
> I know we want to deprecate python 2, but in that case, scripts should
> at least be changed to use "python3" in the shebang line...
>
> --
> Marcelo
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>
