Posted to commits@beam.apache.org by holdenk <gi...@git.apache.org> on 2017/08/27 08:05:18 UTC

[GitHub] beam pull request #3772: [WIP][BEAM-2784] Run python 2 to 3 migration and fi...

GitHub user holdenk opened a pull request:

    https://github.com/apache/beam/pull/3772

    [WIP][BEAM-2784] Run python 2 to 3 migration and fix resulting Python 2 errors

    This is a WIP pull request that is the result of running the automated Python 2 to 3 migration and then manually fixing the Python 2.7 errors that the automatic conversion introduced. This does not mean the code is ready for Python 3 after this, although I am working on a follow-up for that in BEAM-1373.
    
    Most of the 2/3 compatibility is handled with the future library, although in some places where we interact with libraries that use six to achieve 2/3 compatibility I've used six as well so that we can inter-operate.
    
    This pull request is rather large, but because it involves changing the types of many things, isolating it into smaller chunks could prove difficult.
    
    This PR is tagged as WIP since I still need to go through it again by hand and review the automatic changes (just because the tests pass with the fixes doesn't mean it's all good), but if some of the other folks who care about Beam on Py3 are around I'd appreciate some idea of whether this is going in the direction we want. There are also a large number of style issues from the automated tooling still to fix.
    
    In addition to the automated futurization, isort and a limited autopep8 pass were also applied (and then manually modified, since we need to do some silly things to work between 2/3 while keeping raw and compatible objects around).
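
As a rough illustration of the six-style 2/3 compatibility mentioned in the description above, here is a minimal sketch. It is not code from the pull request: the helper names `is_text_like` and `to_bytes` are hypothetical, and the fallback branch exists only so the sketch runs when six is not installed.

```python
import sys

try:
    # Assumption: six is available, as it is wherever the libraries the
    # description mentions are installed.
    import six
    string_types = six.string_types
    binary_type = six.binary_type
except ImportError:
    # Fallback so this sketch still runs without six.
    if sys.version_info[0] >= 3:
        string_types = (str,)
        binary_type = bytes
    else:
        string_types = (basestring,)  # noqa: F821 (Python 2 only)
        binary_type = str


def is_text_like(value):
    """Hypothetical helper: the usual replacement for a Py2 basestring check."""
    return isinstance(value, string_types)


def to_bytes(value, encoding="utf-8"):
    """Hypothetical helper: encode text explicitly, pass bytes through as-is."""
    if isinstance(value, binary_type):
        return value
    return value.encode(encoding)
```

Checking against `string_types` rather than `str` keeps one code path for both interpreters, and encoding explicitly avoids relying on the interpreter's default encoding.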

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/holdenk/beam BEAM-2784-py2t3-support-python2-still-squash-pandas-r2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/3772.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3772
    
----
commit 897fbac648a79222530a23a2a530dba08c294410
Author: Holden Karau <ho...@us.ibm.com>
Date:   2017-08-19T05:19:10Z

    Initial first pass with futurize

commit 7527e21553d96bc39e469351dc6137b5f77dc928
Author: Holden Karau <ho...@us.ibm.com>
Date:   2017-08-19T05:21:18Z

    Fix up the errors introduced by the automatic conversion. This is a squash of a somewhat long process to fix all of these:
    
    Start manually fixing up the errors.
    
    Explicitly depend on the future package
    
    Attempt to fix encoder issues
    
    List was doing nothing
    
    Use has_attr func correctly
    
    More fixes
    
    Set default encoding to latin-1 in some places since otherwise we run into issues with using strings to hold codepoint above 128. Switch coder_impl to check for basestring
    
    Future holden says thanks for basestring
    
    Use Bytes IO for avroio
    
    Try and fix assertItemsEqual py2 again
    
    be explicit about newint
    
    sys + b string
    
    sys import
    
    Explicit encode before putting into a pb2
    
    Use BytesIO
    
    Add missing import
    
    Fix newint tuple magic
    
    force positions to int
    
    Add ignore unicode prefix annotation from Spark
    
    Try assertItemsEqual again
    
    Stream change
    
    More coder fixes
    
    Rewrite exception messages for newint to int so we can be consistent between Py2 and Py3 with the type inference
    
    try converting to a string early idk
    
    Progress around type hints being finicky now that we have future
    
    Move _rewrite_typehint_string around, try and cleanup some builtin imports
    
    Tentative: use basestring for str in Py2 which is a bit sketchy
    
    Remove from builtins import str and minor fixes
    
    Fix some more coding issues
    
    Add an explicit test for the new coder, include the new method from the changed inheritance, add workaround for unicode.
    
    Second missing next
    
    Change some ptransform tests
    
    Fix the unicode workaround import (oops)
    
    Fix raising the error during retry
    
    Ok so it "works" for the one test we've been iterating on but has lots of print debugging cause ohgod
    
    Remove a bunch of cruft debugging added in and some things that took us down the wrong path as well with intermediate helper transform type annotations
    
    Keep repr as is
    
    Make SequenceTypeConstraint an instance of IndexableTypeConstraint. Add explicit tests in trivial_inference_test for indexing. Now at 1260 tests passing and 256 skipped :)
    
    Small gcp related fixes
    
    gcsio test fixes
    
    gcsio and pylint fixes
    
    Import order fix
    
    autopep8 W391,W293,W291,E306,E305,E304,E303
    
    We don't support 3.4 yet
    
    Fix workaround for object magic
    
    Ok so the underlying transfer library is using six so lets use it too
    
    Ok make the tests more like normal
    
    Add in missing return and factor out format result since long line

----
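
Two of the commit notes above mention latin-1 encoding and BytesIO. The sketch below shows, under stated assumptions, why those choices help when arbitrary byte values have to pass through string and stream APIs. It is an illustration only, not code from the pull request; `ALL_BYTES`, `roundtrip_latin1`, and `read_back` are hypothetical names.

```python
import io

# Every possible byte value, built in a way that works on Python 2 and 3.
ALL_BYTES = bytes(bytearray(range(256)))


def roundtrip_latin1(data):
    """Decode bytes to text and back; latin-1 maps byte N to codepoint N."""
    return data.decode("latin-1").encode("latin-1")


def read_back(data):
    """Write bytes to an in-memory binary stream and read them back."""
    buf = io.BytesIO()
    buf.write(data)
    buf.seek(0)
    return buf.read()


def ascii_decode_fails(data):
    """Return True if decoding as ASCII raises, as it does for bytes >= 128."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return True
    return False
```

latin-1 round-trips all 256 byte values without loss, whereas ASCII rejects anything at or above 128. `io.BytesIO` accepts only bytes, which keeps binary data such as Avro blocks from being mixed with text by accident.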

