Posted to commits@beam.apache.org by holdenk <gi...@git.apache.org> on 2017/08/27 08:05:18 UTC
[GitHub] beam pull request #3772: [WIP][BEAM-2784] Run python 2 to 3 migration and fi...
GitHub user holdenk opened a pull request:
https://github.com/apache/beam/pull/3772
[WIP][BEAM-2784] Run python 2 to 3 migration and fix resulting Python 2 errors
This is a WIP pull request that is the result of running the automatic Python 2 to 3 migration and then manually fixing the errors it introduced in Python 2.7 support. This does not mean the code is ready for Python 3 after this, although I am working on a follow-up for that in BEAM-1373.
Most of the 2/3 compatibility is handled with the future library, although in some places where we interact with libraries that use six to achieve 2/3 compatibility I've used six as well so that we can inter-operate.
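As a rough illustration of the pattern described above (a sketch, not code taken from this PR): the future library lets a module import the Python 3 style `bytes` and `str` names from `builtins`, so the same type checks can be written once for both interpreters. The helper name `to_wire_bytes` is hypothetical and exists only for this example. Where a dependency already uses six, the analogous checks would go through names such as `six.string_types` instead.

```python
from __future__ import absolute_import, print_function

from io import BytesIO

# With the `future` package installed this import is intended to work on
# Python 2 as well; on Python 3 it resolves to the standard library
# `builtins` module, so no extra dependency is needed there.
from builtins import bytes, str


def to_wire_bytes(value, encoding="utf-8"):
    """Hypothetical helper: return `value` as bytes, encoding text explicitly."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode(encoding)
    raise TypeError("expected text or bytes, got %s" % type(value).__name__)


# Binary payloads are collected in a BytesIO buffer rather than a text buffer.
buf = BytesIO()
buf.write(to_wire_bytes(u"caf\u00e9"))
buf.write(to_wire_bytes(b"!"))
payload = buf.getvalue()
```

Encoding explicitly at the boundary keeps the text/bytes distinction visible in one place, which is the kind of change several of the commits below describe (for example, encoding before putting a value into a pb2 message, or switching to BytesIO).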
This pull request is rather large, but because it involves changing the types of many things, isolating it into smaller chunks would be challenging.
This PR is tagged as WIP since I still need to go through it again by hand and review the automatic changes (just because the tests pass with the fixes doesn't mean it's all good), but if some of the other folks who care about BEAM on Py3 are around, I'd appreciate some idea of whether this is going in the direction we want. There are also a large number of style issues from the automated tooling still to fix.
In addition to the automated futurization, isort and limited autopep8 were also applied (and then manually modified, since we need to do some silly things to work between 2/3 while keeping raw and compatible objects around).
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/holdenk/beam BEAM-2784-py2t3-support-python2-still-squash-pandas-r2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/beam/pull/3772.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3772
----
commit 897fbac648a79222530a23a2a530dba08c294410
Author: Holden Karau <ho...@us.ibm.com>
Date: 2017-08-19T05:19:10Z
Initial first pass with futurize
commit 7527e21553d96bc39e469351dc6137b5f77dc928
Author: Holden Karau <ho...@us.ibm.com>
Date: 2017-08-19T05:21:18Z
Fix up the errors introduced by the automatic conversion. This is a squash of a somewhat long process to fix all of these:
Start manually fixing up the errors.
Explicitly depend on the future package
Attempt to fix encoder issues
List was doing nothing
Use has_attr func correctly
More fixes
Set default encoding to latin-1 in some places since otherwise we run into issues with using strings to hold codepoint above 128. Switch coder_impl to check for basestring
Future holden says thanks for basestring
Use Bytes IO for avroio
Try and fix assertItemsEqual py2 again
be explicit about newint
sys + b string
sys import
Explicit encode before putting into a pb2
Use BytesIO
Add missing import
Fix newint tuple magic
force positions to int
Add ignore unicode prefix annotation from Spark
Try assertItemsEqual again
Stream change
More coder fixes
Rewrite exception messages for newint to int so we can be consistent between Py2 and Py3 with the type inference
try converting to a string early idk
Progress around type hints being finicky now that we have future
Move _rewrite_typehint_string around, try and cleanup some builtin imports
Tentative: use basestring for str in Py2 which is a bit sketchy
Remove from builtins import str and minor fixes
Fix some more coding issues
Add an explicit test for the new coder, include the new method from the changed inheritance, add workaround for unicode.
Second missing next
Change some ptransform tests
Fix the unicode workaround import (oops)
Fix raising the error during retry
Ok so it "works" for the one test we've been iterating on but has lots of print debugging cause ohgod
Remove a bunch of cruft debugging added in and some things that took us down the wrong path as well with intermediate helper transform type annotations
Keep repr as is
Make SequenceTypeConstraint an instance of IndexableTypeConstraint. Add explicit tests in trivial_inference_test for indexing. Now at 1260 tests passing and 256 skipped :)
Small gcp related fixes
gcsio test fixes
gcsio and pylint fixes
Import order fix
autopep8 W391,W293,W291,E306,E305,E304,E303
We don't support 3.4 yet
Fix workaround for object magic
Ok so the underlying transfer library is using six so lets use it too
Ok make the tests more like normal
Add in missing return and factor out format result since long line
----