Posted to commits@beam.apache.org by "RK (JIRA)" <ji...@apache.org> on 2018/01/27 02:32:00 UTC

[jira] [Comment Edited] (BEAM-3106) Consider not pinning all python dependencies, or moving them to requirements.txt

    [ https://issues.apache.org/jira/browse/BEAM-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341912#comment-16341912 ] 

RK edited comment on BEAM-3106 at 1/27/18 2:31 AM:
---------------------------------------------------

This can result in some difficult-to-pin-down errors in the Google Cloud Platform Python libraries. For example, on a clean virtualenv:
{code:java}
pip install google-cloud-storage
{code}
Now:
{code:java}
from google.cloud.storage import Client
print(Client().bucket("gcp-public-data-landsat")\
    .blob("LE00/PRE/001/049/LE07_L1TP_001049_20160215_20161015_01_T1/LE07_L1TP_001049_20160215_20161015_01_T1_ANG.txt")\
    .download_as_string())
{code}
That works as expected, but after installing apache_beam[gcp] the same code fails with a TypeError:
{code:java}
pip install apache_beam[gcp]
{code}
{code:java}
# same code as above
from google.cloud.storage import Client
print(Client().bucket("gcp-public-data-landsat")\
    .blob("LE00/PRE/001/049/LE07_L1TP_001049_20160215_20161015_01_T1/LE07_L1TP_001049_20160215_20161015_01_T1_ANG.txt")\
    .download_as_string())
# File "/Users/karbr001/Documents/tmp_gcs_test/env/lib/python2.7/site-packages/google_auth_httplib2.py", line 198, in request
#    uri, method, body=body, headers=request_headers, **kwargs)
# TypeError: request() got an unexpected keyword argument 'data'
{code}
[~altay] What's the temporary relief you mentioned? Is it just installing Beam with a custom setup.py file that points to bigquery 0.28?
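One way to narrow down an error like the one above is to record the installed versions of the packages involved before and after installing apache_beam[gcp], then compare the two listings. The following is only a sketch: the distribution names in PACKAGES are guesses at what might be relevant, not a confirmed list of what Beam pins, and it relies on importlib.metadata, which requires Python 3.8 or later (on the Python 2.7 environment shown in the traceback, pip freeze gives similar information).

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed list of distributions worth checking; adjust to your environment.
PACKAGES = [
    "apache-beam",
    "google-cloud-storage",
    "google-cloud-core",
    "google-auth",
    "google-auth-httplib2",
    "httplib2",
]

for name, version in sorted(installed_versions(PACKAGES).items()):
    if version is None:
        print("{} (not installed)".format(name))
    else:
        print("{}=={}".format(name, version))
```

Running this once in the clean virtualenv and once after the Beam install should show which of these packages changed version.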

> Consider not pinning all python dependencies, or moving them to requirements.txt
> --------------------------------------------------------------------------------
>
>                 Key: BEAM-3106
>                 URL: https://issues.apache.org/jira/browse/BEAM-3106
>             Project: Beam
>          Issue Type: Wish
>          Components: build-system
>    Affects Versions: 2.1.0
>         Environment: python
>            Reporter: Maximilian Roos
>            Priority: Major
>
> Currently all Python dependencies are [pinned or capped|https://github.com/apache/beam/blob/master/sdks/python/setup.py#L97].
> While there's a good argument for supplying a `requirements.txt` with well-tested dependencies, having them specified in `setup.py` forces them to an exact state on each install of Beam. This makes using Beam in any environment with other libraries nigh on impossible.
> This is particularly severe for the `gcp` dependencies, where we have libraries that won't work with an older version (but Beam _does_ work with a newer version). We have to do a bunch of gymnastics to get the correct versions installed because of this. Unfortunately, Airflow repeats this practice and conflicts on a number of dependencies, adding further complication (but, again, there is no real conflict).
> I haven't seen this practice outside of the Apache & Google ecosystem; for example, no libraries in numerical Python do this. Here's a [discussion on SO|https://stackoverflow.com/questions/28509481/should-i-pin-my-python-dependencies-versions].
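
The kind of conflict the issue describes can be sketched in a few lines: when two projects each pin the same dependency to a different exact version, no single install satisfies both, even if either version would work for both in practice. The package names and versions below are hypothetical and are not the real Beam or Airflow requirements.

```python
def pin_conflicts(pins_a, pins_b):
    """Return packages that both projects pin, but to different exact versions."""
    shared = set(pins_a) & set(pins_b)
    return {
        name: (pins_a[name], pins_b[name])
        for name in sorted(shared)
        if pins_a[name] != pins_b[name]
    }


# Hypothetical exact pins, for illustration only.
project_a = {"examplelib": "1.2.0", "otherlib": "0.9.1"}
project_b = {"examplelib": "1.3.0", "otherlib": "0.9.1"}

print(pin_conflicts(project_a, project_b))
```

With version ranges instead of exact pins, an installer would be free to pick any version inside the overlap of the two ranges, which is the flexibility the issue asks for.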


