Posted to dev@beam.apache.org by Alex Amato <aj...@google.com> on 2020/08/13 22:19:33 UTC

Using --sdk_location with python fails with a TypeError

I was trying to use the --sdk_location parameter in a Python pipeline to
allow users to run a snapshot SDK, but it looks like it hit a TypeError
after downloading the .whl file.

Perhaps this code is assuming that remote files downloaded are text type,
not bytes type? Have I done something wrong? Or is this a bug? Any ideas?

Thanks for taking a look,
Alex

Using the --sdk_location parameter (Full command line
<https://paste.googleplex.com/5792777008840704>)
--sdk_location=
https://storage.googleapis.com/beam-wheels-staging/master/94f9e7fd4cae0f8aa6587d2cf14887f1c4827485-198203585/apache_beam-2.24.0.dev0-cp27-cp27m-macosx_10_9_x86_64.whl

INFO:apache_beam.runners.portability.stager:Failed to download Artifact
from
https://storage.googleapis.com/beam-wheels-staging/master/94f9e7fd4cae0f8aa6587d2cf14887f1c4827485-198203585/apache_beam-2.24.0.dev0-cp27-cp27m-macosx_10_9_x86_64.whl
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/examples/wordcount.py", line 142, in <module>
    run()
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/examples/wordcount.py", line 121, in run
    result = p.run()
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/pipeline.py", line 521, in run
    allow_proto_holders=True).run(False)
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/pipeline.py", line 534, in run
    return self.runner.run_pipeline(self, self._options)
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 479, in run_pipeline
    artifacts=environments.python_sdk_dependencies(options)))
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/transforms/environments.py", line 611, in python_sdk_dependencies
    staged_name in stager.Stager.create_job_resources(options, tmp_dir))
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/runners/portability/stager.py", line 235, in create_job_resources
    resources.extend(Stager._create_beam_sdk(sdk_remote_location, temp_dir))
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/runners/portability/stager.py", line 657, in _create_beam_sdk
    Stager._download_file(sdk_remote_location, local_download_file)
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/runners/portability/stager.py", line 375, in _download_file
    f.write(content)
TypeError: write() argument must be str, not bytes
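
For reference, the final error in the traceback is what Python 3 raises when bytes are written to a file opened in text mode. The snippet below is a minimal sketch of that behaviour only, not the Beam stager code; the file name, payload, and the try_write helper are made up for illustration.

```python
import os
import tempfile

# Hypothetical payload standing in for the bytes of a downloaded wheel.
content = b"PK\x03\x04 not a real wheel"


def try_write(path, mode, data):
    """Write data to path with the given mode; return the TypeError text or None."""
    try:
        with open(path, mode) as f:
            f.write(data)
    except TypeError as exc:
        return str(exc)
    return None


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sdk.whl")

# Text mode ("w") only accepts str, so writing bytes fails.
text_mode_error = try_write(path, "w", content)
# Binary mode ("wb") accepts bytes, so the same write succeeds.
binary_mode_error = try_write(path, "wb", content)

print(text_mode_error)    # write() argument must be str, not bytes
print(binary_mode_error)  # None
```

If the download path opens its destination file in text mode, this would explain the error regardless of which wheel or zip is passed.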

Re: Using --sdk_location with python fails with a TypeError

Posted by Valentyn Tymofieiev <va...@google.com>.
On Fri, Aug 14, 2020 at 10:52 AM Alex Amato <aj...@google.com> wrote:

> Thanks for the help. :).
>
> After downloading a file and passing that in as the --sdk_location flag I
> was able to start a Dataflow job and saw its worker logs state that it
> found the wheel
> [image: image.png]
>
> Though, the Dataflow UI states the SDK version I have installed on my
> machine where I launched the job.
>
This is working as intended (WAI). The SDK name is part of the job creation
request, which is created on your machine. The expectation is that the same
SDK version is used locally and remotely.


> I suspect that the Dataflow UI gets the version when the job is launched,
> and it's not inspecting the wheel for the version name. I suspect Dataflow
> just doesn't handle this case.
> [image: image.png]
>
> I think this will suffice for now. Thank you
>

Re: Using --sdk_location with python fails with a TypeError

Posted by Alex Amato <aj...@google.com>.
Thanks for the help. :).

After downloading a file and passing that in as the --sdk_location flag I
was able to start a Dataflow job and saw its worker logs state that it
found the wheel
[image: image.png]

However, the Dataflow UI shows the SDK version installed on the machine
where I launched the job.
I suspect that the Dataflow UI gets the version when the job is launched,
and it's not inspecting the wheel for the version name. I suspect Dataflow
just doesn't handle this case.
[image: image.png]

I think this will suffice for now. Thank you

On Thu, Aug 13, 2020 at 5:34 PM Valentyn Tymofieiev <va...@google.com>
wrote:

>
>
> On Thu, Aug 13, 2020 at 4:31 PM Alex Amato <aj...@google.com> wrote:
>
>> I changed the .whl I was passing in to:
>> --sdk_location=
>> https://storage.googleapis.com/beam-wheels-staging/master/699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/apache_beam-2.25.0.dev0-cp36-cp36m-macosx_10_9_x86_64.whl
>>
> Note that this is a macOS wheel, so it won't run on Dataflow. Dataflow
> requires a Linux wheel, such as cp36-cp36m-manylinux1_x86_64.whl.
>
>>
>>
>> and also tried
>>
>> --sdk_location=
>> https://storage.googleapis.com/beam-wheels-staging/master/699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/apache-beam-2.25.0.dev0.zip
>>
>>
>> python --version
>>
>> Python 3.6.8
>>
>> In both cases the same TypeError occurs.
>> https://paste.googleplex.com/6275630654029824
>>
>
> Looking closer, I see that you hit a Python 3 bug [1] in a codepath that
> is not exercised frequently, and a quick fix [2] shows that this codepath
> does not work for passing wheels [3].
>
> A workaround that should work is to download the file first, and then pass
> it in --sdk_location.
>
> Btw, the cost of passing a source distribution is 1-2 minutes of SDK
> installation time. To pass a wheel file, you need to pick the correct
> wheel, taking the Python version and target platform into account.
>
> [1] https://issues.apache.org/jira/browse/BEAM-10704
> [2] https://github.com/apache/beam/pull/12579
> [3] https://issues.apache.org/jira/browse/BEAM-10705

Re: Using --sdk_location with python fails with a TypeError

Posted by Valentyn Tymofieiev <va...@google.com>.
On Thu, Aug 13, 2020 at 4:31 PM Alex Amato <aj...@google.com> wrote:

> I changed the .whl I was passing in to:
> --sdk_location=
> https://storage.googleapis.com/beam-wheels-staging/master/699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/apache_beam-2.25.0.dev0-cp36-cp36m-macosx_10_9_x86_64.whl
>
Note that this is a macOS wheel, so it won't run on Dataflow. Dataflow
requires a Linux wheel, such as cp36-cp36m-manylinux1_x86_64.whl.

>
>
> and also tried
>
> --sdk_location=
> https://storage.googleapis.com/beam-wheels-staging/master/699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/apache-beam-2.25.0.dev0.zip
>
>
> python --version
>
> Python 3.6.8
>
> In both cases the same TypeError occurs.
> https://paste.googleplex.com/6275630654029824
>

Looking closer, I see that you hit a Python 3 bug [1] in a codepath that is
not exercised frequently, and a quick fix [2] shows that this codepath does
not work for passing wheels [3].

A workaround that should work is to download the file first, and then pass
it in --sdk_location.

Btw, the cost of passing a source distribution is 1-2 minutes of SDK
installation time. To pass a wheel file, you need to pick the correct
wheel, taking the Python version and target platform into account.

[1] https://issues.apache.org/jira/browse/BEAM-10704
[2] https://github.com/apache/beam/pull/12579
[3] https://issues.apache.org/jira/browse/BEAM-10705
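
A rough sketch of the download-first workaround, using only the Python 3 standard library. The helper names download_sdk and sdk_location_flag are made up for this illustration and are not part of Beam; the only Beam-specific piece is the --sdk_location flag itself.

```python
import os
import shutil
import urllib.request


def download_sdk(url, dest_dir):
    """Download url into dest_dir and return the path of the local copy."""
    local_path = os.path.join(dest_dir, os.path.basename(url))
    # The response body is bytes, so the destination is opened in binary mode.
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as f:
        shutil.copyfileobj(response, f)
    return local_path


def sdk_location_flag(local_path):
    """Build the --sdk_location argument for an already-downloaded file."""
    return "--sdk_location=" + local_path


# Example (needs network access; the URL is the source zip from this thread):
# path = download_sdk(
#     "https://storage.googleapis.com/beam-wheels-staging/master/"
#     "699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/"
#     "apache-beam-2.25.0.dev0.zip",
#     "/tmp")
# print(sdk_location_flag(path))
```

The resulting flag can then be added to the usual pipeline command line in place of the remote URL.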


Re: Using --sdk_location with python fails with a TypeError

Posted by Alex Amato <aj...@google.com>.
I changed the .whl I was passing in to:
--sdk_location=
https://storage.googleapis.com/beam-wheels-staging/master/699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/apache_beam-2.25.0.dev0-cp36-cp36m-macosx_10_9_x86_64.whl


and also tried

--sdk_location=
https://storage.googleapis.com/beam-wheels-staging/master/699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/apache-beam-2.25.0.dev0.zip


python --version

Python 3.6.8

In both cases the same TypeError occurs.
https://paste.googleplex.com/6275630654029824



On Thu, Aug 13, 2020 at 3:52 PM Valentyn Tymofieiev <va...@google.com>
wrote:

> You are passing a Python 2.7 wheel to a job that was launched on Python
> 3.6.
>
> You need to select the correct wheel for the platform or pass a source
> distribution (zip/tar.gz).

Re: Using --sdk_location with python fails with a TypeError

Posted by Valentyn Tymofieiev <va...@google.com>.
You are passing a Python 2.7 wheel to a job that was launched on Python 3.6.

You need to select the correct wheel for the platform or pass a source
distribution (zip/tar.gz).
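
As an illustration of what "correct wheel" means here, the sketch below reads the Python and platform tags out of a wheel file name, relying on the standard wheel naming convention (name-version-python tag-ABI tag-platform tag). The helper names are hypothetical, the "manylinux" prefix for the worker platform is an assumption, and compressed tag sets such as py2.py3 are not handled.

```python
import sys


def wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) parsed from a wheel file name."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = filename[:-len(".whl")].split("-")
    # The last three dash-separated fields are the python, ABI and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def wheel_matches(filename, python_tag, platform_prefix):
    """Check a wheel against the wanted python tag and platform prefix."""
    tag, _, platform = wheel_tags(filename)
    return tag == python_tag and platform.startswith(platform_prefix)


# Tag of the interpreter running this script, e.g. "cp36" on CPython 3.6.
launcher_tag = "cp%d%d" % sys.version_info[:2]

wheel = "apache_beam-2.24.0.dev0-cp27-cp27m-macosx_10_9_x86_64.whl"
print(wheel_tags(wheel))
print(wheel_matches(wheel, "cp36", "manylinux"))
```

For the wheel from the original message this reports a cp27 Python tag and a macOS platform tag, so it does not match a cp36 launcher.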
