Posted to dev@beam.apache.org by Alex Amato <aj...@google.com> on 2020/08/13 22:19:33 UTC
Using --sdk_location with python fails with a TypeError
I was trying to use the --sdk_location parameter in a Python pipeline to
allow users to run a snapshot SDK, but it looks like it hit a TypeError
after downloading the .whl file.
Perhaps this code assumes that downloaded remote files are text,
not bytes? Have I done something wrong, or is this a bug? Any ideas?
Thanks for taking a look,
Alex
Using the --sdk_location parameter (Full command line
<https://paste.googleplex.com/5792777008840704>):
--sdk_location=https://storage.googleapis.com/beam-wheels-staging/master/94f9e7fd4cae0f8aa6587d2cf14887f1c4827485-198203585/apache_beam-2.24.0.dev0-cp27-cp27m-macosx_10_9_x86_64.whl
INFO:apache_beam.runners.portability.stager:Failed to download Artifact from https://storage.googleapis.com/beam-wheels-staging/master/94f9e7fd4cae0f8aa6587d2cf14887f1c4827485-198203585/apache_beam-2.24.0.dev0-cp27-cp27m-macosx_10_9_x86_64.whl
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/examples/wordcount.py", line 142, in <module>
    run()
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/examples/wordcount.py", line 121, in run
    result = p.run()
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/pipeline.py", line 521, in run
    allow_proto_holders=True).run(False)
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/pipeline.py", line 534, in run
    return self.runner.run_pipeline(self, self._options)
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 479, in run_pipeline
    artifacts=environments.python_sdk_dependencies(options)))
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/transforms/environments.py", line 611, in python_sdk_dependencies
    staged_name in stager.Stager.create_job_resources(options, tmp_dir))
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/runners/portability/stager.py", line 235, in create_job_resources
    resources.extend(Stager._create_beam_sdk(sdk_remote_location, temp_dir))
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/runners/portability/stager.py", line 657, in _create_beam_sdk
    Stager._download_file(sdk_remote_location, local_download_file)
  File "/Users/ajamato/beam/beam-sdk-download-test/venv/lib/python3.6/site-packages/apache_beam/runners/portability/stager.py", line 375, in _download_file
    f.write(content)
TypeError: write() argument must be str, not bytes
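The TypeError at the bottom of the traceback is what Python 3 raises when bytes are written to a file opened in text mode. A minimal sketch of that behaviour, independent of Beam, might look like the following. The helper name write_payload and the sample bytes are hypothetical, and this is not the code in stager.py:

```python
import os
import tempfile


def write_payload(path, content, mode):
    """Write content to path using the given open() mode."""
    with open(path, mode) as f:
        f.write(content)


# Stand-in for the bytes returned by an HTTP download (hypothetical data).
payload = b"PK\x03\x04 not really a wheel"
target = os.path.join(tempfile.mkdtemp(), "sdk.whl")

try:
    # Text mode: Python 3 refuses to write a bytes object.
    write_payload(target, payload, "w")
    text_mode_error = None
except TypeError as exc:
    text_mode_error = str(exc)

# Binary mode: the same bytes are written unchanged.
write_payload(target, payload, "wb")
with open(target, "rb") as f:
    round_tripped = f.read()

print(text_mode_error)
```

On Python 2 the text-mode write would have succeeded because str and bytes were the same type, which would be consistent with a code path that only breaks on Python 3.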
Re: Using --sdk_location with python fails with a TypeError
Posted by Valentyn Tymofieiev <va...@google.com>.
On Fri, Aug 14, 2020 at 10:52 AM Alex Amato <aj...@google.com> wrote:
> Thanks for the help. :).
>
> After downloading a file and passing that in as the --sdk_location flag I
> was able to start a Dataflow job and saw its worker logs state that it
> found the wheel
> [image: image.png]
>
> Though, the Dataflow UI states the SDK version I have installed on my
> machine where I launched the job.
>
This is working as intended (WAI). The SDK name is part of the job creation
request, which is created on your machine. The expectation is that the same
SDK version is used locally and remotely.
> I suspect that the Dataflow UI gets the version when the job is launched,
> and it's not inspecting the wheel for the version name. I suspect Dataflow
> just doesn't handle this case.
> [image: image.png]
>
> I think this will suffice for now. Thank you
>
Re: Using --sdk_location with python fails with a TypeError
Posted by Alex Amato <aj...@google.com>.
Thanks for the help. :).
After downloading a file and passing that in as the --sdk_location flag I
was able to start a Dataflow job and saw its worker logs state that it
found the wheel
[image: image.png]
Though, the Dataflow UI states the SDK version I have installed on my
machine where I launched the job.
I suspect that the Dataflow UI gets the version when the job is launched,
and it's not inspecting the wheel for the version name. I suspect Dataflow
just doesn't handle this case.
[image: image.png]
I think this will suffice for now. Thank you
Re: Using --sdk_location with python fails with a TypeError
Posted by Valentyn Tymofieiev <va...@google.com>.
On Thu, Aug 13, 2020 at 4:31 PM Alex Amato <aj...@google.com> wrote:
> I changed the .whl I was passing in to:
> --sdk_location=https://storage.googleapis.com/beam-wheels-staging/master/699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/apache_beam-2.25.0.dev0-cp36-cp36m-macosx_10_9_x86_64.whl
>
Note that this is a macOS wheel, so it won't run on Dataflow. Dataflow
requires a Linux wheel, such as cp36-cp36m-manylinux1_x86_64.whl.
>
>
> and also tried
>
> --sdk_location=https://storage.googleapis.com/beam-wheels-staging/master/699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/apache-beam-2.25.0.dev0.zip
>
>
> python --version
>
> Python 3.6.8
>
> In both cases the same TypeError occurs.
> https://paste.googleplex.com/6275630654029824
>
Looking closer, I see that you hit a Python 3 bug[1] in a codepath that is
not exercised frequently, and a quick fix[2] shows that this codepath does
not work for passing wheels [3].
A workaround that should work is to download the file first, and then pass
it in --sdk_location.
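As a rough sketch of that workaround: fetch the SDK archive yourself, write it in binary mode, and hand the resulting local path to --sdk_location. The function names download_sdk and sdk_location_flag below are hypothetical helpers, not part of Beam:

```python
import os
import shutil
import urllib.parse
import urllib.request


def download_sdk(url, dest_dir):
    """Download url into dest_dir in binary mode and return the local path."""
    filename = os.path.basename(urllib.parse.urlparse(url).path)
    local_path = os.path.join(dest_dir, filename)
    # "wb" keeps the downloaded bytes as bytes, avoiding the text-mode error.
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return local_path


def sdk_location_flag(local_path):
    """Build the pipeline argument that points at the downloaded file."""
    return "--sdk_location=" + local_path
```

Calling download_sdk with the .zip or .whl URL and a scratch directory, then appending sdk_location_flag(path) to the pipeline arguments, should avoid the remote download code path entirely.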
Btw, passing a source distribution costs 1-2 minutes of SDK installation
time. To pass a wheel instead, you need to pick the correct wheel for the
Python version and target platform.
[1] https://issues.apache.org/jira/browse/BEAM-10704
[2] https://github.com/apache/beam/pull/12579
[3] https://issues.apache.org/jira/browse/BEAM-10705
Re: Using --sdk_location with python fails with a TypeError
Posted by Alex Amato <aj...@google.com>.
I changed the .whl I was passing in to:
--sdk_location=https://storage.googleapis.com/beam-wheels-staging/master/699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/apache_beam-2.25.0.dev0-cp36-cp36m-macosx_10_9_x86_64.whl
and also tried
--sdk_location=https://storage.googleapis.com/beam-wheels-staging/master/699f872ea1ef3bdb1588a029fc6b1e3185e986a6-207696119/apache-beam-2.25.0.dev0.zip
python --version
Python 3.6.8
In both cases the same TypeError occurs.
https://paste.googleplex.com/6275630654029824
Re: Using --sdk_location with python fails with a TypeError
Posted by Valentyn Tymofieiev <va...@google.com>.
You are passing a Python 2.7 wheel to a job that was launched on Python 3.6.
You need to select the correct wheel for the platform or pass a source
distribution (zip/tar.gz).
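To illustrate the mismatch: a wheel filename encodes its Python, ABI, and platform tags in its last three dash-separated fields, so a rough compatibility check can be done before submitting the job. The sketch below uses hypothetical helper names (parse_wheel_tags, wheel_matches) and a deliberately simplified match rule; it is not how Beam or pip select wheels:

```python
def parse_wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = filename[:-len(".whl")].split("-")
    # name-version[-build]-python-abi-platform gives 5 or 6 fields.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: " + filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def wheel_matches(filename, python_version, platform_substring):
    """Simplified check: CPython tag equals the version, platform contains a hint."""
    python_tag, _, platform_tag = parse_wheel_tags(filename)
    expected_tag = "cp{}{}".format(*python_version)
    return python_tag == expected_tag and platform_substring in platform_tag


# The wheel from the original report, checked against a Python 3.6 Linux target.
reported = "apache_beam-2.24.0.dev0-cp27-cp27m-macosx_10_9_x86_64.whl"
print(parse_wheel_tags(reported))
print(wheel_matches(reported, (3, 6), "linux"))
```

For the reported wheel the Python tag is cp27, so the check fails for a job launched on Python 3.6 regardless of platform.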