Posted to dev@beam.apache.org by Luke Cwik <lc...@google.com> on 2020/02/12 18:03:36 UTC

Re: daily dataflow job failing today

+dev <de...@beam.apache.org>

There was recently an update to add autoformatting to the Python SDK[1].

I'm seeing this during testing of a PR as well.

1:
https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E

On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <al...@betterup.co>
wrote:

> Some more information on this, as I still can't fix it...
>
> This job is triggered using the beam[gcp] Python SDK from a KubeFlow
> Pipelines component that runs on top of the Docker image
> tensorflow/tensorflow:1.13.1-py3
>
> I just checked and that image hasn't been updated recently. I also
> redeployed my pipeline to another (older) deployment of KFP and it gives me
> the same error (which tells me this isn't an internal KFP problem).
>
> The exact same pipeline/code running on the exact same image has been
> running fine for days. Did anything change on the Beam/Dataflow side since
> yesterday morning?
>
> Thanks for your help! This is a production pipeline that is not running
> for us :(
>
>
>
> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <al...@betterup.co>
> wrote:
>
>> Hi, I have a scheduled daily job that I have been running fine in
>> Dataflow for days now.
>> We haven't changed anything in this code, but this morning's run failed (it
>> couldn't even spin up the job).
>> The job submits a setup.py file (that also hasn't changed), but maybe that
>> is causing the problem (based on the error I'm getting)?
>>
>> Is anyone else having the same issue, or does anyone know how to fix it?
>> Thanks!
>>
>> ERROR: Complete output from command python setup.py egg_info:
>> ERROR: Traceback (most recent call last):
>>   File "<string>", line 1, in <module>
>>   File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line 41, in <module>
>>     import pycodestyle
>> ImportError: No module named 'pycodestyle'
>> ----------------------------------------
>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>> ERROR: Complete output from command python setup.py egg_info:
>> ERROR: Traceback (most recent call last):
>>   File "<string>", line 1, in <module>
>>   File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line 41, in <module>
>>     import pycodestyle
>> ImportError: No module named 'pycodestyle'
>> ----------------------------------------
>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>
>
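
For anyone hitting the same traceback: the failure is raised while pip runs avro-python3's setup.py, which imports pycodestyle before the install can proceed. Below is a minimal sketch (not from the original thread; the helper name `missing_build_imports` is hypothetical) that checks whether that import would succeed in the current environment, using only the standard library.

```python
import importlib.util


def missing_build_imports(module_names):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# "pycodestyle" is the module named in the ImportError in the traceback.
missing = missing_build_imports(["pycodestyle"])
if missing:
    print("Not importable in this environment:", ", ".join(missing))
else:
    print("pycodestyle is importable; the setup.py import should succeed.")
```

This only diagnoses the environment; it does not change what pip installs.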

Re: daily dataflow job failing today

Posted by Kenneth Knowles <ke...@apache.org>.
But pip doesn't try to reconcile the user's requested version with Beam's
listed dependency, right? (https://github.com/pypa/pip/issues/988 is still open.)

Kenn

On Thu, Feb 13, 2020 at 9:48 AM Ahmet Altay <al...@google.com> wrote:

> Thank you, Ismaël. I did not know that Avro was not using semantic
> versioning either.
>
> On Thu, Feb 13, 2020 at 9:44 AM Valentyn Tymofieiev <va...@google.com>
> wrote:
>
>> Thank you, Ismaël. Good to know Avro doesn't follow semantic versioning.
>> Replied on the PR.
>>
>> On Thu, Feb 13, 2020 at 5:24 AM Ismaël Mejía <ie...@gmail.com> wrote:
>>
>>> FYI, Avro has published a new version, 1.9.2.1, that fixes the issue:
>>> https://issues.apache.org/jira/browse/AVRO-2737
>>>
>>> I just submitted a PR to make the dependency consistent with Avro
>>> versioning and to verify that everything works as intended with the
>>> upgraded dependency on the Python SDK. Can you PTAL?
>>> https://github.com/apache/beam/pull/10851
>>>
>>>
>>> On Thu, Feb 13, 2020 at 9:39 AM Ismaël Mejía <ie...@gmail.com> wrote:
>>>
>>>>
>>>> > I can argue for not pinning and bounding with major version ranges.
>>>> > This gives flexibility to users to mix other third party libraries that
>>>> > share common dependencies with Beam. Our expectation is that dependencies
>>>> > follow semantic versioning and do not introduce breaking changes unless
>>>> > there is a major version change. A good example of this is Beam's
>>>> > dependency on "pytz>=2018.3". It is a simple wrapper around a time zone
>>>> > file. Latest version of the dependency is 2019.3, it is updated a few times
>>>> > a year. Beam users do not have to update Beam just to be able to use a
>>>> > later version of it since Beam does not pin it.
>>>>
>>>> Avro does not follow semantic versioning (the first number corresponds
>>>> to the version of the Avro binary format the release is compatible with,
>>>> the second corresponds to the MAJOR and the third to the MINOR in semver),
>>>> so we should set the upper bound to 1.10.0 instead of 2.0.0,
>>>> considering that 1.10.x is expected before the summer and may contain
>>>> breaking changes.
>>>>
>>>> > There is also a middle ground, where we can pin certain dependencies
>>>> > if we are not confident about their releases. And allow ranges for rest of
>>>> > the dependencies. In general, we are currently following this practice.
>>>>
>>>> I see your point. Like many things in software, it is all about
>>>> tradeoffs, and it is good to find a middle ground: do we want a robust,
>>>> reproducible release experience, or do we deal with the annoyance of doing
>>>> manual minor version upgrades? Choices, choices...
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Feb 13, 2020 at 2:26 AM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 12, 2020 at 12:54 PM Ismaël Mejía <ie...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Independently of the bug in the dependency release, the fact that the
>>>>>> Beam Python SDK does not pin fixed dependency versions is error-prone.
>>>>>> We may continue to have this kind of problem until we fix this (with
>>>>>> other dependencies too). In the Java SDK we do not accept this type of
>>>>>> dynamic dependency version, and Python should probably follow this
>>>>>> practice to avoid issues like the present one.
>>>>>>
>>>>>> Why don't we just do:
>>>>>>
>>>>>>     'avro-python3==1.9.1',
>>>>>>
>>>>>> instead of the current:
>>>>>>
>>>>>>     'avro-python3>=1.8.1,!=1.9.2,<2.0.0; python_version >= "3.0"',
>>>>>>
>>>>>
>>>>> I agree this is error prone. Your argument for pinning makes sense and
>>>>> I agree with it.
>>>>>
>>>>> I can argue for not pinning and instead bounding with major version ranges.
>>>>> This gives users the flexibility to mix in other third-party libraries that
>>>>> share common dependencies with Beam. Our expectation is that dependencies
>>>>> follow semantic versioning and do not introduce breaking changes unless
>>>>> there is a major version change. A good example of this is Beam's
>>>>> dependency on "pytz>=2018.3". It is a simple wrapper around a time zone
>>>>> file. The latest version of the dependency is 2019.3, and it is updated a
>>>>> few times a year. Beam users do not have to update Beam just to be able to
>>>>> use a later version of it, since Beam does not pin it.
>>>>>
>>>>> There is also a middle ground, where we pin certain dependencies
>>>>> if we are not confident about their releases and allow ranges for the rest
>>>>> of the dependencies. In general, we are currently following this practice.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 9:14 PM Ahmet Altay <al...@google.com> wrote:
>>>>>>
>>>>>>> Related: we have dependencies on avro, avro-python3, and fastavro.
>>>>>>> fastavro supports both python 2 and 3. Could we reduce this dependency list
>>>>>>> and depend only on fastavro? If we need avro and avro-python3 for the
>>>>>>> purposes of testing only, we can move them to test only dependencies.
>>>>>>>
>>>>>>> +Chamikara Jayalath <ch...@google.com>, because I vaguely
>>>>>>> remember him working on this.
>>>>>>>
>>>>>>> The reason I am calling for this is that the impact of bad dependency
>>>>>>> releases is high. All previously released Beam versions will be impacted.
>>>>>>> Reducing the dependency list will reduce the risk.
>>>>>>>
>>>>>>> Ahmet
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 12:02 PM Ahmet Altay <al...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thank you Valentyn!
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 11:32 AM Valentyn Tymofieiev <
>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>
>>>>>>>>> Yes, otherwise all Python tests will continue to fail until Avro
>>>>>>>>> comes up with a new release. Sent:
>>>>>>>>> https://github.com/apache/beam/pull/10844
>>>>>>>>>
>>>>>>>>> On Wed, Feb 12, 2020 at 11:08 AM Ahmet Altay <al...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Should we update Beam's setup.py to skip this avro-python3
>>>>>>>>>> version?
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 12, 2020 at 10:57 AM Alan Krumholz <
>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>
>>>>>>>>>>> makes sense. I'll add this workaround for now.
>>>>>>>>>>> Thanks so much for your help!
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <
>>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Alan, Dataflow workers preinstall Beam SDK dependencies,
>>>>>>>>>>>> including a working version of avro-python3. So after reading your email
>>>>>>>>>>>> once again, I think in your case you were not able to install the Beam SDK
>>>>>>>>>>>> locally. A workaround for you would be to `pip install
>>>>>>>>>>>> avro-python3==1.9.1` or `pip install pycodestyle` before installing Beam,
>>>>>>>>>>>> until AVRO-2737 is resolved.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <
>>>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Ah, there's already
>>>>>>>>>>>>> https://issues.apache.org/jira/browse/AVRO-2737 and it
>>>>>>>>>>>>> received attention.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <
>>>>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here's a short repro:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>>>>>>>>>>>>> root@04b45a100d16:/# pip install avro-python3
>>>>>>>>>>>>>>> Collecting avro-python3
>>>>>>>>>>>>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>>>>>>>>>>>>     ERROR: Command errored out with exit status 1:
>>>>>>>>>>>>>>>      command: /usr/local/bin/python -c 'import sys,
>>>>>>>>>>>>>>> setuptools, tokenize; sys.argv[0] =
>>>>>>>>>>>>>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>>>>>>>>>>>>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>>>>>>>>>>>>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>>>>>>>>>>>>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>>>>>>>>>>>>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>>>>>>>>>>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>>>>>>>>>>>>     Complete output (5 lines):
>>>>>>>>>>>>>>>     Traceback (most recent call last):
>>>>>>>>>>>>>>>       File "<string>", line 1, in <module>
>>>>>>>>>>>>>>>       File
>>>>>>>>>>>>>>> "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line 41, in <module>
>>>>>>>>>>>>>>>         import pycodestyle
>>>>>>>>>>>>>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>>>>>>>>>>>>>     ----------------------------------------
>>>>>>>>>>>>>>> ERROR: Command errored out with exit status 1: python
>>>>>>>>>>>>>>> setup.py egg_info Check the logs for full command output.
>>>>>>>>>>>>>>> root@04b45a100d16:/#
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes, it is a bug in the recent Avro release. We should
>>>>>>>>>>>>>>>> report it to the Avro maintainers. The workaround is to downgrade
>>>>>>>>>>>>>>>> avro-python3 to 1.9.1, for example via requirements.txt.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <
>>>>>>>>>>>>>>>> sniemitz@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>>>>>>>>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>>>>>>>>>>>>

Re: daily dataflow job failing today

Posted by Ahmet Altay <al...@google.com>.
Thank you, Ismaël. I did not know that Avro was not using semantic
versioning either.
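
To make the version-range discussion in this thread concrete, here is a rough sketch of how the current range and a tighter upper bound treat the versions mentioned. It is a simplified, standard-library-only check written for illustration, not pip's actual PEP 440 logic: it handles numeric release segments and the basic comparison operators only, and it ignores the `python_version` environment marker. The `PROPOSED` string is an assumption about what "upper bound 1.10.0" would look like, not the text of the PR.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '1.9.2.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def _pad(a, b):
    """Zero-pad two version tuples to the same length, so 1.9 equals 1.9.0."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


# Two-character operators come first so '>=' is never read as '>'.
_OPS = [
    (">=", lambda a, b: a >= b),
    ("<=", lambda a, b: a <= b),
    ("==", lambda a, b: a == b),
    ("!=", lambda a, b: a != b),
    ("<", lambda a, b: a < b),
    (">", lambda a, b: a > b),
]


def satisfies(version, spec):
    """Check a version against clauses such as '>=1.8.1,!=1.9.2,<2.0.0'."""
    for clause in spec.split(","):
        clause = clause.strip()
        for op, check in _OPS:
            if clause.startswith(op):
                left, right = _pad(parse_version(version),
                                   parse_version(clause[len(op):]))
                if not check(left, right):
                    return False
                break
        else:
            raise ValueError("unsupported clause: " + clause)
    return True


CURRENT = ">=1.8.1,!=1.9.2,<2.0.0"    # range quoted in the thread, marker omitted
PROPOSED = ">=1.8.1,!=1.9.2,<1.10.0"  # assumed form of the tighter upper bound

for candidate in ("1.9.1", "1.9.2", "1.9.2.1", "1.10.0"):
    print(candidate, satisfies(candidate, CURRENT), satisfies(candidate, PROPOSED))
```

Under these simplifications, both ranges skip 1.9.2 and accept 1.9.1 and 1.9.2.1; only the tighter bound rejects a future 1.10.0.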


Re: daily dataflow job failing today

Posted by Valentyn Tymofieiev <va...@google.com>.
Thank you, Ismaël. Good to know Avro doesn't follow semantic versioning.
Replied on the PR.
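
As a small illustration of why the upper bound differs, the sketch below derives the "next possibly breaking" version under the two schemes described in the thread: semver (MAJOR.MINOR.PATCH) and Avro's scheme as Ismaël describes it (FORMAT.MAJOR.MINOR). Both function names are hypothetical and written only for this example.

```python
def semver_breaking_upper_bound(version):
    """Next version that may break under semver: bump the first number."""
    major = int(version.split(".")[0])
    return "{}.0.0".format(major + 1)


def avro_breaking_upper_bound(version):
    """Next version that may break under FORMAT.MAJOR.MINOR: bump the second number.

    Assumes the Avro scheme as described in the thread, where the first
    number tracks the binary format and the second plays the MAJOR role.
    """
    parts = [int(p) for p in version.split(".")]
    if len(parts) < 2:
        raise ValueError("expected at least FORMAT.MAJOR, got " + version)
    return "{}.{}.0".format(parts[0], parts[1] + 1)


print(semver_breaking_upper_bound("1.9.1"))  # bound if Avro followed semver
print(avro_breaking_upper_bound("1.9.1"))    # bound under the scheme in the thread
```

For 1.9.1 the first function gives 2.0.0 and the second gives 1.10.0, which matches the two bounds being compared above.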

On Thu, Feb 13, 2020 at 5:24 AM Ismaël Mejía <ie...@gmail.com> wrote:

> For info Avro has published a new version 1.9.2.1 that fixes the issue:
> https://issues.apache.org/jira/browse/AVRO-2737
>
> I just submitted a PR to make the dependency consistent with Avro
> versioning and
> verify that everything works as intended with the upgraded dependency on
> the
> python SDK. Can you PTAL?
> https://github.com/apache/beam/pull/10851
>
>
> On Thu, Feb 13, 2020 at 9:39 AM Ismaël Mejía <ie...@gmail.com> wrote:
>
>>
>> > I can argue for not pinning and bounding with major version ranges.
>> > This gives flexibility to users to mix other third party libraries that
>> > share common dependencies with Beam. Our expectation is that dependencies
>> > follow semantic versioning and do not introduce breaking changes unless
>> > there is a major version change. A good example of this is Beam's
>> > dependency on "pytz>=2018.3". It is a simple wrapper around a time zone
>> > file. Latest version of the dependency is 2019.3, it is updated a few times
>> > a year. Beam users do not have to update Beam just to be able to use a
>> > later version of it since Beam does not pin it.
>>
>> Avro does not follow semantic versioning (the first number corresponds to
>> the version of the Avro binary format the release is compatible with, the
>> second corresponds to the MAJOR and the third to the MINOR in semver), so we
>> should then set the upper bound to 1.10.0 instead of 2.0.0, considering that
>> 1.10.x is expected before the summer and may contain breaking changes.
>>
>> > There is also a middle ground, where we can pin certain dependencies if
>> > we are not confident about their releases. And allow ranges for rest of the
>> > dependencies. In general, we are currently following this practice.
>>
>> I see your point. Like many things in software, it is all about tradeoffs,
>> and it is good to find a middle ground: do we want a robust, reproducible
>> release experience, or do we deal with the annoyance of doing manual minor
>> version upgrades? Choices, choices...
>>
>>
>>
>>
>> On Thu, Feb 13, 2020 at 2:26 AM Ahmet Altay <al...@google.com> wrote:
>>
>>>
>>>
>>> On Wed, Feb 12, 2020 at 12:54 PM Ismaël Mejía <ie...@gmail.com> wrote:
>>>
>>>> Independently of the bug in the dependency release the fact that the
>>>> Beam Python
>>>> SDK does not have pinned fixed dependency numbers is error-prone. We may
>>>> continue to have this kind of problems until we fix this (with other
>>>> dependencies too). In the Java SDK we do not accept such type of dynamic
>>>> dependency numbers and python should probably follow this practice to
>>>> avoid
>>>> issues like the present one.
>>>>
>>>> Why don't we just do:
>>>>
>>>>     'avro-python3==1.9.1',
>>>>
>>>> instead of the current:
>>>>
>>>>     'avro-python3>=1.8.1,!=1.9.2,<2.0.0; python_version >= "3.0"',
>>>>
>>>
>>> I agree this is error prone. Your argument for pinning makes sense and I
>>> agree with it.
>>>
>>> I can argue for not pinning and bounding with major version ranges. This
>>> gives flexibility to users to mix other third party libraries that share
>>> common dependencies with Beam. Our expectation is that dependencies follow
>>> semantic versioning and do not introduce breaking changes unless there is a
>>> major version change. A good example of this is Beam's dependency on
>>> "pytz>=2018.3". It is a simple wrapper around a time zone file. Latest
>>> version of the dependency is 2019.3, it is updated a few times a year. Beam
>>> users do not have to update Beam just to be able to use a later version of
>>> it since Beam does not pin it.
>>>
>>> There is also a middle ground, where we can pin certain dependencies if
>>> we are not confident about their releases. And allow ranges for rest of the
>>> dependencies. In general, we are currently following this practice.
>>>
>>>
>>>>
>>>>
>>>> On Wed, Feb 12, 2020 at 9:14 PM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> Related: we have dependencies on avro, avro-python3, and fastavro.
>>>>> fastavro supports both python 2 and 3. Could we reduce this dependency list
>>>>> and depend only on fastavro? If we need avro and avro-python3 for the
>>>>> purposes of testing only, we can move them to test only dependencies.
>>>>>
>>>>> +Chamikara Jayalath <ch...@google.com>, because I vaguely
>>>>> remember him working on this.
>>>>>
>>>>> The reason I am calling for this is the impact of bad dependency
>>>>> releases are high. All previously released Beam versions will be impacted.
>>>>> Reducing the dependency list will reduce the risk.
>>>>>
>>>>> Ahmet
>>>>>
>>>>> On Wed, Feb 12, 2020 at 12:02 PM Ahmet Altay <al...@google.com> wrote:
>>>>>
>>>>>> Thank you Valentyn!
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 11:32 AM Valentyn Tymofieiev <
>>>>>> valentyn@google.com> wrote:
>>>>>>
>>>>>>> Yes, otherwise all Python tests will continue to fail until Avro
>>>>>>> comes up with a new release. Sent:
>>>>>>> https://github.com/apache/beam/pull/10844
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 11:08 AM Ahmet Altay <al...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Should we update Beam's setup.py to skip this avro-python3 version?
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 10:57 AM Alan Krumholz <
>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>
>>>>>>>>> makes sense. I'll add this workaround for now.
>>>>>>>>> Thanks so much for your help!
>>>>>>>>>
>>>>>>>>> On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <
>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Alan, Dataflow workers preinstall Beam SDK dependencies,
>>>>>>>>>> including (a working version) of avro-python3. So after reading your email
>>>>>>>>>> once again, I think in your case you were not able to install Beam SDK
>>>>>>>>>> locally. So a workaround for you would be to `pip install
>>>>>>>>>> avro-python3==1.9.1` or `pip install pycodestyle`  before installing Beam,
>>>>>>>>>> until AVRO-2737 is resolved.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <
>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Ah, there's already
>>>>>>>>>>> https://issues.apache.org/jira/browse/AVRO-2737 and it received
>>>>>>>>>>> attention.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <
>>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Here's a short repro:
>>>>>>>>>>>>>
>>>>>>>>>>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>>>>>>>>>>> root@04b45a100d16:/# pip install avro-python3
>>>>>>>>>>>>> Collecting avro-python3
>>>>>>>>>>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>>>>>>>>>>     ERROR: Command errored out with exit status 1:
>>>>>>>>>>>>>      command: /usr/local/bin/python -c 'import sys,
>>>>>>>>>>>>> setuptools, tokenize; sys.argv[0] =
>>>>>>>>>>>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>>>>>>>>>>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>>>>>>>>>>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>>>>>>>>>>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>>>>>>>>>>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>>>>>>>>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>>>>>>>>>>     Complete output (5 lines):
>>>>>>>>>>>>>     Traceback (most recent call last):
>>>>>>>>>>>>>       File "<string>", line 1, in <module>
>>>>>>>>>>>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py",
>>>>>>>>>>>>> line 41, in <module>
>>>>>>>>>>>>>         import pycodestyle
>>>>>>>>>>>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>>>>>>>>>>>     ----------------------------------------
>>>>>>>>>>>>> ERROR: Command errored out with exit status 1: python setup.py
>>>>>>>>>>>>> egg_info Check the logs for full command output.
>>>>>>>>>>>>> root@04b45a100d16:/#
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, it is a bug in the recent Avro release. We should report
>>>>>>>>>>>>>> it to the Avro maintainers. The workaround is to downgrade avro-python3 to
>>>>>>>>>>>>>> 1.9.1, for example via requirements.txt.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <
>>>>>>>>>>>>>> sniemitz@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>>>>>>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> +dev <de...@beam.apache.org>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There was recently an update to add autoformatting to the
>>>>>>>>>>>>>>>> Python SDK[1].
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1:
>>>>>>>>>>>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>>>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Some more information for this as I still can't get to fix
>>>>>>>>>>>>>>>>> it....
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This job is triggered using the beam[gcp] python sdk from
>>>>>>>>>>>>>>>>> a KubeFlow Pipelines component which runs on top of docker image:
>>>>>>>>>>>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I just checked and that image hasn't been updated
>>>>>>>>>>>>>>>>> recently. I also redeployed my pipeline to another (older) deployment of
>>>>>>>>>>>>>>>>> KFP and it gives me the same error (which tells me this isn't an internal
>>>>>>>>>>>>>>>>> KFP problem)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The exact same pipeline/code running on the exact same
>>>>>>>>>>>>>>>>> image has been running fine for days. Did anything changed on the
>>>>>>>>>>>>>>>>> beam/dataflow side since yesterday morning?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for your help! this is a production pipeline that
>>>>>>>>>>>>>>>>> is not running for us :(
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>>>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi, I have a scheduled daily job that I have been running
>>>>>>>>>>>>>>>>>> fine in dataflow for days now.
>>>>>>>>>>>>>>>>>> We haven't changed anything on this code but this morning
>>>>>>>>>>>>>>>>>> run failed  (it couldn't even spin up the job)
>>>>>>>>>>>>>>>>>> The job submits a setup.py file (that also hasn't
>>>>>>>>>>>>>>>>>> changed) but maybe is causing the problem? (based on the error I'm getting)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>>>>>>>>>>>> File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line 41, in <module>
>>>>>>>>>>>>>>>>>> import pycodestyle
>>>>>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>>>>>>>>>>>> File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line 41, in <module>
>>>>>>>>>>>>>>>>>> import pycodestyle
>>>>>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>

Re: daily dataflow job failing today

Posted by Ismaël Mejía <ie...@gmail.com>.
For info, Avro has published a new version, 1.9.2.1, that fixes the issue:
https://issues.apache.org/jira/browse/AVRO-2737

I just submitted a PR to make the dependency consistent with Avro
versioning and verify that everything works as intended with the upgraded
dependency on the Python SDK. Can you PTAL?
https://github.com/apache/beam/pull/10851
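
For anyone who wants to check which avro-python3 release an environment
actually resolved, here is a minimal sketch. The helper names
`installed_version` and `needs_upgrade` are hypothetical, and the set of bad
releases is an assumption based only on this thread (1.9.2 is the release
reported in AVRO-2737). It relies on `importlib.metadata`, so it assumes
Python 3.8 or later.

```python
from importlib import metadata

# Assumption: 1.9.2 is the only affected release, as reported in AVRO-2737.
BROKEN_RELEASES = {"1.9.2"}


def installed_version(dist="avro-python3"):
    """Return the installed version string of `dist`, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def needs_upgrade(version):
    """True only when the given version is one of the known-bad releases."""
    return version in BROKEN_RELEASES


found = installed_version()
if found is None:
    print("avro-python3 is not installed")
elif needs_upgrade(found):
    print("avro-python3 %s is affected, upgrade or pin another release" % found)
else:
    print("avro-python3 %s is not in the known-bad set" % found)
```

This only reports what is installed; it does not change the environment.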


On Thu, Feb 13, 2020 at 9:39 AM Ismaël Mejía <ie...@gmail.com> wrote:

>
> > I can argue for not pinning and bounding with major version ranges. This
> > gives flexibility to users to mix other third party libraries that share
> > common dependencies with Beam. Our expectation is that dependencies follow
> > semantic versioning and do not introduce breaking changes unless there is a
> > major version change. A good example of this is Beam's dependency on
> > "pytz>=2018.3". It is a simple wrapper around a time zone file. Latest
> > version of the dependency is 2019.3, it is updated a few times a year. Beam
> > users do not have to update Beam just to be able to use a later version of
> > it since Beam does not pin it.
>
> Avro does not follow semantic versioning (the first number corresponds to
> the version of the Avro binary format the release is compatible with, the
> second corresponds to the MAJOR and the third to the MINOR in semver), so we
> should then set the upper bound to 1.10.0 instead of 2.0.0, considering that
> 1.10.x is expected before the summer and may contain breaking changes.
>
> > There is also a middle ground, where we can pin certain dependencies if
> > we are not confident about their releases. And allow ranges for rest of the
> > dependencies. In general, we are currently following this practice.
>
> I see your point. Like many things in software, it is all about tradeoffs,
> and it is good to find a middle ground: do we want a robust, reproducible
> release experience, or do we deal with the annoyance of doing manual minor
> version upgrades? Choices, choices...
>
>
>
>
> On Thu, Feb 13, 2020 at 2:26 AM Ahmet Altay <al...@google.com> wrote:
>
>>
>>
>> On Wed, Feb 12, 2020 at 12:54 PM Ismaël Mejía <ie...@gmail.com> wrote:
>>
>>> Independently of the bug in the dependency release the fact that the
>>> Beam Python
>>> SDK does not have pinned fixed dependency numbers is error-prone. We may
>>> continue to have this kind of problems until we fix this (with other
>>> dependencies too). In the Java SDK we do not accept such type of dynamic
>>> dependency numbers and python should probably follow this practice to
>>> avoid
>>> issues like the present one.
>>>
>>> Why don't we just do:
>>>
>>>     'avro-python3==1.9.1',
>>>
>>> instead of the current:
>>>
>>>     'avro-python3>=1.8.1,!=1.9.2,<2.0.0; python_version >= "3.0"',
>>>
>>
>> I agree this is error prone. Your argument for pinning makes sense and I
>> agree with it.
>>
>> I can argue for not pinning and bounding with major version ranges. This
>> gives flexibility to users to mix other third party libraries that share
>> common dependencies with Beam. Our expectation is that dependencies follow
>> semantic versioning and do not introduce breaking changes unless there is a
>> major version change. A good example of this is Beam's dependency on
>> "pytz>=2018.3". It is a simple wrapper around a time zone file. Latest
>> version of the dependency is 2019.3, it is updated a few times a year. Beam
>> users do not have to update Beam just to be able to use a later version of
>> it since Beam does not pin it.
>>
>> There is also a middle ground, where we can pin certain dependencies if
>> we are not confident about their releases. And allow ranges for rest of the
>> dependencies. In general, we are currently following this practice.
>>
>>
>>>
>>>
>>> On Wed, Feb 12, 2020 at 9:14 PM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> Related: we have dependencies on avro, avro-python3, and fastavro.
>>>> fastavro supports both python 2 and 3. Could we reduce this dependency list
>>>> and depend only on fastavro? If we need avro and avro-python3 for the
>>>> purposes of testing only, we can move them to test only dependencies.
>>>>
>>>> +Chamikara Jayalath <ch...@google.com>, because I vaguely remember
>>>> him working on this.
>>>>
>>>> The reason I am calling for this is the impact of bad dependency
>>>> releases are high. All previously released Beam versions will be impacted.
>>>> Reducing the dependency list will reduce the risk.
>>>>
>>>> Ahmet
>>>>
>>>> On Wed, Feb 12, 2020 at 12:02 PM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> Thank you Valentyn!
>>>>>
>>>>> On Wed, Feb 12, 2020 at 11:32 AM Valentyn Tymofieiev <
>>>>> valentyn@google.com> wrote:
>>>>>
>>>>>> Yes, otherwise all Python tests will continue to fail until Avro
>>>>>> comes up with a new release. Sent:
>>>>>> https://github.com/apache/beam/pull/10844
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 11:08 AM Ahmet Altay <al...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Should we update Beam's setup.py to skip this avro-python3 version?
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 10:57 AM Alan Krumholz <
>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>
>>>>>>>> makes sense. I'll add this workaround for now.
>>>>>>>> Thanks so much for your help!
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <
>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>
>>>>>>>>> Alan, Dataflow workers preinstall Beam SDK dependencies, including
>>>>>>>>> (a working version) of avro-python3. So after reading your email once
>>>>>>>>> again, I think in your case you were not able to install Beam SDK locally.
>>>>>>>>> So a workaround for you would be to `pip install avro-python3==1.9.1` or
>>>>>>>>> `pip install pycodestyle`  before installing Beam, until AVRO-2737
>>>>>>>>> is resolved.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <
>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Ah, there's already
>>>>>>>>>> https://issues.apache.org/jira/browse/AVRO-2737 and it received
>>>>>>>>>> attention.
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <
>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Here's a short repro:
>>>>>>>>>>>>
>>>>>>>>>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>>>>>>>>>> root@04b45a100d16:/# pip install avro-python3
>>>>>>>>>>>> Collecting avro-python3
>>>>>>>>>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>>>>>>>>>     ERROR: Command errored out with exit status 1:
>>>>>>>>>>>>      command: /usr/local/bin/python -c 'import sys, setuptools,
>>>>>>>>>>>> tokenize; sys.argv[0] =
>>>>>>>>>>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>>>>>>>>>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>>>>>>>>>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>>>>>>>>>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>>>>>>>>>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>>>>>>>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>>>>>>>>>     Complete output (5 lines):
>>>>>>>>>>>>     Traceback (most recent call last):
>>>>>>>>>>>>       File "<string>", line 1, in <module>
>>>>>>>>>>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py",
>>>>>>>>>>>> line 41, in <module>
>>>>>>>>>>>>         import pycodestyle
>>>>>>>>>>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>>>>>>>>>>     ----------------------------------------
>>>>>>>>>>>> ERROR: Command errored out with exit status 1: python setup.py
>>>>>>>>>>>> egg_info Check the logs for full command output.
>>>>>>>>>>>> root@04b45a100d16:/#
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, it is a bug in the recent Avro release. We should report
>>>>>>>>>>>>> it to the Avro maintainers. The workaround is to downgrade avro-python3 to
>>>>>>>>>>>>> 1.9.1, for example via requirements.txt.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <
>>>>>>>>>>>>> sniemitz@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>>>>>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +dev <de...@beam.apache.org>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There was recently an update to add autoformatting to the
>>>>>>>>>>>>>>> Python SDK[1].
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1:
>>>>>>>>>>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Some more information for this as I still can't get to fix
>>>>>>>>>>>>>>>> it....
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This job is triggered using the beam[gcp] python sdk from a
>>>>>>>>>>>>>>>> KubeFlow Pipelines component which runs on top of docker image:
>>>>>>>>>>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I just checked and that image hasn't been updated recently.
>>>>>>>>>>>>>>>> I also redeployed my pipeline to another (older) deployment of KFP and it
>>>>>>>>>>>>>>>> gives me the same error (which tells me this isn't an internal KFP problem)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The exact same pipeline/code running on the exact same
>>>>>>>>>>>>>>>> image has been running fine for days. Did anything changed on the
>>>>>>>>>>>>>>>> beam/dataflow side since yesterday morning?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for your help! this is a production pipeline that is
>>>>>>>>>>>>>>>> not running for us :(
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi, I have a scheduled daily job that I have been running
>>>>>>>>>>>>>>>>> fine in dataflow for days now.
>>>>>>>>>>>>>>>>> We haven't changed anything on this code but this morning
>>>>>>>>>>>>>>>>> run failed  (it couldn't even spin up the job)
>>>>>>>>>>>>>>>>> The job submits a setup.py file (that also hasn't changed)
>>>>>>>>>>>>>>>>> but maybe is causing the problem? (based on the error I'm getting)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>>>>>>>>>>> File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line 41, in <module>
>>>>>>>>>>>>>>>>> import pycodestyle
>>>>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>>>>>>>>>>> File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line 41, in <module>
>>>>>>>>>>>>>>>>> import pycodestyle
>>>>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>

Re: daily dataflow job failing today

Posted by Ismaël Mejía <ie...@gmail.com>.
> I can argue for not pinning and bounding with major version ranges. This
> gives flexibility to users to mix other third party libraries that share
> common dependencies with Beam. Our expectation is that dependencies follow
> semantic versioning and do not introduce breaking changes unless there is a
> major version change. A good example of this is Beam's dependency on
> "pytz>=2018.3". It is a simple wrapper around a time zone file. Latest
> version of the dependency is 2019.3, it is updated a few times a year. Beam
> users do not have to update Beam just to be able to use a later version of
> it since Beam does not pin it.

Avro does not follow semantic versioning (the first number corresponds to
the version of the Avro binary format the release is compatible with, the
second corresponds to the MAJOR and the third to the MINOR in semver), so we
should then set the upper bound to 1.10.0 instead of 2.0.0, considering that
1.10.x is expected before the summer and may contain breaking changes.
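
To illustrate the bound discussed above, here is a minimal sketch in plain
Python. It is not pip's resolver and does not implement full PEP 440: `parse`,
`compare` and `allowed` are hypothetical helpers that only handle dotted
numeric versions. The candidate list is put together from versions mentioned
in this thread, and 1.10.0 stands in for a future release that is assumed here.

```python
from itertools import zip_longest


def parse(version):
    """Turn '1.9.2.1' into (1, 9, 2, 1). Simplified: digits and dots only."""
    return tuple(int(part) for part in version.split("."))


def compare(a, b):
    """Return -1, 0 or 1, padding the shorter version with zeros."""
    for x, y in zip_longest(parse(a), parse(b), fillvalue=0):
        if x != y:
            return -1 if x < y else 1
    return 0


def allowed(version, lower, upper, excluded=()):
    """True if lower <= version < upper and version is not excluded."""
    if compare(version, lower) < 0 or compare(version, upper) >= 0:
        return False
    return all(compare(version, bad) != 0 for bad in excluded)


candidates = ["1.8.1", "1.9.1", "1.9.2", "1.9.2.1", "1.10.0"]

# Bound with the upper limit at the first number: >=1.8.1,!=1.9.2,<2.0.0
current = {v: allowed(v, "1.8.1", "2.0.0", excluded=["1.9.2"]) for v in candidates}

# Bound with the upper limit at the second number: >=1.8.1,!=1.9.2,<1.10.0
proposed = {v: allowed(v, "1.8.1", "1.10.0", excluded=["1.9.2"]) for v in candidates}

for v in candidates:
    print(v, "current:", current[v], "proposed:", proposed[v])
```

In this sketch the only difference between the two ranges is 1.10.0: the
`<2.0.0` bound would accept it, the `<1.10.0` bound would not, and 1.9.2.1
stays installable under both.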

> There is also a middle ground, where we can pin certain dependencies if
> we are not confident about their releases. And allow ranges for rest of the
> dependencies. In general, we are currently following this practice.

I see your point. Like many things in software, it is all about tradeoffs,
and it is good to find a middle ground: do we want a robust, reproducible
release experience, or do we deal with the annoyance of doing manual minor
version upgrades? Choices, choices...




On Thu, Feb 13, 2020 at 2:26 AM Ahmet Altay <al...@google.com> wrote:

>
>
> On Wed, Feb 12, 2020 at 12:54 PM Ismaël Mejía <ie...@gmail.com> wrote:
>
>> Independently of the bug in the dependency release the fact that the Beam
>> Python
>> SDK does not have pinned fixed dependency numbers is error-prone. We may
>> continue to have this kind of problems until we fix this (with other
>> dependencies too). In the Java SDK we do not accept such type of dynamic
>> dependency numbers and python should probably follow this practice to
>> avoid
>> issues like the present one.
>>
>> Why don't we just do:
>>
>>     'avro-python3==1.9.1',
>>
>> instead of the current:
>>
>>     'avro-python3>=1.8.1,!=1.9.2,<2.0.0; python_version >= "3.0"',
>>
>
> I agree this is error prone. Your argument for pinning makes sense and I
> agree with it.
>
> I can argue for not pinning and bounding with major version ranges. This
> gives flexibility to users to mix other third party libraries that share
> common dependencies with Beam. Our expectation is that dependencies follow
> semantic versioning and do not introduce breaking changes unless there is a
> major version change. A good example of this is Beam's dependency on
> "pytz>=2018.3". It is a simple wrapper around a time zone file. Latest
> version of the dependency is 2019.3, it is updated a few times a year. Beam
> users do not have to update Beam just to be able to use a later version of
> it since Beam does not pin it.
>
> There is also a middle ground, where we can pin certain dependencies if we
> are not confident about their releases. And allow ranges for rest of the
> dependencies. In general, we are currently following this practice.
>
>
>>
>>
>> On Wed, Feb 12, 2020 at 9:14 PM Ahmet Altay <al...@google.com> wrote:
>>
>>> Related: we have dependencies on avro, avro-python3, and fastavro.
>>> fastavro supports both python 2 and 3. Could we reduce this dependency list
>>> and depend only on fastavro? If we need avro and avro-python3 for the
>>> purposes of testing only, we can move them to test only dependencies.
>>>
>>> +Chamikara Jayalath <ch...@google.com>, because I vaguely remember
>>> him working on this.
>>>
>>> The reason I am calling for this is the impact of bad dependency
>>> releases are high. All previously released Beam versions will be impacted.
>>> Reducing the dependency list will reduce the risk.
>>>
>>> Ahmet
>>>
>>> On Wed, Feb 12, 2020 at 12:02 PM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> Thank you Valentyn!
>>>>
>>>> On Wed, Feb 12, 2020 at 11:32 AM Valentyn Tymofieiev <
>>>> valentyn@google.com> wrote:
>>>>
>>>>> Yes, otherwise all Python tests will continue to fail until Avro comes
>>>>> up with a new release. Sent: https://github.com/apache/beam/pull/10844
>>>>>
>>>>> On Wed, Feb 12, 2020 at 11:08 AM Ahmet Altay <al...@google.com> wrote:
>>>>>
>>>>>> Should we update Beam's setup.py to skip this avro-python3 version?
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 10:57 AM Alan Krumholz <
>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>
>>>>>>> makes sense. I'll add this workaround for now.
>>>>>>> Thanks so much for your help!
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <
>>>>>>> valentyn@google.com> wrote:
>>>>>>>
>>>>>>>> Alan, Dataflow workers preinstall Beam SDK dependencies, including
>>>>>>>> (a working version) of avro-python3. So after reading your email once
>>>>>>>> again, I think in your case you were not able to install Beam SDK locally.
>>>>>>>> So a workaround for you would be to `pip install avro-python3==1.9.1` or
>>>>>>>> `pip install pycodestyle`  before installing Beam, until AVRO-2737
>>>>>>>> is resolved.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <
>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>
>>>>>>>>> Ah, there's already
>>>>>>>>> https://issues.apache.org/jira/browse/AVRO-2737 and it received
>>>>>>>>> attention.
>>>>>>>>>
>>>>>>>>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <
>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Here's a short repro:
>>>>>>>>>>>
>>>>>>>>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>>>>>>>>> root@04b45a100d16:/# pip install avro-python3
>>>>>>>>>>> Collecting avro-python3
>>>>>>>>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>>>>>>>>     ERROR: Command errored out with exit status 1:
>>>>>>>>>>>      command: /usr/local/bin/python -c 'import sys, setuptools,
>>>>>>>>>>> tokenize; sys.argv[0] =
>>>>>>>>>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>>>>>>>>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>>>>>>>>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>>>>>>>>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>>>>>>>>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>>>>>>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>>>>>>>>     Complete output (5 lines):
>>>>>>>>>>>     Traceback (most recent call last):
>>>>>>>>>>>       File "<string>", line 1, in <module>
>>>>>>>>>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py",
>>>>>>>>>>> line 41, in <module>
>>>>>>>>>>>         import pycodestyle
>>>>>>>>>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>>>>>>>>>     ----------------------------------------
>>>>>>>>>>> ERROR: Command errored out with exit status 1: python setup.py
>>>>>>>>>>> egg_info Check the logs for full command output.
>>>>>>>>>>> root@04b45a100d16:/#
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Yes, it is a bug in the recent Avro release. We should report
>>>>>>>>>>>> it to the Avro maintainers. The workaround is to downgrade avro-python3 to
>>>>>>>>>>>> 1.9.1, for example via requirements.txt.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <
>>>>>>>>>>>> sniemitz@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>>>>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> +dev <de...@beam.apache.org>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There was recently an update to add autoformatting to the
>>>>>>>>>>>>>> Python SDK[1].
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1:
>>>>>>>>>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some more information for this as I still can't get to fix
>>>>>>>>>>>>>>> it....
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This job is triggered using the beam[gcp] python sdk from a
>>>>>>>>>>>>>>> KubeFlow Pipelines component which runs on top of docker image:
>>>>>>>>>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I just checked and that image hasn't been updated recently.
>>>>>>>>>>>>>>> I also redeployed my pipeline to another (older) deployment of KFP and it
>>>>>>>>>>>>>>> gives me the same error (which tells me this isn't an internal KFP problem)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The exact same pipeline/code running on the exact same image
>>>>>>>>>>>>>>> has been running fine for days. Did anything changed on the beam/dataflow
>>>>>>>>>>>>>>> side since yesterday morning?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for your help! this is a production pipeline that is
>>>>>>>>>>>>>>> not running for us :(
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi, I have a scheduled daily job that I have been running
>>>>>>>>>>>>>>>> fine in dataflow for days now.
>>>>>>>>>>>>>>>> We haven't changed anything on this code but this morning
>>>>>>>>>>>>>>>> run failed  (it couldn't even spin up the job)
>>>>>>>>>>>>>>>> The job submits a setup.py file (that also hasn't changed)
>>>>>>>>>>>>>>>> but maybe is causing the problem? (based on the error I'm getting)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py
>>>>>>>>>>>>>>>> egg_info:
>>>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>>>>>>>>>> File "/tmp/pip-install-42zyi89t/avro-python3/setup.py",
>>>>>>>>>>>>>>>> line 41, in <module>
>>>>>>>>>>>>>>>> import pycodestyle
>>>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with
>>>>>>>>>>>>>>>> error code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py
>>>>>>>>>>>>>>>> egg_info:
>>>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>>>>>>>>>> File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py",
>>>>>>>>>>>>>>>> line 41, in <module>
>>>>>>>>>>>>>>>> import pycodestyle
>>>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with
>>>>>>>>>>>>>>>> error code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>

Re: daily dataflow job failing today

Posted by Ahmet Altay <al...@google.com>.
On Wed, Feb 12, 2020 at 12:54 PM Ismaël Mejía <ie...@gmail.com> wrote:

> Independently of the bug in the dependency release the fact that the Beam
> Python
> SDK does not have pinned fixed dependency numbers is error-prone. We may
> continue to have this kind of problems until we fix this (with other
> dependencies too). In the Java SDK we do not accept such type of dynamic
> dependency numbers and python should probably follow this practice to avoid
> issues like the present one.
>
> Why don't we just do:
>
>     'avro-python3==1.9.1',
>
> instead of the current:
>
>     'avro-python3>=1.8.1,!=1.9.2,<2.0.0; python_version >= "3.0"',
>

I agree this is error-prone, and your argument for pinning makes sense.

I can also argue for not pinning and instead bounding dependencies with
major version ranges. This gives users the flexibility to mix in other
third-party libraries that share common dependencies with Beam. Our
expectation is that dependencies follow semantic versioning and do not
introduce breaking changes without a major version change. A good example
is Beam's dependency on "pytz>=2018.3". It is a simple wrapper around a
time zone file, it is updated a few times a year, and its latest version is
2019.3. Because Beam does not pin it, Beam users do not have to update Beam
just to use a later version of pytz.

There is also a middle ground: pin the dependencies whose releases we are
not confident about, and allow ranges for the rest. In general, this is the
practice we currently follow.
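
To make the trade-off between pinning and ranges concrete, here is a
minimal sketch of how the two kinds of specifier admit or reject releases.
The helpers `parse_version` and `satisfies` are hypothetical names written
for this illustration and handle only the operators that appear in this
thread; pip implements the full PEP 440 rules. The environment marker
(`; python_version >= "3.0"`) is left out, and 1.9.3 stands in for a
hypothetical future release.

```python
# Illustrative only: a tiny subset of pip-style specifier matching.
# Supports ==, !=, >= and < on purely numeric dotted versions.

def parse_version(text):
    """Turn '1.9.2' into (1, 9, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, specifier):
    """Return True if `version` meets every comma-separated clause."""
    candidate = parse_version(version)
    for clause in specifier.split(","):
        clause = clause.strip()
        # Find which supported operator this clause starts with.
        for op in ("==", "!=", ">=", "<"):
            if clause.startswith(op):
                bound = parse_version(clause[len(op):])
                break
        else:
            raise ValueError("unsupported clause: " + clause)
        ok = {
            "==": candidate == bound,
            "!=": candidate != bound,
            ">=": candidate >= bound,
            "<": candidate < bound,
        }[op]
        if not ok:
            return False
    return True


AVRO_RANGE = ">=1.8.1,!=1.9.2,<2.0.0"  # the range quoted in this thread
AVRO_PIN = "==1.9.1"                   # the pin proposed in this thread

if __name__ == "__main__":
    # "1.9.3" is a hypothetical future release, used only for illustration.
    for release in ("1.9.1", "1.9.2", "1.9.3"):
        print(release,
              "range:", satisfies(release, AVRO_RANGE),
              "pin:", satisfies(release, AVRO_PIN))
```

Under the range, the broken 1.9.2 release is skipped while a later fixed
release would be picked up without a new Beam release; under the pin, only
1.9.1 is ever accepted. The same check shows why "pytz>=2018.3" lets users
move to 2019.3 on their own.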


>
>
> On Wed, Feb 12, 2020 at 9:14 PM Ahmet Altay <al...@google.com> wrote:
>
>> Related: we have dependencies on avro, avro-python3, and fastavro.
>> fastavro supports both python 2 and 3. Could we reduce this dependency list
>> and depend only on fastavro? If we need avro and avro-python3 for the
>> purposes of testing only, we can move them to test only dependencies.
>>
>> +Chamikara Jayalath <ch...@google.com>, because I vaguely remember
>> him working on this.
>>
>> The reason I am calling for this is the impact of bad dependency releases
>> are high. All previously released Beam versions will be impacted. Reducing
>> the dependency list will reduce the risk.
>>
>> Ahmet
>>
>> On Wed, Feb 12, 2020 at 12:02 PM Ahmet Altay <al...@google.com> wrote:
>>
>>> Thank you Valentyn!
>>>
>>> On Wed, Feb 12, 2020 at 11:32 AM Valentyn Tymofieiev <
>>> valentyn@google.com> wrote:
>>>
>>>> Yes, otherwise all Python tests will continue to fail until Avro comes
>>>> up with a new release. Sent: https://github.com/apache/beam/pull/10844
>>>>
>>>> On Wed, Feb 12, 2020 at 11:08 AM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> Should we update Beam's setup.py to skip this avro-python3 version?
>>>>>
>>>>> On Wed, Feb 12, 2020 at 10:57 AM Alan Krumholz <
>>>>> alan.krumholz@betterup.co> wrote:
>>>>>
>>>>>> makes sense. I'll add this workaround for now.
>>>>>> Thanks so much for your help!
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <
>>>>>> valentyn@google.com> wrote:
>>>>>>
>>>>>>> Alan, Dataflow workers preinstall Beam SDK dependencies, including
>>>>>>> (a working version) of avro-python3. So after reading your email once
>>>>>>> again, I think in your case you were not able to install Beam SDK locally.
>>>>>>> So a workaround for you would be to `pip install avro-python3==1.9.1` or
>>>>>>> `pip install pycodestyle`  before installing Beam, until AVRO-2737
>>>>>>> is resolved.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <
>>>>>>> valentyn@google.com> wrote:
>>>>>>>
>>>>>>>> Ah, there's already https://issues.apache.org/jira/browse/AVRO-2737 and
>>>>>>>> it received attention.
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <
>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>
>>>>>>>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>>>>>>>
>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Here's a short repro:
>>>>>>>>>>
>>>>>>>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>>>>>>>> root@04b45a100d16:/# pip install avro-python3
>>>>>>>>>> Collecting avro-python3
>>>>>>>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>>>>>>>     ERROR: Command errored out with exit status 1:
>>>>>>>>>>      command: /usr/local/bin/python -c 'import sys, setuptools,
>>>>>>>>>> tokenize; sys.argv[0] =
>>>>>>>>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>>>>>>>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>>>>>>>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>>>>>>>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>>>>>>>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>>>>>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>>>>>>>     Complete output (5 lines):
>>>>>>>>>>     Traceback (most recent call last):
>>>>>>>>>>       File "<string>", line 1, in <module>
>>>>>>>>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py",
>>>>>>>>>> line 41, in <module>
>>>>>>>>>>         import pycodestyle
>>>>>>>>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>>>>>>>>     ----------------------------------------
>>>>>>>>>> ERROR: Command errored out with exit status 1: python setup.py
>>>>>>>>>> egg_info Check the logs for full command output.
>>>>>>>>>> root@04b45a100d16:/#
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yes, it is a bug in the recent Avro release. We should report it
>>>>>>>>>>> to the Avro maintainers. The workaround is to downgrade avro-python3 to
>>>>>>>>>>> 1.9.1, for example via requirements.txt.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <
>>>>>>>>>>> sniemitz@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>>>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> +dev <de...@beam.apache.org>
>>>>>>>>>>>>>
>>>>>>>>>>>>> There was recently an update to add autoformatting to the
>>>>>>>>>>>>> Python SDK[1].
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1:
>>>>>>>>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Some more information for this as I still can't get to fix
>>>>>>>>>>>>>> it....
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This job is triggered using the beam[gcp] python sdk from a
>>>>>>>>>>>>>> KubeFlow Pipelines component which runs on top of docker image:
>>>>>>>>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I just checked and that image hasn't been updated recently. I
>>>>>>>>>>>>>> also redeployed my pipeline to another (older) deployment of KFP and it
>>>>>>>>>>>>>> gives me the same error (which tells me this isn't an internal KFP problem)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The exact same pipeline/code running on the exact same image
>>>>>>>>>>>>>> has been running fine for days. Did anything changed on the beam/dataflow
>>>>>>>>>>>>>> side since yesterday morning?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for your help! this is a production pipeline that is
>>>>>>>>>>>>>> not running for us :(
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi, I have a scheduled daily job that I have been running
>>>>>>>>>>>>>>> fine in dataflow for days now.
>>>>>>>>>>>>>>> We haven't changed anything on this code but this morning
>>>>>>>>>>>>>>> run failed  (it couldn't even spin up the job)
>>>>>>>>>>>>>>> The job submits a setup.py file (that also hasn't changed)
>>>>>>>>>>>>>>> but maybe is causing the problem? (based on the error I'm getting)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>>>>>>>>> File "/tmp/pip-install-42zyi89t/avro-python3/setup.py",
>>>>>>>>>>>>>>> line 41, in <module>
>>>>>>>>>>>>>>> import pycodestyle
>>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with
>>>>>>>>>>>>>>> error code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py
>>>>>>>>>>>>>>> egg_info:
>>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>>>>>>>>> File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py",
>>>>>>>>>>>>>>> line 41, in <module>
>>>>>>>>>>>>>>> import pycodestyle
>>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with
>>>>>>>>>>>>>>> error code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>

Re: daily dataflow job failing today

Posted by Ismaël Mejía <ie...@gmail.com>.
Independently of the bug in the dependency release, the fact that the Beam
Python SDK does not pin its dependencies to fixed versions is error-prone.
We may continue to have this kind of problem (with other dependencies too)
until we fix this. In the Java SDK we do not accept such dynamic dependency
versions, and Python should probably follow the same practice to avoid
issues like the present one.

Why don't we just do:

    'avro-python3==1.9.1',

instead of the current:

    'avro-python3>=1.8.1,!=1.9.2,<2.0.0; python_version >= "3.0"',


On Wed, Feb 12, 2020 at 9:14 PM Ahmet Altay <al...@google.com> wrote:

> Related: we have dependencies on avro, avro-python3, and fastavro.
> fastavro supports both python 2 and 3. Could we reduce this dependency list
> and depend only on fastavro? If we need avro and avro-python3 for the
> purposes of testing only, we can move them to test only dependencies.
>
> +Chamikara Jayalath <ch...@google.com>, because I vaguely remember
> him working on this.
>
> The reason I am calling for this is the impact of bad dependency releases
> are high. All previously released Beam versions will be impacted. Reducing
> the dependency list will reduce the risk.
>
> Ahmet
>
> On Wed, Feb 12, 2020 at 12:02 PM Ahmet Altay <al...@google.com> wrote:
>
>> Thank you Valentyn!
>>
>> On Wed, Feb 12, 2020 at 11:32 AM Valentyn Tymofieiev <va...@google.com>
>> wrote:
>>
>>> Yes, otherwise all Python tests will continue to fail until Avro comes
>>> up with a new release. Sent: https://github.com/apache/beam/pull/10844
>>>
>>> On Wed, Feb 12, 2020 at 11:08 AM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> Should we update Beam's setup.py to skip this avro-python3 version?
>>>>
>>>> On Wed, Feb 12, 2020 at 10:57 AM Alan Krumholz <
>>>> alan.krumholz@betterup.co> wrote:
>>>>
>>>>> makes sense. I'll add this workaround for now.
>>>>> Thanks so much for your help!
>>>>>
>>>>> On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <
>>>>> valentyn@google.com> wrote:
>>>>>
>>>>>> Alan, Dataflow workers preinstall Beam SDK dependencies, including (a
>>>>>> working version) of avro-python3. So after reading your email once again, I
>>>>>> think in your case you were not able to install Beam SDK locally. So a
>>>>>> workaround for you would be to `pip install avro-python3==1.9.1` or `pip
>>>>>> install pycodestyle`  before installing Beam, until AVRO-2737 is resolved.
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <
>>>>>> valentyn@google.com> wrote:
>>>>>>
>>>>>>> Ah, there's already https://issues.apache.org/jira/browse/AVRO-2737 and
>>>>>>> it received attention.
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <
>>>>>>> valentyn@google.com> wrote:
>>>>>>>
>>>>>>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>
>>>>>>>>> Here's a short repro:
>>>>>>>>>
>>>>>>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>>>>>>> root@04b45a100d16:/# pip install avro-python3
>>>>>>>>> Collecting avro-python3
>>>>>>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>>>>>>     ERROR: Command errored out with exit status 1:
>>>>>>>>>      command: /usr/local/bin/python -c 'import sys, setuptools,
>>>>>>>>> tokenize; sys.argv[0] =
>>>>>>>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>>>>>>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>>>>>>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>>>>>>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>>>>>>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>>>>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>>>>>>     Complete output (5 lines):
>>>>>>>>>     Traceback (most recent call last):
>>>>>>>>>       File "<string>", line 1, in <module>
>>>>>>>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line
>>>>>>>>> 41, in <module>
>>>>>>>>>         import pycodestyle
>>>>>>>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>>>>>>>     ----------------------------------------
>>>>>>>>> ERROR: Command errored out with exit status 1: python setup.py
>>>>>>>>> egg_info Check the logs for full command output.
>>>>>>>>> root@04b45a100d16:/#
>>>>>>>>>
>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yes, it is a bug in the recent Avro release. We should report it
>>>>>>>>>> to the Avro maintainers. The workaround is to downgrade avro-python3 to
>>>>>>>>>> 1.9.1, for example via requirements.txt.
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <
>>>>>>>>>> sniemitz@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +dev <de...@beam.apache.org>
>>>>>>>>>>>>
>>>>>>>>>>>> There was recently an update to add autoformatting to the
>>>>>>>>>>>> Python SDK[1].
>>>>>>>>>>>>
>>>>>>>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>>>>>>>
>>>>>>>>>>>> 1:
>>>>>>>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Some more information for this as I still can't get to fix
>>>>>>>>>>>>> it....
>>>>>>>>>>>>>
>>>>>>>>>>>>> This job is triggered using the beam[gcp] python sdk from a
>>>>>>>>>>>>> KubeFlow Pipelines component which runs on top of docker image:
>>>>>>>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>>>>>>>
>>>>>>>>>>>>> I just checked and that image hasn't been updated recently. I
>>>>>>>>>>>>> also redeployed my pipeline to another (older) deployment of KFP and it
>>>>>>>>>>>>> gives me the same error (which tells me this isn't an internal KFP problem)
>>>>>>>>>>>>>
>>>>>>>>>>>>> The exact same pipeline/code running on the exact same image
>>>>>>>>>>>>> has been running fine for days. Did anything changed on the beam/dataflow
>>>>>>>>>>>>> side since yesterday morning?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for your help! this is a production pipeline that is
>>>>>>>>>>>>> not running for us :(
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi, I have a scheduled daily job that I have been running
>>>>>>>>>>>>>> fine in dataflow for days now.
>>>>>>>>>>>>>> We haven't changed anything on this code but this morning run
>>>>>>>>>>>>>> failed  (it couldn't even spin up the job)
>>>>>>>>>>>>>> The job submits a setup.py file (that also hasn't changed)
>>>>>>>>>>>>>> but maybe is causing the problem? (based on the error I'm getting)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>>>>>>>> File "/tmp/pip-install-42zyi89t/avro-python3/setup.py",
>>>>>>>>>>>>>> line 41, in <module>
>>>>>>>>>>>>>> import pycodestyle
>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error
>>>>>>>>>>>>>> code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py
>>>>>>>>>>>>>> egg_info:
>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>>>>>>>> File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py",
>>>>>>>>>>>>>> line 41, in <module>
>>>>>>>>>>>>>> import pycodestyle
>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with
>>>>>>>>>>>>>> error code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>

Re: daily dataflow job failing today

Posted by Ahmet Altay <al...@google.com>.
Related: we have dependencies on avro, avro-python3, and fastavro. fastavro
supports both Python 2 and 3. Could we reduce this dependency list and
depend only on fastavro? If we need avro and avro-python3 only for testing,
we can move them to test-only dependencies.

+Chamikara Jayalath <ch...@google.com>, because I vaguely remember him
working on this.

The reason I am calling for this is that the impact of a bad dependency
release is high: all previously released Beam versions are affected.
Reducing the dependency list will reduce the risk.
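
As a rough illustration of the proposal above, the split between runtime
and test-only dependencies could look something like the sketch below. This
is not Beam's actual setup.py: the variable names, the unversioned fastavro
entry, and the name of the `test` extra are all assumptions made for the
example. Only the avro-python3 specifier is taken from this thread.

```python
# Hypothetical sketch, not Beam's actual setup.py. The names below and the
# unversioned 'fastavro' entry are assumptions made for illustration.

# Runtime dependencies: resolved for every user who installs the SDK, so a
# bad upstream release here breaks installs of already-released versions.
REQUIRED_PACKAGES = [
    'fastavro',
]

# Test-only dependencies: resolved only when the 'test' extra is requested,
# e.g. by developers and CI, not by ordinary users.
REQUIRED_TEST_PACKAGES = [
    'avro-python3>=1.8.1,!=1.9.2,<2.0.0; python_version >= "3.0"',
]

# These would be passed to setuptools.setup() as keyword arguments.
SETUP_KWARGS = {
    'install_requires': REQUIRED_PACKAGES,
    'extras_require': {'test': REQUIRED_TEST_PACKAGES},
}
```

With a layout like this, a broken avro-python3 release would affect only
installs that explicitly ask for the test extra.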

Ahmet

On Wed, Feb 12, 2020 at 12:02 PM Ahmet Altay <al...@google.com> wrote:

> Thank you Valentyn!
>
> On Wed, Feb 12, 2020 at 11:32 AM Valentyn Tymofieiev <va...@google.com>
> wrote:
>
>> Yes, otherwise all Python tests will continue to fail until Avro comes up
>> with a new release. Sent: https://github.com/apache/beam/pull/10844
>>
>> On Wed, Feb 12, 2020 at 11:08 AM Ahmet Altay <al...@google.com> wrote:
>>
>>> Should we update Beam's setup.py to skip this avro-python3 version?
>>>
>>> On Wed, Feb 12, 2020 at 10:57 AM Alan Krumholz <
>>> alan.krumholz@betterup.co> wrote:
>>>
>>>> makes sense. I'll add this workaround for now.
>>>> Thanks so much for your help!
>>>>
>>>> On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <
>>>> valentyn@google.com> wrote:
>>>>
>>>>> Alan, Dataflow workers preinstall Beam SDK dependencies, including (a
>>>>> working version) of avro-python3. So after reading your email once again, I
>>>>> think in your case you were not able to install Beam SDK locally. So a
>>>>> workaround for you would be to `pip install avro-python3==1.9.1` or `pip
>>>>> install pycodestyle`  before installing Beam, until AVRO-2737 is resolved.
>>>>>
>>>>>
>>>>> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <
>>>>> valentyn@google.com> wrote:
>>>>>
>>>>>> Ah, there's already https://issues.apache.org/jira/browse/AVRO-2737 and
>>>>>> it received attention.
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <
>>>>>> valentyn@google.com> wrote:
>>>>>>
>>>>>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>> valentyn@google.com> wrote:
>>>>>>>
>>>>>>>> Here's a short repro:
>>>>>>>>
>>>>>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>>>>>> root@04b45a100d16:/# pip install avro-python3
>>>>>>>> Collecting avro-python3
>>>>>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>>>>>     ERROR: Command errored out with exit status 1:
>>>>>>>>      command: /usr/local/bin/python -c 'import sys, setuptools,
>>>>>>>> tokenize; sys.argv[0] =
>>>>>>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>>>>>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>>>>>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>>>>>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>>>>>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>>>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>>>>>     Complete output (5 lines):
>>>>>>>>     Traceback (most recent call last):
>>>>>>>>       File "<string>", line 1, in <module>
>>>>>>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line
>>>>>>>> 41, in <module>
>>>>>>>>         import pycodestyle
>>>>>>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>>>>>>     ----------------------------------------
>>>>>>>> ERROR: Command errored out with exit status 1: python setup.py
>>>>>>>> egg_info Check the logs for full command output.
>>>>>>>> root@04b45a100d16:/#
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>>> valentyn@google.com> wrote:
>>>>>>>>
>>>>>>>>> Yes, it is a bug in the recent Avro release. We should report it
>>>>>>>>> to the Avro maintainers. The workaround is to downgrade avro-python3 to
>>>>>>>>> 1.9.1, for example via requirements.txt.
>>>>>>>>>
>>>>>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <
>>>>>>>>> sniemitz@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> +dev <de...@beam.apache.org>
>>>>>>>>>>>
>>>>>>>>>>> There was recently an update to add autoformatting to the Python
>>>>>>>>>>> SDK[1].
>>>>>>>>>>>
>>>>>>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>>>>>>
>>>>>>>>>>> 1:
>>>>>>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Some more information for this as I still can't get to fix
>>>>>>>>>>>> it....
>>>>>>>>>>>>
>>>>>>>>>>>> This job is triggered using the beam[gcp] python sdk from a
>>>>>>>>>>>> KubeFlow Pipelines component which runs on top of docker image:
>>>>>>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>>>>>>
>>>>>>>>>>>> I just checked and that image hasn't been updated recently. I
>>>>>>>>>>>> also redeployed my pipeline to another (older) deployment of KFP and it
>>>>>>>>>>>> gives me the same error (which tells me this isn't an internal KFP problem)
>>>>>>>>>>>>
>>>>>>>>>>>> The exact same pipeline/code running on the exact same image
>>>>>>>>>>>> has been running fine for days. Did anything changed on the beam/dataflow
>>>>>>>>>>>> side since yesterday morning?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for your help! this is a production pipeline that is not
>>>>>>>>>>>> running for us :(
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi, I have a scheduled daily job that I have been running fine
>>>>>>>>>>>>> in dataflow for days now.
>>>>>>>>>>>>> We haven't changed anything on this code but this morning run
>>>>>>>>>>>>> failed  (it couldn't even spin up the job)
>>>>>>>>>>>>> The job submits a setup.py file (that also hasn't changed) but
>>>>>>>>>>>>> maybe is causing the problem? (based on the error I'm getting)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>
>>>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>>>>>>> File "/tmp/pip-install-42zyi89t/avro-python3/setup.py",
>>>>>>>>>>>>> line 41, in <module>
>>>>>>>>>>>>> import pycodestyle
>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error
>>>>>>>>>>>>> code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>>>>>>> ERROR: Complete output from command python setup.py
>>>>>>>>>>>>> egg_info:
>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>>>>>>> File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py",
>>>>>>>>>>>>> line 41, in <module>
>>>>>>>>>>>>> import pycodestyle
>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error
>>>>>>>>>>>>> code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>>>>>>
>>>>>>>>>>>>

Re: daily dataflow job failing today

Posted by Ahmet Altay <al...@google.com>.
Thank you Valentyn!

On Wed, Feb 12, 2020 at 11:32 AM Valentyn Tymofieiev <va...@google.com>
wrote:

> Yes, otherwise all Python tests will continue to fail until Avro comes up
> with a new release. Sent: https://github.com/apache/beam/pull/10844
>
> On Wed, Feb 12, 2020 at 11:08 AM Ahmet Altay <al...@google.com> wrote:
>
>> Should we update Beam's setup.py to skip this avro-python3 version?
>>
>> On Wed, Feb 12, 2020 at 10:57 AM Alan Krumholz <al...@betterup.co>
>> wrote:
>>
>>> makes sense. I'll add this workaround for now.
>>> Thanks so much for your help!
>>>
>>> On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <
>>> valentyn@google.com> wrote:
>>>
>>>> Alan, Dataflow workers preinstall Beam SDK dependencies, including (a
>>>> working version) of avro-python3. So after reading your email once again, I
>>>> think in your case you were not able to install Beam SDK locally. So a
>>>> workaround for you would be to `pip install avro-python3==1.9.1` or `pip
>>>> install pycodestyle`  before installing Beam, until AVRO-2737 is resolved.
>>>>
>>>>
>>>> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <
>>>> valentyn@google.com> wrote:
>>>>
>>>>> Ah, there's already https://issues.apache.org/jira/browse/AVRO-2737 and
>>>>> it received attention.
>>>>>
>>>>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <
>>>>> valentyn@google.com> wrote:
>>>>>
>>>>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>> valentyn@google.com> wrote:
>>>>>>
>>>>>>> Here's a short repro:
>>>>>>>
>>>>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>>>>> root@04b45a100d16:/# pip install avro-python3
>>>>>>> Collecting avro-python3
>>>>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>>>>     ERROR: Command errored out with exit status 1:
>>>>>>>      command: /usr/local/bin/python -c 'import sys, setuptools,
>>>>>>> tokenize; sys.argv[0] =
>>>>>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>>>>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>>>>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>>>>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>>>>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>>>>     Complete output (5 lines):
>>>>>>>     Traceback (most recent call last):
>>>>>>>       File "<string>", line 1, in <module>
>>>>>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line
>>>>>>> 41, in <module>
>>>>>>>         import pycodestyle
>>>>>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>>>>>     ----------------------------------------
>>>>>>> ERROR: Command errored out with exit status 1: python setup.py
>>>>>>> egg_info Check the logs for full command output.
>>>>>>> root@04b45a100d16:/#
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>>> valentyn@google.com> wrote:
>>>>>>>
>>>>>>>> Yes, it is a bug in the recent Avro release. We should report it
>>>>>>>> to the Avro maintainers. The workaround is to downgrade avro-python3 to
>>>>>>>> 1.9.1, for example via requirements.txt.
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <sn...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>>>>
>>>>>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> +dev <de...@beam.apache.org>
>>>>>>>>>>
>>>>>>>>>> There was recently an update to add autoformatting to the Python
>>>>>>>>>> SDK[1].
>>>>>>>>>>
>>>>>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>>>>>
>>>>>>>>>> 1:
>>>>>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>
>>>>>>>>>>> Some more information for this as I still can't get to fix it....
>>>>>>>>>>>
>>>>>>>>>>> This job is triggered using the beam[gcp] python sdk from a
>>>>>>>>>>> KubeFlow Pipelines component which runs on top of docker image:
>>>>>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>>>>>
>>>>>>>>>>> I just checked and that image hasn't been updated recently. I
>>>>>>>>>>> also redeployed my pipeline to another (older) deployment of KFP and it
>>>>>>>>>>> gives me the same error (which tells me this isn't an internal KFP problem)
>>>>>>>>>>>
>>>>>>>>>>> The exact same pipeline/code running on the exact same image has
>>>>>>>>>>> been running fine for days. Did anything changed on the beam/dataflow side
>>>>>>>>>>> since yesterday morning?
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your help! this is a production pipeline that is not
>>>>>>>>>>> running for us :(
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi, I have a scheduled daily job that I have been running fine
>>>>>>>>>>>> in dataflow for days now.
>>>>>>>>>>>> We haven't changed anything on this code but this morning run
>>>>>>>>>>>> failed  (it couldn't even spin up the job)
>>>>>>>>>>>> The job submits a setup.py file (that also hasn't changed) but
>>>>>>>>>>>> maybe is causing the problem? (based on the error I'm getting)
>>>>>>>>>>>>
>>>>>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>> 2 ERROR: Traceback (most recent call last):
>>>>>>>>>>>> 3 File "<string>", line 1, in <module>
>>>>>>>>>>>> 4 File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line
>>>>>>>>>>>> 41, in <module>
>>>>>>>>>>>> 5 import pycodestyle
>>>>>>>>>>>> 6 ImportError: No module named 'pycodestyle'
>>>>>>>>>>>> 7 ----------------------------------------
>>>>>>>>>>>> 8ERROR: Command "python setup.py egg_info" failed with error
>>>>>>>>>>>> code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>>>>>> 9 ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>> 10 ERROR: Traceback (most recent call last):
>>>>>>>>>>>> 11 File "<string>", line 1, in <module>
>>>>>>>>>>>> 12 File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py",
>>>>>>>>>>>> line 41, in <module>
>>>>>>>>>>>> 13 import pycodestyle
>>>>>>>>>>>> 14 ImportError: No module named 'pycodestyle'
>>>>>>>>>>>> 15 ----------------------------------------
>>>>>>>>>>>> 16ERROR: Command "python setup.py egg_info" failed with error
>>>>>>>>>>>> code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>>>>>
>>>>>>>>>>>

Re: daily dataflow job failing today

Posted by Valentyn Tymofieiev <va...@google.com>.
Yes, otherwise all Python tests will continue to fail until Avro comes up
with a new release. Sent: https://github.com/apache/beam/pull/10844

On Wed, Feb 12, 2020 at 11:08 AM Ahmet Altay <al...@google.com> wrote:

> Should we update Beam's setup.py to skip this avro-python3 version?
>
> On Wed, Feb 12, 2020 at 10:57 AM Alan Krumholz <al...@betterup.co>
> wrote:
>
>> makes sense. I'll add this workaround for now.
>> Thanks so much for your help!
>>
>> On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <va...@google.com>
>> wrote:
>>
>>> Alan, Dataflow workers preinstall Beam SDK dependencies, including (a
>>> working version) of avro-python3. So after reading your email once again, I
>>> think in your case you were not able to install Beam SDK locally. So a
>>> workaround for you would be to `pip install avro-python3==1.9.1` or `pip
>>> install pycodestyle`  before installing Beam, until AVRO-2737 is resolved.
>>>
>>>
>>> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <
>>> valentyn@google.com> wrote:
>>>
>>>> Ah, there's already https://issues.apache.org/jira/browse/AVRO-2737 and
>>>> it received attention.
>>>>
>>>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <
>>>> valentyn@google.com> wrote:
>>>>
>>>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>>>
>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>> valentyn@google.com> wrote:
>>>>>
>>>>>> Here's a short repro:
>>>>>>
>>>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>>>> root@04b45a100d16:/# pip install avro-python3
>>>>>> Collecting avro-python3
>>>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>>>     ERROR: Command errored out with exit status 1:
>>>>>>      command: /usr/local/bin/python -c 'import sys, setuptools,
>>>>>> tokenize; sys.argv[0] =
>>>>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>>>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>>>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>>>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>>>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>>>     Complete output (5 lines):
>>>>>>     Traceback (most recent call last):
>>>>>>       File "<string>", line 1, in <module>
>>>>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line
>>>>>> 41, in <module>
>>>>>>         import pycodestyle
>>>>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>>>>     ----------------------------------------
>>>>>> ERROR: Command errored out with exit status 1: python setup.py
>>>>>> egg_info Check the logs for full command output.
>>>>>> root@04b45a100d16:/#
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>>> valentyn@google.com> wrote:
>>>>>>
>>>>>>> Yes, it is a bug in the recent Avro release. We should report it
>>>>>>> to the Avro maintainers. The workaround is to downgrade avro-python3 to
>>>>>>> 1.9.1, for example via requirements.txt.
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <sn...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com> wrote:
>>>>>>>>
>>>>>>>>> +dev <de...@beam.apache.org>
>>>>>>>>>
>>>>>>>>> There was recently an update to add autoformatting to the Python
>>>>>>>>> SDK[1].
>>>>>>>>>
>>>>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>>>>
>>>>>>>>> 1:
>>>>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>>>>
>>>>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>
>>>>>>>>>> Some more information for this as I still can't get to fix it....
>>>>>>>>>>
>>>>>>>>>> This job is triggered using the beam[gcp] python sdk from a
>>>>>>>>>> KubeFlow Pipelines component which runs on top of docker image:
>>>>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>>>>
>>>>>>>>>> I just checked and that image hasn't been updated recently. I
>>>>>>>>>> also redeployed my pipeline to another (older) deployment of KFP and it
>>>>>>>>>> gives me the same error (which tells me this isn't an internal KFP problem)
>>>>>>>>>>
>>>>>>>>>> The exact same pipeline/code running on the exact same image has
>>>>>>>>>> been running fine for days. Did anything changed on the beam/dataflow side
>>>>>>>>>> since yesterday morning?
>>>>>>>>>>
>>>>>>>>>> Thanks for your help! this is a production pipeline that is not
>>>>>>>>>> running for us :(
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi, I have a scheduled daily job that I have been running fine
>>>>>>>>>>> in dataflow for days now.
>>>>>>>>>>> We haven't changed anything on this code but this morning run
>>>>>>>>>>> failed  (it couldn't even spin up the job)
>>>>>>>>>>> The job submits a setup.py file (that also hasn't changed) but
>>>>>>>>>>> maybe is causing the problem? (based on the error I'm getting)
>>>>>>>>>>>
>>>>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>>>>> Thanks!
>>>>>>>>>>>
>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>> 2 ERROR: Traceback (most recent call last):
>>>>>>>>>>> 3 File "<string>", line 1, in <module>
>>>>>>>>>>> 4 File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line
>>>>>>>>>>> 41, in <module>
>>>>>>>>>>> 5 import pycodestyle
>>>>>>>>>>> 6 ImportError: No module named 'pycodestyle'
>>>>>>>>>>> 7 ----------------------------------------
>>>>>>>>>>> 8ERROR: Command "python setup.py egg_info" failed with error
>>>>>>>>>>> code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>>>>> 9 ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>> 10 ERROR: Traceback (most recent call last):
>>>>>>>>>>> 11 File "<string>", line 1, in <module>
>>>>>>>>>>> 12 File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line
>>>>>>>>>>> 41, in <module>
>>>>>>>>>>> 13 import pycodestyle
>>>>>>>>>>> 14 ImportError: No module named 'pycodestyle'
>>>>>>>>>>> 15 ----------------------------------------
>>>>>>>>>>> 16ERROR: Command "python setup.py egg_info" failed with error
>>>>>>>>>>> code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>>>>
>>>>>>>>>>

Re: daily dataflow job failing today

Posted by Ahmet Altay <al...@google.com>.
Should we update Beam's setup.py to skip this avro-python3 version?
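
A minimal sketch of what such an exclusion could look like in a setup.py,
assuming the dependency is declared as a PEP 440 requirement string. The
helper name `avro_requirement`, the constant names, and the lower and upper
version bounds are illustrative assumptions, not taken from Beam's setup.py;
only the excluded release (1.9.2) comes from this thread.

```python
# Hypothetical sketch: names and version bounds are assumptions for
# illustration, not the contents of Beam's actual setup.py.
BROKEN_AVRO_VERSIONS = ("1.9.2",)  # release whose setup.py imports pycodestyle


def avro_requirement(lower="1.8.1", upper="2.0.0", excluded=BROKEN_AVRO_VERSIONS):
    """Build a requirement string that skips known-broken releases."""
    # Each excluded version becomes a "!=X.Y.Z" clause in the specifier.
    exclusions = "".join(",!={}".format(v) for v in sorted(excluded))
    return "avro-python3>={}{},<{}".format(lower, exclusions, upper)


# A list like this would typically be passed to setuptools.setup() as
# install_requires.
REQUIRED_PACKAGES = [avro_requirement()]
print(REQUIRED_PACKAGES[0])
```

With a `!=` clause like this, pip would fall back to the newest release that
still satisfies the range, rather than the broken one.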

On Wed, Feb 12, 2020 at 10:57 AM Alan Krumholz <al...@betterup.co>
wrote:

> makes sense. I'll add this workaround for now.
> Thanks so much for your help!
>
> On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <va...@google.com>
> wrote:
>
>> Alan, Dataflow workers preinstall Beam SDK dependencies, including (a
>> working version) of avro-python3. So after reading your email once again, I
>> think in your case you were not able to install Beam SDK locally. So a
>> workaround for you would be to `pip install avro-python3==1.9.1` or `pip
>> install pycodestyle`  before installing Beam, until AVRO-2737 is resolved.
>>
>>
>> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <va...@google.com>
>> wrote:
>>
>>> Ah, there's already https://issues.apache.org/jira/browse/AVRO-2737 and
>>> it received attention.
>>>
>>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <
>>> valentyn@google.com> wrote:
>>>
>>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>>
>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>> valentyn@google.com> wrote:
>>>>
>>>>> Here's a short repro:
>>>>>
>>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>>> root@04b45a100d16:/# pip install avro-python3
>>>>> Collecting avro-python3
>>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>>     ERROR: Command errored out with exit status 1:
>>>>>      command: /usr/local/bin/python -c 'import sys, setuptools,
>>>>> tokenize; sys.argv[0] =
>>>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>>     Complete output (5 lines):
>>>>>     Traceback (most recent call last):
>>>>>       File "<string>", line 1, in <module>
>>>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line 41,
>>>>> in <module>
>>>>>         import pycodestyle
>>>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>>>     ----------------------------------------
>>>>> ERROR: Command errored out with exit status 1: python setup.py
>>>>> egg_info Check the logs for full command output.
>>>>> root@04b45a100d16:/#
>>>>>
>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>>> valentyn@google.com> wrote:
>>>>>
>>>>>> Yes, it is a bug in the recent Avro release. We should report it
>>>>>> to the Avro maintainers. The workaround is to downgrade avro-python3 to
>>>>>> 1.9.1, for example via requirements.txt.
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <sn...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com> wrote:
>>>>>>>
>>>>>>>> +dev <de...@beam.apache.org>
>>>>>>>>
>>>>>>>> There was recently an update to add autoformatting to the Python
>>>>>>>> SDK[1].
>>>>>>>>
>>>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>>>
>>>>>>>> 1:
>>>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>
>>>>>>>>> Some more information for this as I still can't get to fix it....
>>>>>>>>>
>>>>>>>>> This job is triggered using the beam[gcp] python sdk from a
>>>>>>>>> KubeFlow Pipelines component which runs on top of docker image:
>>>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>>>
>>>>>>>>> I just checked and that image hasn't been updated recently. I also
>>>>>>>>> redeployed my pipeline to another (older) deployment of KFP and it gives me
>>>>>>>>> the same error (which tells me this isn't an internal KFP problem)
>>>>>>>>>
>>>>>>>>> The exact same pipeline/code running on the exact same image has
>>>>>>>>> been running fine for days. Did anything changed on the beam/dataflow side
>>>>>>>>> since yesterday morning?
>>>>>>>>>
>>>>>>>>> Thanks for your help! this is a production pipeline that is not
>>>>>>>>> running for us :(
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>>
>>>>>>>>>> Hi, I have a scheduled daily job that I have been running fine in
>>>>>>>>>> dataflow for days now.
>>>>>>>>>> We haven't changed anything on this code but this morning run
>>>>>>>>>> failed  (it couldn't even spin up the job)
>>>>>>>>>> The job submits a setup.py file (that also hasn't changed) but
>>>>>>>>>> maybe is causing the problem? (based on the error I'm getting)
>>>>>>>>>>
>>>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>> 2 ERROR: Traceback (most recent call last):
>>>>>>>>>> 3 File "<string>", line 1, in <module>
>>>>>>>>>> 4 File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line
>>>>>>>>>> 41, in <module>
>>>>>>>>>> 5 import pycodestyle
>>>>>>>>>> 6 ImportError: No module named 'pycodestyle'
>>>>>>>>>> 7 ----------------------------------------
>>>>>>>>>> 8ERROR: Command "python setup.py egg_info" failed with error
>>>>>>>>>> code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>>>> 9 ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>> 10 ERROR: Traceback (most recent call last):
>>>>>>>>>> 11 File "<string>", line 1, in <module>
>>>>>>>>>> 12 File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line
>>>>>>>>>> 41, in <module>
>>>>>>>>>> 13 import pycodestyle
>>>>>>>>>> 14 ImportError: No module named 'pycodestyle'
>>>>>>>>>> 15 ----------------------------------------
>>>>>>>>>> 16ERROR: Command "python setup.py egg_info" failed with error
>>>>>>>>>> code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>>>
>>>>>>>>>

Re: daily dataflow job failing today

Posted by Alan Krumholz <al...@betterup.co>.
Makes sense. I'll add this workaround for now.
Thanks so much for your help!

On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <va...@google.com>
wrote:

> Alan, Dataflow workers preinstall Beam SDK dependencies, including (a
> working version) of avro-python3. So after reading your email once again, I
> think in your case you were not able to install Beam SDK locally. So a
> workaround for you would be to `pip install avro-python3==1.9.1` or `pip
> install pycodestyle`  before installing Beam, until AVRO-2737 is resolved.
>
>
> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <va...@google.com>
> wrote:
>
>> Ah, there's already https://issues.apache.org/jira/browse/AVRO-2737 and
>> it received attention.
>>
>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <va...@google.com>
>> wrote:
>>
>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>
>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>> valentyn@google.com> wrote:
>>>
>>>> Here's a short repro:
>>>>
>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>> root@04b45a100d16:/# pip install avro-python3
>>>> Collecting avro-python3
>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>     ERROR: Command errored out with exit status 1:
>>>>      command: /usr/local/bin/python -c 'import sys, setuptools,
>>>> tokenize; sys.argv[0] =
>>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>     Complete output (5 lines):
>>>>     Traceback (most recent call last):
>>>>       File "<string>", line 1, in <module>
>>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line 41,
>>>> in <module>
>>>>         import pycodestyle
>>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>>     ----------------------------------------
>>>> ERROR: Command errored out with exit status 1: python setup.py egg_info
>>>> Check the logs for full command output.
>>>> root@04b45a100d16:/#
>>>>
>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>>> valentyn@google.com> wrote:
>>>>
>>>>> Yes, it is a bug in the recent Avro release. We should report it
>>>>> to the Avro maintainers. The workaround is to downgrade avro-python3 to
>>>>> 1.9.1, for example via requirements.txt.
>>>>>
>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <sn...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com> wrote:
>>>>>>
>>>>>>> +dev <de...@beam.apache.org>
>>>>>>>
>>>>>>> There was recently an update to add autoformatting to the Python
>>>>>>> SDK[1].
>>>>>>>
>>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>>
>>>>>>> 1:
>>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>
>>>>>>>> Some more information for this as I still can't get to fix it....
>>>>>>>>
>>>>>>>> This job is triggered using the beam[gcp] python sdk from a
>>>>>>>> KubeFlow Pipelines component which runs on top of docker image:
>>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>>
>>>>>>>> I just checked and that image hasn't been updated recently. I also
>>>>>>>> redeployed my pipeline to another (older) deployment of KFP and it gives me
>>>>>>>> the same error (which tells me this isn't an internal KFP problem)
>>>>>>>>
>>>>>>>> The exact same pipeline/code running on the exact same image has
>>>>>>>> been running fine for days. Did anything changed on the beam/dataflow side
>>>>>>>> since yesterday morning?
>>>>>>>>
>>>>>>>> Thanks for your help! this is a production pipeline that is not
>>>>>>>> running for us :(
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>>
>>>>>>>>> Hi, I have a scheduled daily job that I have been running fine in
>>>>>>>>> dataflow for days now.
>>>>>>>>> We haven't changed anything on this code but this morning run
>>>>>>>>> failed  (it couldn't even spin up the job)
>>>>>>>>> The job submits a setup.py file (that also hasn't changed) but
>>>>>>>>> maybe is causing the problem? (based on the error I'm getting)
>>>>>>>>>
>>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>> 2 ERROR: Traceback (most recent call last):
>>>>>>>>> 3 File "<string>", line 1, in <module>
>>>>>>>>> 4 File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line
>>>>>>>>> 41, in <module>
>>>>>>>>> 5 import pycodestyle
>>>>>>>>> 6 ImportError: No module named 'pycodestyle'
>>>>>>>>> 7 ----------------------------------------
>>>>>>>>> 8ERROR: Command "python setup.py egg_info" failed with error code
>>>>>>>>> 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>>> 9 ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>> 10 ERROR: Traceback (most recent call last):
>>>>>>>>> 11 File "<string>", line 1, in <module>
>>>>>>>>> 12 File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line
>>>>>>>>> 41, in <module>
>>>>>>>>> 13 import pycodestyle
>>>>>>>>> 14 ImportError: No module named 'pycodestyle'
>>>>>>>>> 15 ----------------------------------------
>>>>>>>>> 16ERROR: Command "python setup.py egg_info" failed with error
>>>>>>>>> code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>>
>>>>>>>>

Re: daily dataflow job failing today

Posted by Valentyn Tymofieiev <va...@google.com>.
Alan, Dataflow workers preinstall the Beam SDK dependencies, including a
working version of avro-python3. So after reading your email once again, I
think in your case you were not able to install the Beam SDK locally. A
workaround for you would be to run `pip install avro-python3==1.9.1` or
`pip install pycodestyle` before installing Beam, until AVRO-2737 is resolved.
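
As a rough illustration of the first workaround, the install order could be
scripted as below. This is only a sketch: the function names
`workaround_commands` and `run` are hypothetical, and `apache-beam[gcp]` is
assumed to be the package being installed, based on the "beam[gcp]" mention
earlier in the thread.

```python
import subprocess
import sys


def workaround_commands(python=sys.executable):
    """Return pip commands in the order suggested above (nothing is run here)."""
    pip = [python, "-m", "pip", "install"]
    return [
        # Pin the last avro-python3 release that installs cleanly.
        pip + ["avro-python3==1.9.1"],
        # Assumed target package; pip should keep the avro-python3 already
        # installed as long as it satisfies Beam's version range.
        pip + ["apache-beam[gcp]"],
    ]


def run(commands):
    """Execute each command, stopping at the first failure."""
    for cmd in commands:
        subprocess.check_call(cmd)


# To actually apply the workaround, call: run(workaround_commands())
```

The same pin can be expressed as a single `avro-python3==1.9.1` line in a
requirements.txt that is installed before Beam.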


On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <va...@google.com>
wrote:

> Ah, there's already https://issues.apache.org/jira/browse/AVRO-2737 and
> it received attention.
>
> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <va...@google.com>
> wrote:
>
>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>
>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <va...@google.com>
>> wrote:
>>
>>> Here's a short repro:
>>>
>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>> root@04b45a100d16:/# pip install avro-python3
>>> Collecting avro-python3
>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>     ERROR: Command errored out with exit status 1:
>>>      command: /usr/local/bin/python -c 'import sys, setuptools,
>>> tokenize; sys.argv[0] =
>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>     Complete output (5 lines):
>>>     Traceback (most recent call last):
>>>       File "<string>", line 1, in <module>
>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line 41,
>>> in <module>
>>>         import pycodestyle
>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>     ----------------------------------------
>>> ERROR: Command errored out with exit status 1: python setup.py egg_info
>>> Check the logs for full command output.
>>> root@04b45a100d16:/#
>>>
>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>> valentyn@google.com> wrote:
>>>
>>>> Yes, it is a bug in the recent Avro release. We should report it to the
>>>> Avro maintainers. The workaround is to downgrade avro-python3 to 1.9.1, for
>>>> example via requirements.txt.
>>>>
>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <sn...@apache.org>
>>>> wrote:
>>>>
>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>> added pycodestyle as a dependency, probably related?
>>>>>
>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com> wrote:
>>>>>
>>>>>> +dev <de...@beam.apache.org>
>>>>>>
>>>>>> There was recently an update to add autoformatting to the Python
>>>>>> SDK[1].
>>>>>>
>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>
>>>>>> 1:
>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>
>>>>>>> Some more information for this as I still can't get to fix it....
>>>>>>>
>>>>>>> This job is triggered using the beam[gcp] python sdk from a KubeFlow
>>>>>>> Pipelines component which runs on top of docker image:
>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>
>>>>>>> I just checked and that image hasn't been updated recently. I also
>>>>>>> redeployed my pipeline to another (older) deployment of KFP and it gives me
>>>>>>> the same error (which tells me this isn't an internal KFP problem)
>>>>>>>
>>>>>>> The exact same pipeline/code running on the exact same image has
>>>>>>> been running fine for days. Did anything changed on the beam/dataflow side
>>>>>>> since yesterday morning?
>>>>>>>
>>>>>>> Thanks for your help! this is a production pipeline that is not
>>>>>>> running for us :(
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>
>>>>>>>> Hi, I have a scheduled daily job that I have been running fine in
>>>>>>>> dataflow for days now.
>>>>>>>> We haven't changed anything on this code but this morning run
>>>>>>>> failed  (it couldn't even spin up the job)
>>>>>>>> The job submits a setup.py file (that also hasn't changed) but
>>>>>>>> maybe is causing the problem? (based on the error I'm getting)
>>>>>>>>
>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>> File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line 41,
>>>>>>>> in <module>
>>>>>>>> import pycodestyle
>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>> ----------------------------------------
>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error code
>>>>>>>> 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>> File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line
>>>>>>>> 41, in <module>
>>>>>>>> import pycodestyle
>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>> ----------------------------------------
>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error code
>>>>>>>> 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>
>>>>>>>

Re: daily dataflow job failing today

Posted by Valentyn Tymofieiev <va...@google.com>.
Alan, Dataflow workers preinstall Beam SDK dependencies, including a
working version of avro-python3. After reading your email once again, I
think that in your case you were not able to install the Beam SDK locally.
A workaround for you would be to `pip install avro-python3==1.9.1` or `pip
install pycodestyle` before installing Beam, until AVRO-2737 is resolved.
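
To make the ordering explicit, here is a minimal sketch of the first
workaround (pin avro-python3, then install Beam). The helper name
workaround_commands is made up for illustration, and "apache-beam[gcp]" is
assumed to be the requirement being installed, based on the beam[gcp]
mention earlier in the thread. The sketch only prints the pip commands; it
does not run them.

```python
import sys

# Version the thread reports as installing cleanly; 1.9.2 is the broken release.
AVRO_PIN = "avro-python3==1.9.1"


def workaround_commands(beam_requirement="apache-beam[gcp]"):
    """Return the pip invocations for the workaround, in the order to run them.

    Hypothetical helper: the default requirement string is an assumption.
    """
    pip_install = [sys.executable, "-m", "pip", "install"]
    return [pip_install + [AVRO_PIN], pip_install + [beam_requirement]]


for command in workaround_commands():
    # Printed rather than executed, so the sketch has no side effects.
    print(" ".join(command))
```

Installing the pinned avro-python3 first should let pip treat that
dependency as already satisfied when Beam is installed afterwards, assuming
Beam's own version range for avro-python3 still admits 1.9.1.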


On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <va...@google.com>
wrote:

> Ah, there's already https://issues.apache.org/jira/browse/AVRO-2737 and
> it received attention.
>
> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <va...@google.com>
> wrote:
>
>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>
>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <va...@google.com>
>> wrote:
>>
>>> Here's a short repro:
>>>
>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>> root@04b45a100d16:/# pip install avro-python3
>>> Collecting avro-python3
>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>     ERROR: Command errored out with exit status 1:
>>>      command: /usr/local/bin/python -c 'import sys, setuptools,
>>> tokenize; sys.argv[0] =
>>> '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>     Complete output (5 lines):
>>>     Traceback (most recent call last):
>>>       File "<string>", line 1, in <module>
>>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line 41,
>>> in <module>
>>>         import pycodestyle
>>>     ModuleNotFoundError: No module named 'pycodestyle'
>>>     ----------------------------------------
>>> ERROR: Command errored out with exit status 1: python setup.py egg_info
>>> Check the logs for full command output.
>>> root@04b45a100d16:/#
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <
>>> valentyn@google.com> wrote:
>>>
>>>> Yes, it is a bug in the recent Avro release. We should report it to the
>>>> Avro maintainers. The workaround is to downgrade avro-python3 to 1.9.1, for
>>>> example via requirements.txt.
>>>>
>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <sn...@apache.org>
>>>> wrote:
>>>>
>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>> added pycodestyle as a dependency, probably related?
>>>>>
>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com> wrote:
>>>>>
>>>>>> +dev <de...@beam.apache.org>
>>>>>>
>>>>>> There was recently an update to add autoformatting to the Python
>>>>>> SDK[1].
>>>>>>
>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>
>>>>>> 1:
>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>
>>>>>>> Some more information for this as I still can't get to fix it....
>>>>>>>
>>>>>>> This job is triggered using the beam[gcp] python sdk from a KubeFlow
>>>>>>> Pipelines component which runs on top of docker image:
>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>
>>>>>>> I just checked and that image hasn't been updated recently. I also
>>>>>>> redeployed my pipeline to another (older) deployment of KFP and it gives me
>>>>>>> the same error (which tells me this isn't an internal KFP problem)
>>>>>>>
>>>>>>> The exact same pipeline/code running on the exact same image has
>>>>>>> been running fine for days. Did anything changed on the beam/dataflow side
>>>>>>> since yesterday morning?
>>>>>>>
>>>>>>> Thanks for your help! this is a production pipeline that is not
>>>>>>> running for us :(
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>>
>>>>>>>> Hi, I have a scheduled daily job that I have been running fine in
>>>>>>>> dataflow for days now.
>>>>>>>> We haven't changed anything on this code but this morning run
>>>>>>>> failed  (it couldn't even spin up the job)
>>>>>>>> The job submits a setup.py file (that also hasn't changed) but
>>>>>>>> maybe is causing the problem? (based on the error I'm getting)
>>>>>>>>
>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>> File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line 41,
>>>>>>>> in <module>
>>>>>>>> import pycodestyle
>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>> ----------------------------------------
>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error code
>>>>>>>> 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>> File "<string>", line 1, in <module>
>>>>>>>> File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line
>>>>>>>> 41, in <module>
>>>>>>>> import pycodestyle
>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>> ----------------------------------------
>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error code
>>>>>>>> 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>
>>>>>>>

Re: daily dataflow job failing today

Posted by Valentyn Tymofieiev <va...@google.com>.
Ah, there's already https://issues.apache.org/jira/browse/AVRO-2737 and it
received attention.

On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <va...@google.com>
wrote:

> Opened https://issues.apache.org/jira/browse/AVRO-2738
>
> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <va...@google.com>
> wrote:
>
>> Here's a short repro:
>>
>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>> root@04b45a100d16:/# pip install avro-python3
>> Collecting avro-python3
>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>     ERROR: Command errored out with exit status 1:
>>      command: /usr/local/bin/python -c 'import sys, setuptools, tokenize;
>> sys.argv[0] = '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
>> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>     Complete output (5 lines):
>>     Traceback (most recent call last):
>>       File "<string>", line 1, in <module>
>>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line 41, in
>> <module>
>>         import pycodestyle
>>     ModuleNotFoundError: No module named 'pycodestyle'
>>     ----------------------------------------
>> ERROR: Command errored out with exit status 1: python setup.py egg_info
>> Check the logs for full command output.
>> root@04b45a100d16:/#
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <va...@google.com>
>> wrote:
>>
>>> Yes, it is a bug in the recent Avro release. We should report it to the
>>> Avro maintainers. The workaround is to downgrade avro-python3 to 1.9.1, for
>>> example via requirements.txt.
>>>
>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <sn...@apache.org>
>>> wrote:
>>>
>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>> added pycodestyle as a dependency, probably related?
>>>>
>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com> wrote:
>>>>
>>>>> +dev <de...@beam.apache.org>
>>>>>
>>>>> There was recently an update to add autoformatting to the Python
>>>>> SDK[1].
>>>>>
>>>>> I'm seeing this during testing of a PR as well.
>>>>>
>>>>> 1:
>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>
>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>>> alan.krumholz@betterup.co> wrote:
>>>>>
>>>>>> Some more information for this as I still can't get to fix it....
>>>>>>
>>>>>> This job is triggered using the beam[gcp] python sdk from a KubeFlow
>>>>>> Pipelines component which runs on top of docker image:
>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>
>>>>>> I just checked and that image hasn't been updated recently. I also
>>>>>> redeployed my pipeline to another (older) deployment of KFP and it gives me
>>>>>> the same error (which tells me this isn't an internal KFP problem)
>>>>>>
>>>>>> The exact same pipeline/code running on the exact same image has been
>>>>>> running fine for days. Did anything changed on the beam/dataflow side since
>>>>>> yesterday morning?
>>>>>>
>>>>>> Thanks for your help! this is a production pipeline that is not
>>>>>> running for us :(
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>>> alan.krumholz@betterup.co> wrote:
>>>>>>
>>>>>>> Hi, I have a scheduled daily job that I have been running fine in
>>>>>>> dataflow for days now.
>>>>>>> We haven't changed anything on this code but this morning run
>>>>>>> failed  (it couldn't even spin up the job)
>>>>>>> The job submits a setup.py file (that also hasn't changed) but maybe
>>>>>>> is causing the problem? (based on the error I'm getting)
>>>>>>>
>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>> Thanks!
>>>>>>>
>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>> File "<string>", line 1, in <module>
>>>>>>> File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line 41,
>>>>>>> in <module>
>>>>>>> import pycodestyle
>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>> ----------------------------------------
>>>>>>> ERROR: Command "python setup.py egg_info" failed with error code 1
>>>>>>> in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>> File "<string>", line 1, in <module>
>>>>>>> File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line 41,
>>>>>>> in <module>
>>>>>>> import pycodestyle
>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>> ----------------------------------------
>>>>>>> ERROR: Command "python setup.py egg_info" failed with error code
>>>>>>> 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>
>>>>>>

Re: daily dataflow job failing today

Posted by Valentyn Tymofieiev <va...@google.com>.
Opened https://issues.apache.org/jira/browse/AVRO-2738

On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <va...@google.com>
wrote:

> Here's a short repro:
>
> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
> root@04b45a100d16:/# pip install avro-python3
> Collecting avro-python3
>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>     ERROR: Command errored out with exit status 1:
>      command: /usr/local/bin/python -c 'import sys, setuptools, tokenize;
> sys.argv[0] = '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
> __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
> egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>          cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>     Complete output (5 lines):
>     Traceback (most recent call last):
>       File "<string>", line 1, in <module>
>       File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line 41, in
> <module>
>         import pycodestyle
>     ModuleNotFoundError: No module named 'pycodestyle'
>     ----------------------------------------
> ERROR: Command errored out with exit status 1: python setup.py egg_info
> Check the logs for full command output.
> root@04b45a100d16:/#
>
>
>
>
>
>
>
>
>
> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <va...@google.com>
> wrote:
>
>> Yes, it is a bug in the recent Avro release. We should report it to the
>> Avro maintainers. The workaround is to downgrade avro-python3 to 1.9.1, for
>> example via requirements.txt.
>>
>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <sn...@apache.org>
>> wrote:
>>
>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>> added pycodestyle as a dependency, probably related?
>>>
>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com> wrote:
>>>
>>>> +dev <de...@beam.apache.org>
>>>>
>>>> There was recently an update to add autoformatting to the Python SDK[1].
>>>>
>>>> I'm seeing this during testing of a PR as well.
>>>>
>>>> 1:
>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>
>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <
>>>> alan.krumholz@betterup.co> wrote:
>>>>
>>>>> Some more information for this as I still can't get to fix it....
>>>>>
>>>>> This job is triggered using the beam[gcp] python sdk from a KubeFlow
>>>>> Pipelines component which runs on top of docker image:
>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>
>>>>> I just checked and that image hasn't been updated recently. I also
>>>>> redeployed my pipeline to another (older) deployment of KFP and it gives me
>>>>> the same error (which tells me this isn't an internal KFP problem)
>>>>>
>>>>> The exact same pipeline/code running on the exact same image has been
>>>>> running fine for days. Did anything changed on the beam/dataflow side since
>>>>> yesterday morning?
>>>>>
>>>>> Thanks for your help! this is a production pipeline that is not
>>>>> running for us :(
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>>> alan.krumholz@betterup.co> wrote:
>>>>>
>>>>>> Hi, I have a scheduled daily job that I have been running fine in
>>>>>> dataflow for days now.
>>>>>> We haven't changed anything on this code but this morning run failed
>>>>>> (it couldn't even spin up the job)
>>>>>> The job submits a setup.py file (that also hasn't changed) but maybe
>>>>>> is causing the problem? (based on the error I'm getting)
>>>>>>
>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>> Thanks!
>>>>>>
>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>> ERROR: Traceback (most recent call last):
>>>>>> File "<string>", line 1, in <module>
>>>>>> File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line 41,
>>>>>> in <module>
>>>>>> import pycodestyle
>>>>>> ImportError: No module named 'pycodestyle'
>>>>>> ----------------------------------------
>>>>>> ERROR: Command "python setup.py egg_info" failed with error code 1
>>>>>> in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>> ERROR: Traceback (most recent call last):
>>>>>> File "<string>", line 1, in <module>
>>>>>> File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line 41,
>>>>>> in <module>
>>>>>> import pycodestyle
>>>>>> ImportError: No module named 'pycodestyle'
>>>>>> ----------------------------------------
>>>>>> ERROR: Command "python setup.py egg_info" failed with error code 1
>>>>>> in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>
>>>>>

Re: daily dataflow job failing today

Posted by Valentyn Tymofieiev <va...@google.com>.
Here's a short repro:

:~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
root@04b45a100d16:/# pip install avro-python3
Collecting avro-python3
  Downloading avro-python3-1.9.2.tar.gz (37 kB)
    ERROR: Command errored out with exit status 1:
     command: /usr/local/bin/python -c 'import sys, setuptools, tokenize;
sys.argv[0] = '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';
__file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize,
'"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
'"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
         cwd: /tmp/pip-install-mmy4vspt/avro-python3/
    Complete output (5 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line 41, in
<module>
        import pycodestyle
    ModuleNotFoundError: No module named 'pycodestyle'
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info
Check the logs for full command output.
root@04b45a100d16:/#
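
The traceback shows setup.py doing a bare top-level `import pycodestyle`,
which runs during `egg_info`, before pip has installed anything for the
package. As an illustration of that failure mode only (this is not the
Avro fix, and optional_import is a made-up helper name), a guarded import
might look like:

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return None


# In the repro, a bare top-level `import pycodestyle` raised instead.
pycodestyle = optional_import("pycodestyle")
print("pycodestyle available:", pycodestyle is not None)
```

With a guard like this, a missing lint-only module would not abort the
install; how the Avro maintainers actually resolve it is tracked in the
JIRA issue.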









On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <va...@google.com>
wrote:

> Yes, it is a bug in the recent Avro release. We should report it to the
> Avro maintainers. The workaround is to downgrade avro-python3 to 1.9.1, for
> example via requirements.txt.
>
> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <sn...@apache.org>
> wrote:
>
>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>> added pycodestyle as a dependency, probably related?
>>
>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com> wrote:
>>
>>> +dev <de...@beam.apache.org>
>>>
>>> There was recently an update to add autoformatting to the Python SDK[1].
>>>
>>> I'm seeing this during testing of a PR as well.
>>>
>>> 1:
>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>
>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <al...@betterup.co>
>>> wrote:
>>>
>>>> Some more information for this as I still can't get to fix it....
>>>>
>>>> This job is triggered using the beam[gcp] python sdk from a KubeFlow
>>>> Pipelines component which runs on top of docker image:
>>>> tensorflow/tensorflow:1.13.1-py3
>>>>
>>>> I just checked and that image hasn't been updated recently. I also
>>>> redeployed my pipeline to another (older) deployment of KFP and it gives me
>>>> the same error (which tells me this isn't an internal KFP problem)
>>>>
>>>> The exact same pipeline/code running on the exact same image has been
>>>> running fine for days. Did anything change on the beam/dataflow side since
>>>> yesterday morning?
>>>>
>>>> Thanks for your help! this is a production pipeline that is not running
>>>> for us :(
>>>>
>>>>
>>>>
>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <
>>>> alan.krumholz@betterup.co> wrote:
>>>>
>>>>> Hi, I have a scheduled daily job that I have been running fine in
>>>>> dataflow for days now.
>>>>> We haven't changed anything in this code, but this morning's run failed
>>>>> (it couldn't even spin up the job).
>>>>> The job submits a setup.py file (that also hasn't changed), but maybe
>>>>> it is causing the problem? (based on the error I'm getting)
>>>>>
>>>>> Anyone else having the same issue? or know how to fix it?
>>>>> Thanks!
>>>>>
>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>> ERROR: Traceback (most recent call last):
>>>>>   File "<string>", line 1, in <module>
>>>>>   File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line 41, in <module>
>>>>>     import pycodestyle
>>>>> ImportError: No module named 'pycodestyle'
>>>>> ----------------------------------------
>>>>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>> ERROR: Traceback (most recent call last):
>>>>>   File "<string>", line 1, in <module>
>>>>>   File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line 41, in <module>
>>>>>     import pycodestyle
>>>>> ImportError: No module named 'pycodestyle'
>>>>> ----------------------------------------
>>>>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>
>>>>

Re: daily dataflow job failing today

Posted by Valentyn Tymofieiev <va...@google.com>.
Yes, it is a bug in the recent Avro release. We should report it to the
Avro maintainers. The workaround is to downgrade avro-python3 to 1.9.1, for
example via requirements.txt.
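
A minimal sketch of the requirements.txt workaround, written in Python. The helper name pin_avro and the default file name are illustrative assumptions, not part of Beam or Avro; the only detail taken from this thread is the version to pin (1.9.1).

```python
from pathlib import Path
from typing import List

# Version named in this thread as the one to downgrade to.
AVRO_PIN = "avro-python3==1.9.1"


def pin_avro(requirements_path: str = "requirements.txt") -> List[str]:
    """Make sure the requirements file pins avro-python3 to AVRO_PIN.

    Any existing avro-python3 line is replaced so the pin is not contradicted.
    Returns the lines written to the file.
    """
    path = Path(requirements_path)
    existing = path.read_text().splitlines() if path.exists() else []
    # Keep every requirement except earlier avro-python3 entries.
    kept = [line for line in existing
            if not line.strip().startswith("avro-python3")]
    kept.append(AVRO_PIN)
    path.write_text("\n".join(kept) + "\n")
    return kept
```

The resulting file could then be passed to the job at submission time, for example through Beam's --requirements_file pipeline option, assuming that is how the pipeline stages its dependencies.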

On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <sn...@apache.org> wrote:

> avro-python3 1.9.2 was released on pypi 4 hours ago, and added pycodestyle
> as a dependency, probably related?
>
> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com> wrote:
>
>> +dev <de...@beam.apache.org>
>>
>> There was recently an update to add autoformatting to the Python SDK[1].
>>
>> I'm seeing this during testing of a PR as well.
>>
>> 1:
>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>
>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <al...@betterup.co>
>> wrote:
>>
>>> Some more information for this as I still can't get to fix it....
>>>
>>> This job is triggered using the beam[gcp] python sdk from a KubeFlow
>>> Pipelines component which runs on top of docker image:
>>> tensorflow/tensorflow:1.13.1-py3
>>>
>>> I just checked and that image hasn't been updated recently. I also
>>> redeployed my pipeline to another (older) deployment of KFP and it gives me
>>> the same error (which tells me this isn't an internal KFP problem)
>>>
>>> The exact same pipeline/code running on the exact same image has been
>>> running fine for days. Did anything change on the beam/dataflow side since
>>> yesterday morning?
>>>
>>> Thanks for your help! this is a production pipeline that is not running
>>> for us :(
>>>
>>>
>>>
>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <al...@betterup.co>
>>> wrote:
>>>
>>>> Hi, I have a scheduled daily job that I have been running fine in
>>>> dataflow for days now.
>>>> We haven't changed anything in this code, but this morning's run failed
>>>> (it couldn't even spin up the job).
>>>> The job submits a setup.py file (that also hasn't changed), but maybe it
>>>> is causing the problem? (based on the error I'm getting)
>>>>
>>>> Anyone else having the same issue? or know how to fix it?
>>>> Thanks!
>>>>
>>>> ERROR: Complete output from command python setup.py egg_info:
>>>> ERROR: Traceback (most recent call last):
>>>>   File "<string>", line 1, in <module>
>>>>   File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line 41, in <module>
>>>>     import pycodestyle
>>>> ImportError: No module named 'pycodestyle'
>>>> ----------------------------------------
>>>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>> ERROR: Complete output from command python setup.py egg_info:
>>>> ERROR: Traceback (most recent call last):
>>>>   File "<string>", line 1, in <module>
>>>>   File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line 41, in <module>
>>>>     import pycodestyle
>>>> ImportError: No module named 'pycodestyle'
>>>> ----------------------------------------
>>>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>
>>>

Re: daily dataflow job failing today

Posted by Steve Niemitz <sn...@apache.org>.
avro-python3 1.9.2 was released on pypi 4 hours ago, and added pycodestyle
as a dependency, probably related?
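
A rough sketch of how a release's upload time could be checked, in Python. It assumes the PyPI JSON API at https://pypi.org/pypi/<project>/json and its "releases" mapping with per-file "upload_time" fields; the function names are illustrative, not an existing tool.

```python
import json
from datetime import datetime
from typing import Dict, Optional
from urllib.request import urlopen


def fetch_project_metadata(project: str) -> dict:
    """Download a project's metadata from the PyPI JSON API (needs network)."""
    with urlopen(f"https://pypi.org/pypi/{project}/json") as response:
        return json.load(response)


def release_upload_times(metadata: dict) -> Dict[str, Optional[datetime]]:
    """Map each release version to the earliest upload time of its files.

    Versions with no uploaded files map to None.
    """
    times = {}
    for version, files in metadata.get("releases", {}).items():
        uploads = [datetime.fromisoformat(f["upload_time"]) for f in files]
        times[version] = min(uploads) if uploads else None
    return times
```

Calling release_upload_times(fetch_project_metadata("avro-python3")) and comparing the newest timestamp with the time a job started failing would be one way to confirm a suspicion like this.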

On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <lc...@google.com> wrote:

> +dev <de...@beam.apache.org>
>
> There was recently an update to add autoformatting to the Python SDK[1].
>
> I'm seeing this during testing of a PR as well.
>
> 1:
> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>
> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <al...@betterup.co>
> wrote:
>
>> Some more information for this as I still can't get to fix it....
>>
>> This job is triggered using the beam[gcp] python sdk from a KubeFlow
>> Pipelines component which runs on top of docker image:
>> tensorflow/tensorflow:1.13.1-py3
>>
>> I just checked and that image hasn't been updated recently. I also
>> redeployed my pipeline to another (older) deployment of KFP and it gives me
>> the same error (which tells me this isn't an internal KFP problem)
>>
>> The exact same pipeline/code running on the exact same image has been
>> running fine for days. Did anything change on the beam/dataflow side since
>> yesterday morning?
>>
>> Thanks for your help! this is a production pipeline that is not running
>> for us :(
>>
>>
>>
>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <al...@betterup.co>
>> wrote:
>>
>>> Hi, I have a scheduled daily job that I have been running fine in
>>> dataflow for days now.
>>> We haven't changed anything in this code, but this morning's run failed
>>> (it couldn't even spin up the job).
>>> The job submits a setup.py file (that also hasn't changed), but maybe it
>>> is causing the problem? (based on the error I'm getting)
>>>
>>> Anyone else having the same issue? or know how to fix it?
>>> Thanks!
>>>
>>> ERROR: Complete output from command python setup.py egg_info:
>>> ERROR: Traceback (most recent call last):
>>>   File "<string>", line 1, in <module>
>>>   File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line 41, in <module>
>>>     import pycodestyle
>>> ImportError: No module named 'pycodestyle'
>>> ----------------------------------------
>>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>> ERROR: Complete output from command python setup.py egg_info:
>>> ERROR: Traceback (most recent call last):
>>>   File "<string>", line 1, in <module>
>>>   File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line 41, in <module>
>>>     import pycodestyle
>>> ImportError: No module named 'pycodestyle'
>>> ----------------------------------------
>>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>
>>
