Posted to dev@beam.apache.org by Miguel Hernández Sandoval <ro...@wizeline.com> on 2021/07/14 20:27:03 UTC

[Question] Bug in SdkHarness

Hi team,
I am currently working on solving this bug [1]. The ticket describes a
grpc.FutureTimeoutError raised in SdkHarness.__init__. The harness seems to
get stuck on this line [2], which causes the timeout and makes some tests
fail. Here [3] is how the tests were run, along with the results.

I was wondering if you could give me any pointers on how to debug this and
check the gRPC server's activity to find out what is causing the timeout.

Thanks,
Mike

[1] https://issues.apache.org/jira/browse/BEAM-12163#
[2]
https://github.com/apache/beam/blob/0866bff5ba209f0e5608592a389e439fd26435eb/sdks/python/apache_beam/runners/worker/sdk_worker.py#L188
[3]
https://docs.google.com/document/d/1Pqd0-vuYHSjLr6yQvfGwiK3NcYypT-WrHfjCbP-Xob4/edit?usp=sharing

-- 
*This email and its contents (including any attachments) are being sent to
you on the condition of confidentiality and may be protected by legal
privilege. Access to this email by anyone other than the intended recipient
is unauthorized. If you are not the intended recipient, please immediately
notify the sender by replying to this message and delete the material
immediately from your system. Any further use, dissemination, distribution
or reproduction of this email is strictly prohibited. Further, no
representation is made with respect to any content contained in this email.*

Re: [Question] Bug in SdkHarness

Posted by Miguel Hernández Sandoval <ro...@wizeline.com>.
I hadn't noticed it before, but it seems that the server is not ready the
second time the tests run. I'll look into it.

On Thu, Jul 15, 2021 at 1:11 PM Luke Cwik <lc...@google.com> wrote:

> Were you able to see in the logs that the server was not yet
> ready/launched when the client was trying to connect?

-- 

Miguel Hernández Sandoval | WIZELINE

Software Engineer

rogelio.hernandez@wizeline.com

Amado Nervo 2200, Esfera P6, Col. Jardines del Sol, 45050 Zapopan, Jal.


Re: [Question] Bug in SdkHarness

Posted by Luke Cwik <lc...@google.com>.
Were you able to see in the logs that the server was not yet ready/launched
when the client was trying to connect?
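
One possible way to get more connection detail into the logs is gRPC's own
tracing, which is controlled by the GRPC_VERBOSITY and GRPC_TRACE environment
variables. The sketch below is only an illustration: grpc_debug_env is a
hypothetical helper (not part of Beam or gRPC), and the tracer names are
assumptions, since the tracers that are available depend on the installed
gRPC version.

```python
import os


def grpc_debug_env(base=None, tracers=("connectivity_state", "http")):
    """Return a copy of an environment with gRPC debug logging switched on.

    Hypothetical helper. The default tracer names are assumptions; check
    the gRPC documentation for the version you have installed.
    """
    # Copy so the caller's (or the process's) environment is left untouched.
    env = dict(os.environ if base is None else base)
    env["GRPC_VERBOSITY"] = "DEBUG"
    env["GRPC_TRACE"] = ",".join(tracers)
    return env


if __name__ == "__main__":
    env = grpc_debug_env()
    print(env["GRPC_VERBOSITY"], env["GRPC_TRACE"])
```

The returned dict can be passed as env= to subprocess.run when launching the
failing test, so that the variables are already set when grpc is loaded in
the child process.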

On Thu, Jul 15, 2021 at 11:00 AM Miguel Hernández Sandoval <
rogelio.hernandez@wizeline.com> wrote:

> Yes, I was able to reproduce the error locally and debugged it, but
> haven't found anything.

Re: [Question] Bug in SdkHarness

Posted by Miguel Hernández Sandoval <ro...@wizeline.com>.
Yes, I was able to reproduce the error locally and debugged it, but haven't
found anything.

On Wed, Jul 14, 2021 at 4:38 PM Luke Cwik <lc...@google.com> wrote:

> Repro locally and run with a Python debugger?

Re: [Question] Bug in SdkHarness

Posted by Luke Cwik <lc...@google.com>.
Repro locally and run with a Python debugger?
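
For a hang like this, a lightweight complement to an interactive debugger is
the standard library's faulthandler module, which can print the stack of
every thread if the process is still running after a given delay. This is a
minimal sketch: the two helper names are hypothetical, and where to arm the
watchdog (for example just before the call that appears to block) is left
as an assumption.

```python
import faulthandler
import sys


def dump_stacks_if_stuck(seconds, file=sys.stderr):
    """Arm a watchdog that writes all thread stacks to `file` after `seconds`.

    Hypothetical helper. `file` must stay open until the dump happens or
    the watchdog is cancelled, because faulthandler writes to its file
    descriptor directly.
    """
    faulthandler.dump_traceback_later(
        seconds, repeat=False, file=file, exit=False)


def cancel_stack_dump():
    """Disarm the watchdog once the suspect call has returned."""
    faulthandler.cancel_dump_traceback_later()
```

Arming it with a delay shorter than the timeout seen in the ticket should
show which call each thread is sitting in at that moment.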

On Wed, Jul 14, 2021 at 1:46 PM Miguel Hernández Sandoval <
rogelio.hernandez@wizeline.com> wrote:

> I think it is, but not sure what's causing it. Here's also the latest
> failure [1] of this kind.
>
> [1]
> https://github.com/apache/beam/runs/3063305519?check_suite_focus=true#step:6:358

Re: [Question] Bug in SdkHarness

Posted by Miguel Hernández Sandoval <ro...@wizeline.com>.
I think it is, but I'm not sure what's causing it. Here is also the latest
failure [1] of this kind.

[1]
https://github.com/apache/beam/runs/3063305519?check_suite_focus=true#step:6:358

On Wed, Jul 14, 2021 at 3:33 PM Luke Cwik <lc...@google.com> wrote:

> Is this a timing issue where the server isn't yet ready to accept a client?

Re: [Question] Bug in SdkHarness

Posted by Luke Cwik <lc...@google.com>.
Is this a timing issue where the server isn't yet ready to accept a client?
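
One rough way to test the timing hypothesis is to measure how long it takes
before anything accepts TCP connections on the server's port. The sketch
below uses only the standard library; wait_for_port is a hypothetical helper,
and the host and port to probe are assumptions that depend on how the test
starts its server. A successful connect only shows that something is
listening, not that the gRPC service behind it is ready.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.1):
    """Poll until a TCP connect succeeds.

    Returns the seconds waited, or None if `timeout` elapses first.
    Hypothetical helper, not part of Beam or gRPC.
    """
    start = time.monotonic()
    deadline = start + timeout
    while True:
        try:
            # A successful connect means a listener has been bound.
            with socket.create_connection((host, port), timeout=interval):
                return time.monotonic() - start
        except OSError:
            if time.monotonic() >= deadline:
                return None
            time.sleep(interval)
```

If the reported wait is close to or beyond the client's own timeout, that
would point to the server starting late rather than to a problem in the
client.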

On Wed, Jul 14, 2021 at 1:27 PM Miguel Hernández Sandoval <
rogelio.hernandez@wizeline.com> wrote:

> Hi team,
> I am currently working on solving this bug [1]. The ticket describes a
> grpc.FutureTimeoutError in the SdkHarness.__init__ and it seems that it
> gets stuck in this line [2] causing the timeout and making some tests fail.
> Here [3] is how the tests were run, along with the results.
>
> I was wondering if you could give me any pointers on how to debug this and
> check the gRPC server's activity to find out what is causing the timeout.
>
> Thanks,
> Mike
>
> [1] https://issues.apache.org/jira/browse/BEAM-12163#
> [2]
> https://github.com/apache/beam/blob/0866bff5ba209f0e5608592a389e439fd26435eb/sdks/python/apache_beam/runners/worker/sdk_worker.py#L188
> [3]
> https://docs.google.com/document/d/1Pqd0-vuYHSjLr6yQvfGwiK3NcYypT-WrHfjCbP-Xob4/edit?usp=sharing