Posted to user@beam.apache.org by Evan Galpin <eg...@apache.org> on 2022/08/05 14:35:56 UTC

[Dataflow][Java][stateful] Workflow Failed when trying to introduce stateful RateLimit

Hi all,

I'm trying to create a RateLimit[1] transform that's based fairly heavily
on GroupIntoBatches[2]. I've been able to run unit tests using TestPipeline
to verify desired behaviour and have also run successfully using
DirectRunner.  However, when I submit the same job to Dataflow it
completely fails to start and only gives the error message "Workflow
Failed." The job builds/uploads/submits without error, but never starts and
gives no detail as to why.

Is there anything I can do to gain more insight about what is going wrong?
I've included a gist of the RateLimit[1] code in case there is anything
obvious wrong there.

Thanks in advance,
Evan

[1] https://gist.github.com/egalpin/162a04b896dc7be1d0899acf17e676b3
[2]
https://github.com/apache/beam/blob/c8d92b03b6b2029978dbc2bf824240232c5e61ac/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/GroupIntoBatches.java
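[Editor's note: the gist itself is not reproduced in this archive. As a rough, Beam-independent sketch of the per-key rate-limiting idea such a transform rests on, here is a fixed-window token-bucket limiter in plain Java; all class names, parameters, and the refill policy below are illustrative, not taken from the gist.]

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Minimal sketch of a per-key rate limiter (fixed-window variant of a
 * token bucket). In a real Beam transform this per-key state would live
 * in @StateId cells with timers, not in an in-memory HashMap.
 */
public class PerKeyRateLimiter {
    private final long maxPerInterval;   // tokens refilled each interval
    private final long intervalMillis;   // refill interval length

    // Per-key bucket state: available tokens and last refill time.
    private static final class Bucket {
        long tokens;
        long lastRefillMillis;
        Bucket(long tokens, long now) {
            this.tokens = tokens;
            this.lastRefillMillis = now;
        }
    }

    private final Map<String, Bucket> buckets = new HashMap<>();

    public PerKeyRateLimiter(long maxPerInterval, long intervalMillis) {
        this.maxPerInterval = maxPerInterval;
        this.intervalMillis = intervalMillis;
    }

    /** Returns true if an element for {@code key} may pass at {@code nowMillis}. */
    public synchronized boolean tryAcquire(String key, long nowMillis) {
        Bucket b = buckets.computeIfAbsent(key, k -> new Bucket(maxPerInterval, nowMillis));
        if (nowMillis - b.lastRefillMillis >= intervalMillis) {
            b.tokens = maxPerInterval;   // refill bucket on interval boundary
            b.lastRefillMillis = nowMillis;
        }
        if (b.tokens > 0) {
            b.tokens--;
            return true;                 // under the limit: emit the element
        }
        return false;                    // over the limit: buffer until refill
    }
}
```

In the Beam version, elements that fail to acquire a token would be buffered in state and flushed by a timer (or by @OnWindowExpiration), rather than dropped.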

Re: [Dataflow][Java][stateful] Workflow Failed when trying to introduce stateful RateLimit

Posted by Evan Galpin <eg...@apache.org>.
Thanks for the suggestion Luke! Unfortunately it looks like v2 also fails
in the same way (pipeline does not start at all with "Workflow Failed"
error message).

Thanks,
Evan


Re: [Dataflow][Java][stateful] Workflow Failed when trying to introduce stateful RateLimit

Posted by Luke Cwik via user <us...@beam.apache.org>.
You could try Dataflow Runner v2. The difference in the implementation may
allow you to work around what is impacting the pipelines.


Re: [Dataflow][Java][stateful] Workflow Failed when trying to introduce stateful RateLimit

Posted by Evan Galpin <eg...@apache.org>.
Thanks Luke, I've opened a support case as well, but thought it would be
prudent to ask here in case there was something obvious in the code. Is
there any additional validation I can opt into when building and deploying
the pipeline that might give hints? Otherwise I'll just wait on the support
case.

Thanks,
Evan


Re: [Dataflow][Java][stateful] Workflow Failed when trying to introduce stateful RateLimit

Posted by Luke Cwik via user <us...@beam.apache.org>.
I took a look at the code and nothing obvious stood out to me, since this
is a ParDo with OnWindowExpiration. Just to confirm: the rate limit is per
key; it would only act as a global rate limit if there were a single key.
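[Editor's note: the per-key point above can be seen with a toy grouping example in plain Java; the class and key functions here are hypothetical, not from the gist. State in a stateful ParDo is partitioned by key the same way, so with a constant key every element shares one state cell and a "per-key" limit degenerates into a global one.]

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class KeyingDemo {
    // Groups elements by a key function, mirroring how a stateful ParDo
    // partitions its state: one independent cell per key (and window).
    public static Map<String, List<String>> groupByKey(
            List<String> elements, Function<String, String> keyFn) {
        return elements.stream().collect(Collectors.groupingBy(keyFn));
    }
}
```

A key function like `e -> "global"` yields a single group (one shared limiter); keying by a real element attribute yields independent per-key limiters.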

Are the workers trying to start?
* If no, then you would need to open a support case and share some job IDs
so that someone could debug internal service logs.
* If yes, then did the workers start successfully?
  ** If no, the logs should have some details as to why the worker couldn't start.
  ** If yes, are the workers getting work items?
    *** If no, then you would need to open a support case and share some job
    IDs so that someone could debug internal service logs.
    *** If yes, then the logs should have some details as to why the work
    items are failing.


On Fri, Aug 5, 2022 at 7:36 AM Evan Galpin <eg...@apache.org> wrote:

> Hi all,
>
> I'm trying to create a RateLimit[1] transform that's based fairly heavily
> on GroupIntoBatches[2]. I've been able to run unit tests using TestPipeline
> to verify desired behaviour and have also run successfully using
> DirectRunner.  However, when I submit the same job to Dataflow it
> completely fails to start and only gives the error message "Workflow
> Failed." The job builds/uploads/submits without error, but never starts and
> gives no detail as to why.
>
> Is there anything I can do to gain more insight about what is going
> wrong?  I've included a gist of the RateLimit[1] code in case there is
> anything obvious wrong there.
>
> Thanks in advance,
> Evan
>
> [1] https://gist.github.com/egalpin/162a04b896dc7be1d0899acf17e676b3
> [2]
> https://github.com/apache/beam/blob/c8d92b03b6b2029978dbc2bf824240232c5e61ac/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/GroupIntoBatches.java
>