Posted to dev@beam.apache.org by Victor Chen <ca...@victorplusc.com> on 2022/02/26 00:39:00 UTC

Errors when running FlinkRunner on a Dataproc cluster flink_master

Hi Beam Dev Team,

Ning, Sam, and I are looking to add supporting functionality to the
InteractiveRunner in the case where the underlying_runner is set to
FlinkRunner, and no flink_master is provided, but we are running into a few
issues that we were hoping someone from the dev team would be more familiar
with.

To provide some context, we are looking to automatically create a Dataproc
cluster with all configurations necessary for the FlinkRunner to run, and
we are currently able to get jobs to show up on the Flink dashboard
corresponding to the used cluster, which is available through the YARN
ResourceManager interface.

Currently, any pipeline run against a provided Dataproc cluster
flink_master fails: the job shows up on the Flink dashboard but fails
after a few seconds. The recurring errors are 'NoClassDefFoundError' and
'ServiceConfigurationError', as shown in the examples below. Both errors
occur with the same configuration, with NoClassDefFoundError seeming more
common.

NoClassDefFoundError:
[image: image.png]

ServiceConfigurationError:
[image: image.png]

I have also attached the corresponding root exception logs from the Flink
dashboard.

The clusters are being configured with apache-beam[gcp]==2.35.0 installed,
a Dataproc image with Flink version 1.12.5 installed, and the optional
Docker and Flink Dataproc cluster components. The notebook that is being
used for these runs is also using Apache Beam 2.35.0. Is there additional
setup required for clusters to support Flink?
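
For reference, the setup described above corresponds roughly to the
pipeline flags sketched here. This is only an illustration under stated
assumptions, not the exact notebook code: the helper name
flink_pipeline_args and the master address are hypothetical, and the
Docker environment type is assumed from the Docker component mentioned
above. The flag names themselves (--runner, --flink_master,
--flink_version, --environment_type) are existing Beam Python options.

```python
def flink_pipeline_args(flink_master, flink_version="1.12"):
    """Build Beam pipeline flags for a FlinkRunner job on an existing cluster.

    Hypothetical helper for illustration; it only assembles strings and
    does not contact the cluster.
    """
    if not flink_master:
        raise ValueError("flink_master must be a non-empty 'host:port' string")
    return [
        "--runner=FlinkRunner",
        f"--flink_master={flink_master}",
        # Beam expects major.minor here, so Flink 1.12.5 becomes "1.12".
        f"--flink_version={flink_version}",
        # Assumption: SDK workers run in Docker on the cluster nodes.
        "--environment_type=DOCKER",
    ]


# The address below is a placeholder, not a real cluster.
args = flink_pipeline_args("example-cluster-m:8081")
print(args)
```

With apache_beam installed, a list like this can be passed to
PipelineOptions(args) when constructing the pipeline.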

Are these issues that have been seen before?

Thank you in advance,
Victor

Re: Errors when running FlinkRunner on a Dataproc cluster flink_master

Posted by Kyle Weaver <kc...@google.com>.

Sorry, I'm just now catching up on emails. This looks like a known issue
with an available workaround:
https://issues.apache.org/jira/browse/BEAM-10430

On Tue, Mar 1, 2022 at 4:19 PM Victor Chen <ca...@victorplusc.com> wrote:

> Just to add on - I noticed that the images I had originally attached were
> not sent properly. I have attached them again for reference:
>
> NoClassDefFoundError:
> [image: NoClassDefFoundError.png]
>
> ServiceConfigurationError:
> [image: ServiceConfigurationError.png]
>
> Thank you in advance,
> Victor

Re: Errors when running FlinkRunner on a Dataproc cluster flink_master

Posted by Victor Chen <ca...@victorplusc.com>.

Just to add on - I noticed that the images I had originally attached were
not sent properly. I have attached them again for reference:

NoClassDefFoundError:
[image: NoClassDefFoundError.png]

ServiceConfigurationError:
[image: ServiceConfigurationError.png]

Thank you in advance,
Victor

Re: Errors when running FlinkRunner on a Dataproc cluster flink_master

Posted by Andoni Guzman Becerra <an...@wizeline.com>.

Hi Victor,
I'm facing the same issue while trying to re-enable some load tests that
use Flink. My configuration is Dataproc 2.0 and Flink 1.12, the same as
yours. I'll let you know if I find a solution; if you find one first,
please let me know how you solved it.

On Sun, Feb 27, 2022 at 11:58 PM Ahmet Altay <al...@google.com> wrote:

> /cc @Kyle Weaver <kc...@google.com>

-- 

Andoni Guzman | WIZELINE

Software Engineer II

andoni.guzman@wizeline.com

Amado Nervo 2200, Esfera P6, Col. Ciudad del Sol, 45050 Zapopan, Jal.

Re: Errors when running FlinkRunner on a Dataproc cluster flink_master

Posted by Ahmet Altay <al...@google.com>.

/cc @Kyle Weaver <kc...@google.com>
