Posted to issues@beam.apache.org by "Ahmet Altay (Jira)" <ji...@apache.org> on 2021/12/21 21:07:00 UTC

[jira] [Commented] (BEAM-12898) Flink Load Tests failure- UncheckedExecutionException - leaking vms

    [ https://issues.apache.org/jira/browse/BEAM-12898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17463465#comment-17463465 ] 

Ahmet Altay commented on BEAM-12898:
------------------------------------

A change to use a supported Dataproc cluster image (https://github.com/apache/beam/pull/16310) makes this problem worse: the load tests now fail to create Flink clusters. We went ahead with the change because it was vulnerability-related and the tests were disabled anyway.

Work to re-enable the tests needs to start with fixing the creation of Flink clusters with the updated Dataproc image version (1.5 instead of 1.2).

> Flink Load Tests failure- UncheckedExecutionException - leaking vms
> -------------------------------------------------------------------
>
>                 Key: BEAM-12898
>                 URL: https://issues.apache.org/jira/browse/BEAM-12898
>             Project: Beam
>          Issue Type: Test
>          Components: test-failures
>            Reporter: Alex Amato
>            Assignee: Kyle Weaver
>            Priority: P2
>         Attachments: 6L8weM2p7mDLMJV.png, BmJoKx8T8pZT2Ls.png
>
>          Time Spent: 6h
>  Remaining Estimate: 0h
>
> Same failure from different tests:
> [https://ci-beam.apache.org/job/beam_LoadTests_Go_CoGBK_Flink_Batch/277/console]
> [https://ci-beam.apache.org/job/beam_LoadTests_Go_Combine_Flink_Batch/289/console]
> [https://ci-beam.apache.org/job/beam_LoadTests_Go_GBK_Flink_Batch/290/console]
> [https://ci-beam.apache.org/job/beam_LoadTests_Go_ParDo_Flink_Batch/295/console]
> I think these tests may also be responsible for leaking some GCE VMs on apache-beam-testing: this morning we discovered several VMs that had not been torn down.
> The leaked VMs have names matching these patterns:
>  beam-loadtests-python*flink*
>  beam-loadtests-go*flink*
> e.g.
>  beam-loadtests-go-cogbk-flink-batch-277-m
>  beam-loadtests-go-gbk-flink-batch-290-w-2
>  beam-loadtests-go-pardo-flink-batch-295-m
>  beam-loadtests-go-sideinput-flink-batch-269-w-2
>  beam-loadtests-python-combine-flink-batch-766-m
>  beam-loadtests-python-combine-flink-streaming-368-w-0
>  beam-loadtests-python-pardo-flink-batch-716-m
>  
> It seems that these tests spin up a Dataproc cluster. The GCE metadata on the VMs refers to many Dataproc resources (see attachments). The tests are likely crashing before they run the code that cleans up and shuts down the Dataproc cluster.
> Logs
> ----
> [https://ci-beam.apache.org/job/beam_LoadTests_Go_Combine_Flink_Batch/lastBuild/console]
> 01:43:59 2021/09/14 08:43:59  (): org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Encountered unsupported logical type URN: int
> 01:43:59 2021/09/14 08:43:59  (): org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Encountered unsupported logical type URN: int
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
> 01:43:59   at org.apache.beam.runners.core.construction.RehydratedComponents.getCoder(RehydratedComponents.java:168)
> 01:43:59   at org.apache.beam.runners.fnexecution.wire.WireCoders.instantiateRunnerWireCoder(WireCoders.java:94)
> 01:43:59   at org.apache.beam.runners.fnexecution.wire.WireCoders.instantiateRunnerWireCoder(WireCoders.java:75)
> 01:43:59   at org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator.translateExecutableStage(FlinkBatchPortablePipelineTranslator.java:311)
> 01:43:59   at org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator.translate(FlinkBatchPortablePipelineTranslator.java:272)
> 01:43:59   at org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator.translate(FlinkBatchPortablePipelineTranslator.java:118)
> 01:43:59   at org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:115)
> 01:43:59   at org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:85)
> 01:43:59   at org.apache.beam.runners.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:86)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
> 01:43:59   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 01:43:59   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 01:43:59   at java.lang.Thread.run(Thread.java:748)
> 01:43:59 Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Encountered unsupported logical type URN: int
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
> 01:43:59   at org.apache.beam.runners.core.construction.RehydratedComponents.getCoder(RehydratedComponents.java:168)
> 01:43:59   at org.apache.beam.runners.core.construction.CoderTranslation.fromKnownCoder(CoderTranslation.java:158)
> 01:43:59   at org.apache.beam.runners.core.construction.CoderTranslation.fromProto(CoderTranslation.java:145)
> 01:43:59   at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:87)
> 01:43:59   at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:82)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
> 01:43:59   ... 18 more
> 01:43:59 Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Encountered unsupported logical type URN: int
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
> 01:43:59   at org.apache.beam.runners.core.construction.RehydratedComponents.getCoder(RehydratedComponents.java:168)
> 01:43:59   at org.apache.beam.runners.core.construction.CoderTranslation.fromKnownCoder(CoderTranslation.java:158)
> 01:43:59   at org.apache.beam.runners.core.construction.CoderTranslation.fromProto(CoderTranslation.java:145)
> 01:43:59   at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:87)
> 01:43:59   at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:82)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
> 01:43:59   ... 30 more
> 01:43:59 Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Encountered unsupported logical type URN: int
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
> 01:43:59   at org.apache.beam.runners.core.construction.RehydratedComponents.getCoder(RehydratedComponents.java:168)
> 01:43:59   at org.apache.beam.runners.core.construction.CoderTranslation.fromKnownCoder(CoderTranslation.java:158)
> 01:43:59   at org.apache.beam.runners.core.construction.CoderTranslation.fromProto(CoderTranslation.java:145)
> 01:43:59   at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:87)
> 01:43:59   at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:82)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
> 01:43:59   ... 42 more
> 01:43:59 Caused by: java.lang.IllegalArgumentException: Encountered unsupported logical type URN: int
> 01:43:59   at org.apache.beam.sdk.schemas.SchemaTranslation.fieldTypeFromProtoWithoutNullable(SchemaTranslation.java:328)
> 01:43:59   at org.apache.beam.sdk.schemas.SchemaTranslation.fieldTypeFromProto(SchemaTranslation.java:244)
> 01:43:59   at org.apache.beam.sdk.schemas.SchemaTranslation.fieldFromProto(SchemaTranslation.java:238)
> 01:43:59   at org.apache.beam.sdk.schemas.SchemaTranslation.schemaFromProto(SchemaTranslation.java:212)
> 01:43:59   at org.apache.beam.runners.core.construction.CoderTranslators$8.fromComponents(CoderTranslators.java:169)
> 01:43:59   at org.apache.beam.runners.core.construction.CoderTranslators$8.fromComponents(CoderTranslators.java:151)
> 01:43:59   at org.apache.beam.runners.core.construction.CoderTranslation.fromKnownCoder(CoderTranslation.java:170)
> 01:43:59   at org.apache.beam.runners.core.construction.CoderTranslation.fromProto(CoderTranslation.java:145)
> 01:43:59   at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:87)
> 01:43:59   at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:82)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
> 01:43:59   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
> 01:43:59   ... 54 more
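
For anyone triaging the leaked VMs described in the issue, the following is a minimal sketch of how the quoted name patterns could be used to pick out load-test VMs from a list of instance names and group them by cluster. It only works on name strings and does not call any GCP API. The function names are hypothetical, and the suffix rule ("-m" for the master node, "-w-<n>" for workers) is an assumption inferred from the example VM names in the description.

```python
import fnmatch
import re

# Name patterns quoted in the issue description.
LEAK_PATTERNS = ["beam-loadtests-python*flink*", "beam-loadtests-go*flink*"]

# Assumption: node names end in "-m" (master) or "-w-<n>" (worker),
# as in the example VM names listed in the issue.
NODE_SUFFIX = re.compile(r"-(m|w-\d+)$")


def is_load_test_vm(name):
    """Return True if the VM name matches one of the load-test patterns."""
    return any(fnmatch.fnmatchcase(name, p) for p in LEAK_PATTERNS)


def cluster_name(vm_name):
    """Strip the assumed node suffix to recover the cluster name."""
    return NODE_SUFFIX.sub("", vm_name)


def group_by_cluster(vm_names):
    """Group matching VM names by their inferred cluster name."""
    clusters = {}
    for name in vm_names:
        if is_load_test_vm(name):
            clusters.setdefault(cluster_name(name), []).append(name)
    return clusters
```

Calling `group_by_cluster` on a list of instance names returns a dict keyed by inferred cluster name, which could then be reviewed by hand before deleting anything.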



--
This message was sent by Atlassian Jira
(v8.20.1#820001)