Posted to issues@beam.apache.org by "Reuven Lax (Jira)" <ji...@apache.org> on 2020/04/24 16:58:00 UTC

[jira] [Commented] (BEAM-9696) UnionCoder IndexOutOfBoundsException in schema-driven join transform

    [ https://issues.apache.org/jira/browse/BEAM-9696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091735#comment-17091735 ] 

Reuven Lax commented on BEAM-9696:
----------------------------------

Is there an easy repro to run?

This failure is happening deep inside CoGroupByKey. CGBK seems to have claimed that an element is from the third tagged input; however, there are only two inputs. The input index is encoded in the CGBK element using CGBK's union coder.

Looking at the code in CoGroupByKey.java, this appears to be impossible, though there may be some weird latent bug in CoGroupByKey that we have triggered. I would need a repro to debug.
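
To make the failure mode concrete, here is a minimal standalone sketch of a tagged-union encoding in which the input index is written ahead of the payload and used on decode to look up the matching input. This is not Beam's UnionCoder: the class UnionTagSketch, its fixed-width index, and the "left"/"right" labels are all hypothetical, chosen only to show how an index of 2 against a two-element list would surface as an IndexOutOfBoundsException.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a tagged-union encoding; NOT Beam's UnionCoder.
class UnionTagSketch {
  // One label per input; stands in for the per-input element coders.
  static final List<String> INPUTS = new ArrayList<>(Arrays.asList("left", "right"));

  // Writes the input index first, then the payload bytes.
  static byte[] encode(int index, String payload) {
    byte[] body = payload.getBytes(StandardCharsets.UTF_8);
    return ByteBuffer.allocate(Integer.BYTES + body.length)
        .putInt(index)
        .put(body)
        .array();
  }

  // Reads the index back and uses it to select the input.
  // An index outside [0, INPUTS.size()) fails in List.get, before the payload is read.
  static String decode(byte[] encoded) {
    ByteBuffer buf = ByteBuffer.wrap(encoded);
    int index = buf.getInt();
    String input = INPUTS.get(index); // throws IndexOutOfBoundsException for index 2
    byte[] body = new byte[buf.remaining()];
    buf.get(body);
    return input + ":" + new String(body, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // A valid index round-trips.
    System.out.println(decode(encode(1, "row")));
    // An index naming a third input cannot be decoded against two inputs.
    try {
      decode(encode(2, "row"));
    } catch (IndexOutOfBoundsException e) {
      System.out.println("decode failed: " + e.getMessage());
    }
  }
}
```

In this sketch the bad index can only come from the encoder side, which is the same reason a repro is needed here: the decoder is just reporting an index that was written earlier.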

> UnionCoder IndexOutOfBoundsException in schema-driven join transform
> --------------------------------------------------------------------
>
>                 Key: BEAM-9696
>                 URL: https://issues.apache.org/jira/browse/BEAM-9696
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql-zetasql, sdk-java-core
>            Reporter: Andrew Pilloud
>            Assignee: Reuven Lax
>            Priority: Minor
>              Labels: zetasql-compliance
>
> one failure in shard 17
> {code}
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
> 	at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:348)
> 	at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:318)
> 	at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:213)
> 	at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
> 	at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
> 	at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303)
> 	at org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.runCollector(BeamEnumerableConverter.java:201)
> 	at org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.collectRows(BeamEnumerableConverter.java:218)
> 	at org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.toRowList(BeamEnumerableConverter.java:150)
> 	at org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.toRowList(BeamEnumerableConverter.java:127)
> 	at cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl.executeQuery(ExecuteQueryServiceServer.java:329)
> 	at com.google.zetasql.testing.SqlComplianceServiceGrpc$MethodHandlers.invoke(SqlComplianceServiceGrpc.java:423)
> 	at com.google.zetasql.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
> 	at com.google.zetasql.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
> 	at com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:711)
> 	at com.google.zetasql.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
> 	at com.google.zetasql.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
> 	at java.util.ArrayList.rangeCheck(ArrayList.java:658)
> 	at java.util.ArrayList.get(ArrayList.java:434)
> 	at org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:83)
> 	at org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:32)
> 	at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82)
> 	at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36)
> 	at org.apache.beam.sdk.util.CoderUtils.decodeFromSafeStream(CoderUtils.java:115)
> 	at org.apache.beam.sdk.util.CoderUtils.decodeFromByteArray(CoderUtils.java:98)
> 	at org.apache.beam.sdk.util.CoderUtils.decodeFromByteArray(CoderUtils.java:92)
> 	at org.apache.beam.sdk.util.CoderUtils.clone(CoderUtils.java:141)
> 	at org.apache.beam.sdk.util.MutationDetectors$CodedValueMutationDetector.<init>(MutationDetectors.java:115)
> 	at org.apache.beam.sdk.util.MutationDetectors.forValueWithCoder(MutationDetectors.java:46)
> 	at org.apache.beam.runners.direct.ImmutabilityCheckingBundleFactory$ImmutabilityEnforcingBundle.add(ImmutabilityCheckingBundleFactory.java:112)
> 	at org.apache.beam.runners.direct.ParDoEvaluator$BundleOutputManager.output(ParDoEvaluator.java:299)
> 	at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:259)
> 	at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner.access$800(SimpleDoFnRunner.java:79)
> 	at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:405)
> 	at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:393)
> 	at org.apache.beam.sdk.transforms.join.CoGroupByKey$ConstructUnionTableFn.processElement(CoGroupByKey.java:175)
> {code}
> {code}
> Apr 01, 2020 6:05:14 PM cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl executeQuery
> INFO: Processing Sql statement: SELECT 1, 0 UNION DISTINCT
> SELECT 1, NULL UNION DISTINCT
> SELECT 1, NULL UNION DISTINCT
> SELECT 2, NULL UNION DISTINCT
> SELECT NULL, 0 UNION DISTINCT
> SELECT NULL, 0 UNION DISTINCT
> SELECT NULL, 1 UNION DISTINCT
> SELECT NULL, NULL UNION DISTINCT
> SELECT NULL, NULL
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)