You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/01/06 23:38:00 UTC

[jira] [Work logged] (BEAM-13577) Beam Select's uniquifyNames function loses nullability of Complex types while inferring schema

     [ https://issues.apache.org/jira/browse/BEAM-13577?focusedWorklogId=704834&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-704834 ]

ASF GitHub Bot logged work on BEAM-13577:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Jan/22 23:37
            Start Date: 06/Jan/22 23:37
    Worklog Time Spent: 10m 
      Work Description: ibzib commented on pull request #16380:
URL: https://github.com/apache/beam/pull/16380#issuecomment-1007014608


   Andrew is temporarily on leave, so I'll take this one.
   
   R: @ibzib 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 704834)
    Time Spent: 40m  (was: 0.5h)

> Beam Select's uniquifyNames function loses nullability of Complex types while inferring schema 
> -----------------------------------------------------------------------------------------------
>
>                 Key: BEAM-13577
>                 URL: https://issues.apache.org/jira/browse/BEAM-13577
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql, sdk-java-core
>    Affects Versions: 2.32.0, 2.33.0, 2.34.0
>            Reporter: Talat Uyarer
>            Priority: P2
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> We use BeamSQL in our project. When we use any JOIN. SQL generates BeamCoGBKJoinRel plan which uses Select from core sdk. While Select infer output schema it loses nullability of complex types such as Array, Map. You can see an example error. 
> {code:java}
> INFO: SQL:
> SELECT `o1`.`order_id`, `o1`.`site_id`, `o1`.`price`, `o1`.`f_stringArr`, `o2`.`order_id` AS `order_id0`, `o2`.`site_id` AS `site_id0`, `o2`.`price` AS `price0`
> FROM `beam`.`ORDER_DETAILS1_WITH_ARRAY` AS `o1`
> INNER JOIN `beam`.`ORDER_DETAILS2` AS `o2` ON `o1`.`order_id` = `o2`.`site_id` AND `o2`.`price` = `o1`.`site_id`
> Dec 28, 2021 1:20:14 PM org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner convertToBeamRel
> INFO: SQLPlan>
> LogicalProject(order_id=[$0], site_id=[$1], price=[$2], f_stringArr=[$3], order_id0=[$4], site_id0=[$5], price0=[$6])
>   LogicalJoin(condition=[AND(=($0, $5), =($6, $1))], joinType=[inner])
>     BeamIOSourceRel(table=[[beam, ORDER_DETAILS1_WITH_ARRAY]])
>     BeamIOSourceRel(table=[[beam, ORDER_DETAILS2]])Dec 28, 2021 1:20:14 PM org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner convertToBeamRel
> INFO: BEAMPlan>
> BeamCoGBKJoinRel(condition=[AND(=($0, $5), =($6, $1))], joinType=[inner])
>   BeamIOSourceRel(table=[[beam, ORDER_DETAILS1_WITH_ARRAY]])
>   BeamIOSourceRel(table=[[beam, ORDER_DETAILS2]])
> Types not equal. provided output schema: Fields:
> Field{name=order_id, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=site_id, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=price, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=f_stringArr, description=, type=ARRAY<STRING NOT NULL>, options={{}}}
> Field{name=order_id0, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=site_id0, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=price0, description=, type=INT32 NOT NULL, options={{}}}
> Encoding positions:
> {f_stringArr=3, price0=6, price=2, site_id=1, order_id0=4, order_id=0, site_id0=5}
> Options:{{}}UUID: null Schema inferred from select: Fields:
> Field{name=317499b1-9c8a-4bb9-8897-4ecda110d02a, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=46249f84-b89e-439d-b799-a039b427a60a, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=565d397c-d36c-4387-b2e4-5d6402c839bd, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=bbe10404-3c44-41a4-b942-16f484211dab, description=, type=ARRAY<STRING NOT NULL> NOT NULL, options={{}}}
> Field{name=bd3e3adf-ae12-4155-9770-b0123c8bb18c, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=2b2a14ed-dcd6-45d5-b49d-ae8d3037d6c6, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=60a346c5-fa18-40e1-819f-c06af92ff033, description=, type=INT32 NOT NULL, options={{}}}
> Encoding positions:
> {2b2a14ed-dcd6-45d5-b49d-ae8d3037d6c6=5, 60a346c5-fa18-40e1-819f-c06af92ff033=6, 565d397c-d36c-4387-b2e4-5d6402c839bd=2, bbe10404-3c44-41a4-b942-16f484211dab=3, bd3e3adf-ae12-4155-9770-b0123c8bb18c=4, 46249f84-b89e-439d-b799-a039b427a60a=1, 317499b1-9c8a-4bb9-8897-4ecda110d02a=0}
> Options:{{}}UUID: null from input type: Fields:
> Field{name=lhs, description=, type=ROW<order_id INT32 NOT NULL, site_id INT32 NOT NULL, price INT32 NOT NULL, f_stringArr ARRAY<STRING NOT NULL>> NOT NULL, options={{}}}
> Field{name=rhs, description=, type=ROW<order_id INT32 NOT NULL, site_id INT32 NOT NULL, price INT32 NOT NULL> NOT NULL, options={{}}}
> Encoding positions:
> {lhs=0, rhs=1}
> Options:{{}}UUID: a35cf07b-2bc1-48b8-b229-3c2368993738
> java.lang.IllegalArgumentException: Types not equal. provided output schema: Fields:
> Field{name=order_id, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=site_id, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=price, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=f_stringArr, description=, type=ARRAY<STRING NOT NULL>, options={{}}}
> Field{name=order_id0, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=site_id0, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=price0, description=, type=INT32 NOT NULL, options={{}}}
> Encoding positions:
> {f_stringArr=3, price0=6, price=2, site_id=1, order_id0=4, order_id=0, site_id0=5}
> Options:{{}}UUID: null Schema inferred from select: Fields:
> Field{name=317499b1-9c8a-4bb9-8897-4ecda110d02a, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=46249f84-b89e-439d-b799-a039b427a60a, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=565d397c-d36c-4387-b2e4-5d6402c839bd, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=bbe10404-3c44-41a4-b942-16f484211dab, description=, type=ARRAY<STRING NOT NULL> NOT NULL, options={{}}}
> Field{name=bd3e3adf-ae12-4155-9770-b0123c8bb18c, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=2b2a14ed-dcd6-45d5-b49d-ae8d3037d6c6, description=, type=INT32 NOT NULL, options={{}}}
> Field{name=60a346c5-fa18-40e1-819f-c06af92ff033, description=, type=INT32 NOT NULL, options={{}}}
> Encoding positions:
> {2b2a14ed-dcd6-45d5-b49d-ae8d3037d6c6=5, 60a346c5-fa18-40e1-819f-c06af92ff033=6, 565d397c-d36c-4387-b2e4-5d6402c839bd=2, bbe10404-3c44-41a4-b942-16f484211dab=3, bd3e3adf-ae12-4155-9770-b0123c8bb18c=4, 46249f84-b89e-439d-b799-a039b427a60a=1, 317499b1-9c8a-4bb9-8897-4ecda110d02a=0}
> Options:{{}}UUID: null from input type: Fields:
> Field{name=lhs, description=, type=ROW<order_id INT32 NOT NULL, site_id INT32 NOT NULL, price INT32 NOT NULL, f_stringArr ARRAY<STRING NOT NULL>> NOT NULL, options={{}}}
> Field{name=rhs, description=, type=ROW<order_id INT32 NOT NULL, site_id INT32 NOT NULL, price INT32 NOT NULL> NOT NULL, options={{}}}
> Encoding positions:
> {lhs=0, rhs=1}
> Options:{{}}UUID: a35cf07b-2bc1-48b8-b229-3c2368993738
>     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
>     at org.apache.beam.sdk.schemas.transforms.Select$Fields.expand(Select.java:205)
>     at org.apache.beam.sdk.schemas.transforms.Select$Fields.expand(Select.java:157)
>     at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548)
>     at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:482)
>     at org.apache.beam.sdk.values.PCollection.apply(PCollection.java:363)
>     at org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRel.standardJoin(BeamCoGBKJoinRel.java:196)
>     at org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRel.access$400(BeamCoGBKJoinRel.java:75)
>     at org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRel$StandardJoin.expand(BeamCoGBKJoinRel.java:135)
>     at org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRel$StandardJoin.expand(BeamCoGBKJoinRel.java:93)
>     at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548)
>     at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:499)
>     at org.apache.beam.sdk.extensions.sql.impl.rel.BeamSqlRelUtils.toPCollection(BeamSqlRelUtils.java:72)
>     at org.apache.beam.sdk.extensions.sql.impl.rel.BeamSqlRelUtils.toPCollection(BeamSqlRelUtils.java:42)
>     at org.apache.beam.sdk.extensions.sql.impl.rel.BaseRelTest.compilePipeline(BaseRelTest.java:34)
>     at org.apache.beam.sdk.extensions.sql.impl.rel.BeamCoGBKJoinRelBoundedVsBoundedTest.testInnerJoin(BeamCoGBKJoinRelBoundedVsBoundedTest.java:83)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:322)
>     at org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>     at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>     at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>     at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>     at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>     at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
>     at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
>     at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>     at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
>     at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
>     at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>     at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:119)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
>     at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>     at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:182)
>     at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:164)
>     at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:414)
>     at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
>     at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:48)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:56)
>     at java.base/java.lang.Thread.run(Thread.java:829) {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)