You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/06/24 02:47:28 UTC

[GitHub] [iceberg] openinx opened a new issue #2730: Encounter exceptions when query the iceberg table filled with change logs by using flink SQL

openinx opened a new issue #2730:
URL: https://github.com/apache/iceberg/issues/2730


   Saying if we write few change log records ( INSERT,  DELETE, UPDATE_BEFORE, UPDATE_AFTER ) into apache iceberg table and then query it by using flink SQL, then we will encounter the exception: 
   
   ```txt
   [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source -> SinkConversionToTuple2 -> Sink: Select table sink (1/1) (03d66a7324d07ea31972ce3cc8d3f1df) switched from RUNNING to FAILED on 8e8e9f3c-67a5-4aaa-9cc8-18400ceee4a3 @ localhost (dataPort=-1).
   java.lang.IllegalArgumentException: Row arity: 4, but serializer arity: 2
   	at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:124)
   	at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:48)
   	at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:69)
   	at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:46)
   	at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:26)
   	at org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:50)
   	at org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:28)
   	at org.apache.flink.streaming.api.operators.StreamSourceContexts$ManualWatermarkContext.processAndCollect(StreamSourceContexts.java:317)
   	at org.apache.flink.streaming.api.operators.StreamSourceContexts$WatermarkContext.collect(StreamSourceContexts.java:411)
   	at org.apache.flink.streaming.api.functions.source.InputFormatSourceFunction.run(InputFormatSourceFunction.java:92)
   	at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:110)
   	at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:66)
   	at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:241)
   ```
   
   The way to reproduce this bug is quite easy ,  please just apply the following patch, and then run the unit tests  `org.apache.iceberg.flink.TestChangeLogTable.testSqlChangeLogOnIdKey`
   
   ```patch
   diff --git a/flink/src/test/java/org/apache/iceberg/flink/TestChangeLogTable.java b/flink/src/test/java/org/apache/iceberg/flink/TestChangeLogTable.java
   index d44f45ab5..f4334b43c 100644
   --- a/flink/src/test/java/org/apache/iceberg/flink/TestChangeLogTable.java
   +++ b/flink/src/test/java/org/apache/iceberg/flink/TestChangeLogTable.java
   @@ -273,6 +273,8 @@ public class TestChangeLogTable extends ChangeLogTableTestBase {
        Table table = createTable(tableName, key, partitioned);
        sql("INSERT INTO %s SELECT * FROM %s", tableName, SOURCE_TABLE);
    
   +    sql("SELECT * FROM %s", tableName);
   +
        table.refresh();
        List<Snapshot> snapshots = findValidSnapshots(table);
        int expectedSnapshotNum = expectedRecordsPerCheckpoint.size();
   ```
   
   I guess other query engines such as spark sql , presto sql, hive sql will also encounter this issue if we've write few change log events in the iceberg table.
   
   I remember there's another issue also encountered this exception ( I encountered it many times in our asia user group). Will prepare a  pull request for this.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx closed issue #2730: Encounter exceptions when query the iceberg table filled with change logs by using flink SQL

Posted by GitBox <gi...@apache.org>.
openinx closed issue #2730:
URL: https://github.com/apache/iceberg/issues/2730


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx commented on issue #2730: Encounter exceptions when query the iceberg table filled with change logs by using flink SQL

Posted by GitBox <gi...@apache.org>.
openinx commented on issue #2730:
URL: https://github.com/apache/iceberg/issues/2730#issuecomment-867293258


   The similar issue from apache iceberg issues: https://github.com/apache/iceberg/issues/2627


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx commented on issue #2730: Encounter exceptions when query the iceberg table filled with change logs by using flink SQL

Posted by GitBox <gi...@apache.org>.
openinx commented on issue #2730:
URL: https://github.com/apache/iceberg/issues/2730#issuecomment-867292964


   There's a similar issue from the apache flink user mail list: http://apache-flink.147419.n8.nabble.com/Iceberg-Upsert-Iceberg-Kafka-td11912.html#a11929


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org