Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/05/27 08:52:02 UTC

[GitHub] [iceberg] sfiend commented on issue #2627: Using Kafka to insert multiple pieces of data with the same primary key value in Iceberg at one time, the data cannot be queried

sfiend commented on issue #2627:
URL: https://github.com/apache/iceberg/issues/2627#issuecomment-849458968


   I ran into the same problem before. When multiple rows with the same primary key value are inserted in the same batch, Iceberg writes position delete files in addition to the equality delete files. Later, during the query, when FlinkInputFormat initializes the RowDataIterator and reads the next record, the iterator initializes a FlinkDeleteFilter; during that initialization, the FlinkDeleteFilter's parent class adds a column named '_pos' to the read schema if the current split has a position delete file. I think the purpose of adding this column is to apply the position delete file based on each row's position, but Iceberg did not remove it before sending the result rows to Flink.
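
   To make that concrete, here is a minimal, self-contained Java sketch of the shape of the problem as I understand it, not the actual Iceberg/Flink reader code: the delete filter widens the read schema with the reserved '_pos' metadata column, and that extra column then has to be projected away again before rows reach Flink. The class name PosColumnSketch and the helper stripTrailingPosColumn are made up for illustration; the Iceberg and Flink classes used (MetadataColumns.ROW_POSITION, TypeUtil.join, GenericRowData) exist, but the example schema and row values are assumptions.

       import org.apache.flink.table.data.GenericRowData;
       import org.apache.flink.table.data.RowData;
       import org.apache.flink.table.data.StringData;
       import org.apache.iceberg.MetadataColumns;
       import org.apache.iceberg.Schema;
       import org.apache.iceberg.types.TypeUtil;
       import org.apache.iceberg.types.Types;

       public class PosColumnSketch {

         public static void main(String[] args) {
           // Schema the Flink query actually asked for (example columns, assumed).
           Schema requestedSchema = new Schema(
               Types.NestedField.required(1, "id", Types.IntegerType.get()),
               Types.NestedField.optional(2, "data", Types.StringType.get()));

           // When a split carries position delete files, the delete filter widens the
           // read schema with the reserved _pos metadata column so it can match each
           // row's position against the delete file.
           Schema requiredSchema = TypeUtil.join(
               requestedSchema, new Schema(MetadataColumns.ROW_POSITION));

           System.out.println("requested: " + requestedSchema.asStruct());
           System.out.println("required:  " + requiredSchema.asStruct());

           // A row as produced by the widened read: (id, data, _pos).
           GenericRowData widened = GenericRowData.of(1, StringData.fromString("a"), 0L);

           // After the position deletes have been applied, the trailing _pos column
           // should be projected away again; otherwise Flink receives rows with one
           // more field than the requested schema declares.
           RowData projected = stripTrailingPosColumn(widened, requestedSchema.columns().size());
           System.out.println("projected arity: " + projected.getArity());
         }

         // Hypothetical helper: keep only the first `width` fields, dropping the
         // trailing _pos column. A real fix would live inside the Flink reader itself.
         private static RowData stripTrailingPosColumn(GenericRowData row, int width) {
           GenericRowData out = new GenericRowData(width);
           for (int i = 0; i < width; i++) {
             out.setField(i, row.getField(i));
           }
           return out;
         }
       }

   The sketch only illustrates where the projection back to the requested schema would have to happen; in the actual reader that step belongs after the delete filter has consumed the _pos values.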

