Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/08/07 07:44:47 UTC

[GitHub] [iceberg] openinx opened a new issue #1305: Flink: Refactor to replace Row type with RowData type in write path.

openinx opened a new issue #1305:
URL: https://github.com/apache/iceberg/issues/1305


   We have upgraded Flink to 1.11, and Flink 1.11 has switched its row data type from `Row` to `RowData`. The Parquet/Avro readers and writers we developed earlier were based on the `Row` type. Now @JingsongLi has contributed the `RowData` Avro reader and writer (https://github.com/apache/iceberg/pull/1232), @chenjunjiedada is contributing the `RowData` Parquet reader (https://github.com/apache/iceberg/pull/1266) and writer (https://github.com/apache/iceberg/pull/1272), and I've pushed a `RowData` ORC reader and writer (https://github.com/apache/iceberg/pull/1255) for review.
   
   IMO, we should replace `Row` with `RowData` in the Flink module as soon as possible, so that we can unify all the code paths and put all our resources (both development and review) on the `RowData` path. My plan is:
   
   1.  The patch for the Flink IcebergStreamWriter (https://github.com/apache/iceberg/pull/1145) has been reviewed and is ready to merge, so we get it into the master branch first.
   2.  The Flink TaskWriter unit tests currently run against a `Row` partition key, so before switching to `RowData` we need to implement a `RowData` partition key. I prepared the `RowDataWrapper` patch (https://github.com/apache/iceberg/pull/1299) for this. Getting it merged is the second step.
   3.  We will need an extra patch that does the refactoring to replace every use of the `Row` type with `RowData` (I have implemented one in my own branch https://github.com/apache/iceberg/commit/2af37c53fd36639ba41aebd362f379c7f5451ed1) and makes sure all the unit tests pass. From that point on, all Flink development and unit tests will use `RowData`.
   4.  The upcoming `RowData` Parquet/ORC readers and writers will be added to the `TaskWriter` tests.
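
   A rough sketch of the wrapper idea behind step 2, for illustration only. `EngineRow` and `StructView` are hypothetical stand-ins, not the real Flink `RowData` or Iceberg interfaces, and the per-field converters are an assumption about how engine-internal values could be adapted; the actual `RowDataWrapper` lives in the pull request linked above and may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for an engine row such as Flink's RowData (not the real interface).
interface EngineRow {
    Object field(int pos);
}

// Hypothetical stand-in for the struct view that a partition key reads from.
interface StructView {
    int size();
    <T> T get(int pos, Class<T> type);
}

// Adapts an EngineRow to a StructView, converting engine-internal values on access.
final class RowWrapperSketch implements StructView {
    private final List<Function<Object, Object>> converters;
    private EngineRow row;

    RowWrapperSketch(List<Function<Object, Object>> converters) {
        this.converters = converters;
    }

    // Re-point the wrapper at another row, so one wrapper instance serves every record.
    RowWrapperSketch wrap(EngineRow newRow) {
        this.row = newRow;
        return this;
    }

    @Override
    public int size() {
        return converters.size();
    }

    @Override
    public <T> T get(int pos, Class<T> type) {
        Object raw = row.field(pos);
        return raw == null ? null : type.cast(converters.get(pos).apply(raw));
    }

    // Builds a toy partition path from an int field and a UTF-8 encoded string field.
    static String partitionPath(StructView struct) {
        return "id=" + struct.get(0, Integer.class) + "/day=" + struct.get(1, String.class);
    }

    // Wraps two rows in turn with the same wrapper and returns both partition paths.
    static String demo() {
        List<Function<Object, Object>> converters = List.of(
            Function.identity(),
            v -> new String((byte[]) v, StandardCharsets.UTF_8));
        RowWrapperSketch wrapper = new RowWrapperSketch(converters);

        Object[] first = {7, "2020-08-07".getBytes(StandardCharsets.UTF_8)};
        Object[] second = {8, "2020-08-08".getBytes(StandardCharsets.UTF_8)};

        String a = partitionPath(wrapper.wrap(pos -> first[pos]));
        String b = partitionPath(wrapper.wrap(pos -> second[pos]));
        return a + "," + b;
    }
}
```

   The point of the sketch is that the partition-key code only ever sees the struct view, so moving the tests from one row type to another would mean swapping the wrapper rather than the key logic.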
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx closed issue #1305: Flink: Refactor to replace Row type with RowData type in write path.

Posted by GitBox <gi...@apache.org>.
openinx closed issue #1305:
URL: https://github.com/apache/iceberg/issues/1305


   

