Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/04/02 11:48:23 UTC

[GitHub] [incubator-iceberg] waterlx edited a comment on issue #856: [WIP] Flink Iceberg sink

waterlx edited a comment on issue #856: [WIP] Flink Iceberg sink
URL: https://github.com/apache/incubator-iceberg/pull/856#issuecomment-607796003
 
 
   Some updates here, in case you are interested:
   1. Spectator and Metacat are removed. #867, #868 and #871 are closed as a result.
   2. Some duplicated logic has been added to determine, based on a setting, whether the catalog is a Hive Catalog or a Hadoop Catalog. I do not think it is a good fit here, but it is used to check whether the code path also works with Hadoop Catalog. I plan to remove it or figure out a better way.
   3. Working on removing the S3/AWS SDK and rewriting that part using FileIO. (#872)
   4. Once the removal work is all done, I will break the logic up into reasonable pieces so that they can be added to the code base in a review-friendly way.
   5. The PR is still NOT ready for detailed review.
   
   More on item 2 above:
   The current logic passes a few String parameters to the Flink sink to tell it the namespace, the table, and even the catalog; the sink's internal logic then creates a new Catalog and loads the table. One way to simplify this might be to pass an instance of [Table](https://github.com/apache/incubator-iceberg/blob/master/api/src/main/java/org/apache/iceberg/Table.java) instead, so that the sink does not need to care whether the Table was loaded by a Hive/Hadoop Catalog or by HadoopTables. But that idea is blocked by BaseTable not being serializable, which forces the sink to do more work itself, such as creating a Catalog and loading the table.
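
   To illustrate the trade-off, here is a minimal, self-contained sketch of one possible workaround: ship a small serializable descriptor that carries only the String parameters and loads the table lazily on the task side. It does not use the Iceberg or Flink APIs. `DemoTable` and `DemoTableLoader` are hypothetical stand-ins invented for this example, and the location string they build is made up purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class TableLoaderSketch {

  /** Hypothetical stand-in for a loaded table handle; deliberately NOT Serializable. */
  static final class DemoTable {
    private final String location;

    DemoTable(String location) {
      this.location = location;
    }

    String location() {
      return location;
    }
  }

  /** Hypothetical serializable descriptor: holds only Strings and loads the table on demand. */
  static final class DemoTableLoader implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String catalogType; // assumed values: "hive" or "hadoop"
    private final String namespace;
    private final String tableName;
    private transient DemoTable table; // transient, so it is never shipped with the loader

    DemoTableLoader(String catalogType, String namespace, String tableName) {
      this.catalogType = catalogType;
      this.namespace = namespace;
      this.tableName = tableName;
    }

    DemoTable load() {
      if (table == null) {
        // A real sink would create the catalog and load the table here.
        table = new DemoTable(catalogType + "://" + namespace + "/" + tableName);
      }
      return table;
    }
  }

  /** Simulates shipping an object to a task via Java serialization. */
  static <T extends Serializable> T roundTrip(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        @SuppressWarnings("unchecked")
        T copy = (T) in.readObject();
        return copy;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    DemoTableLoader loader = new DemoTableLoader("hadoop", "db", "events");
    loader.load(); // loading on the client side does not break serialization
    DemoTableLoader shipped = roundTrip(loader);
    System.out.println(shipped.load().location());
  }
}
```

   With this shape the sink still has to create the catalog and load the table itself, but that logic lives in one small class rather than being spread through the sink, and the sink only sees a loader.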
   
   @jerryshao @chenjunjiedada @openinx @aokolnychyi @bowenli86 @stevenzwu @rdblue FYI
   
   
