Posted to commits@seatunnel.apache.org by GitBox <gi...@apache.org> on 2022/03/08 07:51:20 UTC

[GitHub] [incubator-seatunnel] ououtt opened a new pull request #1439: add iceberg spark batch sink

ououtt opened a new pull request #1439:
URL: https://github.com/apache/incubator-seatunnel/pull/1439


   #791 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-seatunnel] ououtt commented on pull request #1439: add iceberg spark batch sink

Posted by GitBox <gi...@apache.org>.
ououtt commented on pull request #1439:
URL: https://github.com/apache/incubator-seatunnel/pull/1439#issuecomment-1061757510


   @wuchunfu Hi. PTAL thx





[GitHub] [incubator-seatunnel] wuchunfu commented on a change in pull request #1439: add iceberg spark batch sink

Posted by GitBox <gi...@apache.org>.
wuchunfu commented on a change in pull request #1439:
URL: https://github.com/apache/incubator-seatunnel/pull/1439#discussion_r822216094



##########
File path: docs/en/spark/configuration/sink-plugins/Iceberg.md
##########
@@ -0,0 +1,62 @@
+# Iceberg
+
+> Sink plugin: Iceberg [Spark]
+
+## Description
+
+Write data to Iceberg.
+
+## Options
+
+| name           | type   | required | default value |
+| -------------- | ------ | -------- | ------------- |
+| [path](#path)  | string | yes      | -             |
+| [saveMode](#saveMode) | string | yes | -             |
+| [target-file-size-bytes](#target-file-size-bytes) | long | no      | -   |
+| [check-nullability](#check-nullability) | bool | no| - |
+| [snapshot-property.custom-key](#snapshot-property.custom-key) | string | no| - |
+| [fanout-enabled](#fanout-enabled) | bool | no| - |
+| [check-ordering](#check-ordering) | bool | no| - |
+
+
+Refer to [iceberg write options](https://iceberg.apache.org/docs/latest/spark-configuration/) for more configurations.
+
+### path
+
+Iceberg table location.
+
+### saveMode
+

Review comment:
       The `Options` table above should be aligned neatly.







[GitHub] [incubator-seatunnel] wuchunfu commented on a change in pull request #1439: add iceberg spark batch sink

Posted by GitBox <gi...@apache.org>.
wuchunfu commented on a change in pull request #1439:
URL: https://github.com/apache/incubator-seatunnel/pull/1439#discussion_r822215517



##########
File path: docs/en/spark/configuration/sink-plugins/Iceberg.md
##########

Review comment:
       @ououtt I think `saveMode` should have a default value.

##########
File path: docs/en/spark/configuration/sink-plugins/Iceberg.md
##########

Review comment:
       Are these two the only `saveMode` values? I suggest adding a link to the `saveMode` documentation afterwards.







[GitHub] [incubator-seatunnel] liujinhui1994 commented on pull request #1439: [Feature][Connector] Add iceberg spark batch sink

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #1439:
URL: https://github.com/apache/incubator-seatunnel/pull/1439#issuecomment-1067502591


   @ououtt @wuchunfu SeaTunnel currently uses Spark 2.4.
   Will this PR work? As I understand it, Iceberg can only read and write existing tables with this version.





[GitHub] [incubator-seatunnel] liujinhui1994 commented on pull request #1439: [Feature][Connector] Add iceberg spark batch sink

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #1439:
URL: https://github.com/apache/incubator-seatunnel/pull/1439#issuecomment-1067516060


   @ououtt If the table has to be created ahead of time, doesn't that limit the usefulness of seatunnel-iceberg?
   I think the code needs to call an API that creates the table.
   At the very least, the sink should do this, shouldn't it?
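
The create-before-write behavior suggested here can be sketched as follows. This is only an illustration of the pattern, not SeaTunnel or Iceberg code: `InMemoryCatalog`, `write_batch`, and the `create_if_missing` flag are hypothetical names, and a dictionary stands in for real table storage.

```python
# Hypothetical in-memory stand-in for a table catalog; this is NOT the Iceberg API.
class InMemoryCatalog:
    def __init__(self):
        self.tables = {}

    def table_exists(self, path):
        return path in self.tables

    def create_table(self, path, schema):
        self.tables[path] = {"schema": list(schema), "rows": []}


def write_batch(catalog, path, schema, rows, create_if_missing=True):
    """Append rows to the table at `path`, creating it first if allowed."""
    if not catalog.table_exists(path):
        if not create_if_missing:
            # Mirrors the current behavior discussed in this thread:
            # the table has to exist before the sink runs.
            raise LookupError(f"table at {path!r} must be created in advance")
        catalog.create_table(path, schema)
    catalog.tables[path]["rows"].extend(rows)
    # Return the total row count after the append.
    return len(catalog.tables[path]["rows"])
```

With `create_if_missing=False` the sketch fails fast on a missing table, which matches the "create the table in advance" requirement; with the default it creates the table on first write.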





[GitHub] [incubator-seatunnel] ououtt commented on a change in pull request #1439: add iceberg spark batch sink

Posted by GitBox <gi...@apache.org>.
ououtt commented on a change in pull request #1439:
URL: https://github.com/apache/incubator-seatunnel/pull/1439#discussion_r822251173



##########
File path: docs/en/spark/configuration/sink-plugins/Iceberg.md
##########

Review comment:
       Iceberg only supports `append` and `overwrite`. I have set the default value of `saveMode` to `append` and formatted the options table.
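
The `saveMode` handling described in this comment could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the code from the PR: `resolve_save_mode` and the two constants are hypothetical names, and the case-insensitive matching is an assumption rather than confirmed plugin behavior.

```python
# Hypothetical helper, not SeaTunnel code: resolve the sink's saveMode option.
ALLOWED_SAVE_MODES = ("append", "overwrite")  # the two modes named in this thread
DEFAULT_SAVE_MODE = "append"                  # default chosen in this PR


def resolve_save_mode(config):
    """Return the validated saveMode, falling back to the default when unset."""
    # Assumption: values are matched case-insensitively.
    mode = str(config.get("saveMode", DEFAULT_SAVE_MODE)).lower()
    if mode not in ALLOWED_SAVE_MODES:
        raise ValueError(
            f"saveMode must be one of {ALLOWED_SAVE_MODES}, got {mode!r}"
        )
    return mode
```

A config that omits `saveMode` resolves to `append`, and any value other than the two supported modes is rejected before a write is attempted.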







[GitHub] [incubator-seatunnel] wuchunfu merged pull request #1439: [Feature][Connector] Add iceberg spark batch sink

Posted by GitBox <gi...@apache.org>.
wuchunfu merged pull request #1439:
URL: https://github.com/apache/incubator-seatunnel/pull/1439


   





[GitHub] [incubator-seatunnel] ououtt commented on pull request #1439: [Feature][Connector] Add iceberg spark batch sink

Posted by GitBox <gi...@apache.org>.
ououtt commented on pull request #1439:
URL: https://github.com/apache/incubator-seatunnel/pull/1439#issuecomment-1067514805


   > @ououtt @wuchunfu SeaTunnel currently uses Spark 2.4. Will this PR work? As I understand it, Iceberg can only read and write existing tables with this version.
   
   Yes, the table needs to be created in advance. My local test against HDFS ran without problems.





[GitHub] [incubator-seatunnel] ououtt commented on pull request #1439: [Feature][Connector] Add iceberg spark batch sink

Posted by GitBox <gi...@apache.org>.
ououtt commented on pull request #1439:
URL: https://github.com/apache/incubator-seatunnel/pull/1439#issuecomment-1067751179


   > @ououtt If the table has to be created ahead of time, doesn't that limit the usefulness of seatunnel-iceberg? I think the code needs to call an API that creates the table. At the very least, the sink should do this, shouldn't it?
   
   Sorry, I didn't know it was necessary to create a table.

