Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/03/30 12:19:19 UTC

[GitHub] [iceberg] pvary opened a new pull request #2394: Doc: Document non-atomicity of Hive multi-table inserts

pvary opened a new pull request #2394:
URL: https://github.com/apache/iceberg/pull/2394


   As discussed in #2228, we should document that Hive multi-table inserts are atomic only at the table level.
   
   @massdosage, @rdblue, @RussellSpitzer, @jackye1995 could one of you please check my English?
   
   Thanks,
   Peter


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2394: Doc: Document non-atomicity of Hive multi-table inserts

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2394:
URL: https://github.com/apache/iceberg/pull/2394#discussion_r604306703



##########
File path: site/docs/hive.md
##########
@@ -121,3 +121,11 @@ Columns from the Hive SQL `SELECT` clause are projected down to the Iceberg read
 
 #### Hive Query Engines
 Both the Map Reduce and Tez query execution engines are supported.
+
+#### Hive Multi-table inserts
+It is possible to issue inserts targeting multiple Hive tables backed by Iceberg. The users should be aware that the commits are atomic only on table level, and the commit is not atomic on query level. The commits to Iceberg tables happen 1-by-1, and if one of the commit fails the other writes will not be rolled back. Example:

Review comment:
       I'd move "Example" to another line and give a bit more context, like "Here is an example of inserting into multiple tables at once in Hive SQL:".






[GitHub] [iceberg] massdosage commented on a change in pull request #2394: Doc: Document non-atomicity of Hive multi-table inserts

Posted by GitBox <gi...@apache.org>.
massdosage commented on a change in pull request #2394:
URL: https://github.com/apache/iceberg/pull/2394#discussion_r604782225



##########
File path: site/docs/hive.md
##########
@@ -121,3 +121,11 @@ Columns from the Hive SQL `SELECT` clause are projected down to the Iceberg read
 
 #### Hive Query Engines
 Both the Map Reduce and Tez query execution engines are supported.
+
+#### Hive Multi-table inserts
+It is possible to issue inserts targeting multiple Hive tables backed by Iceberg. The users should be aware that the commits are atomic only on table level, and the commit is not atomic on query level. The commits to Iceberg tables happen 1-by-1, and if one of the commit fails the other writes will not be rolled back. Example:

Review comment:
       ```suggestion
   It is possible to issue inserts targeting multiple Hive tables backed by Iceberg. Users should be aware that the commits are atomic only at the _table_ level and not at the _query_ level. The commits to Iceberg tables happen 1-by-1, and if one of the commits fails the other writes will not be rolled back. Example:
   ```
   
   (or go with @RussellSpitzer's  rewording above)






[GitHub] [iceberg] marton-bod commented on a change in pull request #2394: Doc: Document non-atomicity of Hive multi-table inserts

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2394:
URL: https://github.com/apache/iceberg/pull/2394#discussion_r604078081



##########
File path: site/docs/hive.md
##########
@@ -121,3 +121,11 @@ Columns from the Hive SQL `SELECT` clause are projected down to the Iceberg read
 
 #### Hive Query Engines
 Both the Map Reduce and Tez query execution engines are supported.
+
+#### Hive Multi-table inserts
+It is possible to issue inserts targeting multiple Hive tables backed by Iceberg. The users should be aware that the commits are atomic only on table level, and the commit is not atomic on query level. The commits to Iceberg tables happen 1-by-1, and if one of the commit fails the other writes will not be rolled back. Example:

Review comment:
       small typo: `if one of the commits fails`






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #2394: Doc: Document non-atomicity of Hive multi-table inserts

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #2394:
URL: https://github.com/apache/iceberg/pull/2394#discussion_r604108412



##########
File path: site/docs/hive.md
##########
@@ -121,3 +121,11 @@ Columns from the Hive SQL `SELECT` clause are projected down to the Iceberg read
 
 #### Hive Query Engines
 Both the Map Reduce and Tez query execution engines are supported.
+
+#### Hive Multi-table inserts
+It is possible to issue inserts targeting multiple Hive tables backed by Iceberg. The users should be aware that the commits are atomic only on table level, and the commit is not atomic on query level. The commits to Iceberg tables happen 1-by-1, and if one of the commit fails the other writes will not be rolled back. Example:

Review comment:
       I have a small suggestion here, but feel free to ignore it; it is just a rewording.
   
   Instead of "The users should be"
   
   "Multi-table inserts will not be atomic and are committed one table at a time.  Partial changes will be visible during the commit process and failures can leave partial changes committed. Changes within a single table will remain atomic."
   
   But feel free to ignore. I just wanted to move the important part closer to the beginning of the paragraph.
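
The behavior described in the suggested wording can be sketched with a toy Python model. This is not Hive or Iceberg code: `ToyTable`, `CommitFailed`, and `multi_table_insert` are hypothetical names introduced only for illustration, and the table names and rows are made up. The sketch assumes each table commit is all-or-nothing and that commits run one table at a time.

```python
class CommitFailed(Exception):
    """Raised when a single table's commit cannot be applied."""


class ToyTable:
    """Hypothetical stand-in for a table whose individual commits are atomic."""

    def __init__(self, name, fail_on_commit=False):
        self.name = name
        self.rows = []
        self.fail_on_commit = fail_on_commit

    def commit(self, new_rows):
        if self.fail_on_commit:
            raise CommitFailed(self.name)
        # Replace the row list in one step so a partial batch is never visible.
        self.rows = self.rows + list(new_rows)


def multi_table_insert(writes):
    """Commit each (table, rows) pair in order; earlier commits are not rolled back."""
    committed = []
    for table, rows in writes:
        table.commit(rows)
        committed.append(table.name)
    return committed


target1 = ToyTable("target1")
target2 = ToyTable("target2", fail_on_commit=True)  # simulate a failing commit

failed_table = None
try:
    multi_table_insert([(target1, [(1, "Ann")]), (target2, [("Lee", 1)])])
except CommitFailed as err:
    failed_table = str(err)

# target1 keeps its committed rows even though the overall insert failed.
print(target1.rows, target2.rows, failed_table)
```

In this model the failure of the second commit leaves the first table's change in place, which is the partial state the wording warns about.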






[GitHub] [iceberg] rdblue merged pull request #2394: Doc: Document non-atomicity of Hive multi-table inserts

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #2394:
URL: https://github.com/apache/iceberg/pull/2394


   




[GitHub] [iceberg] pvary commented on a change in pull request #2394: Doc: Document non-atomicity of Hive multi-table inserts

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2394:
URL: https://github.com/apache/iceberg/pull/2394#discussion_r604821455



##########
File path: site/docs/hive.md
##########
@@ -121,3 +121,11 @@ Columns from the Hive SQL `SELECT` clause are projected down to the Iceberg read
 
 #### Hive Query Engines
 Both the Map Reduce and Tez query execution engines are supported.
+
+#### Hive Multi-table inserts
+It is possible to issue inserts targeting multiple Hive tables backed by Iceberg. The users should be aware that the commits are atomic only on table level, and the commit is not atomic on query level. The commits to Iceberg tables happen 1-by-1, and if one of the commit fails the other writes will not be rolled back. Example:

Review comment:
       Fixed






[GitHub] [iceberg] rdblue commented on pull request #2394: Doc: Document non-atomicity of Hive multi-table inserts

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2394:
URL: https://github.com/apache/iceberg/pull/2394#issuecomment-811294184


   Merged! Thanks, @pvary, and thanks to everyone for reviewing!




[GitHub] [iceberg] pvary commented on a change in pull request #2394: Doc: Document non-atomicity of Hive multi-table inserts

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2394:
URL: https://github.com/apache/iceberg/pull/2394#discussion_r604821718



##########
File path: site/docs/hive.md
##########
@@ -121,3 +121,11 @@ Columns from the Hive SQL `SELECT` clause are projected down to the Iceberg read
 
 #### Hive Query Engines
 Both the Map Reduce and Tez query execution engines are supported.
+
+#### Hive Multi-table inserts
+It is possible to issue inserts targeting multiple Hive tables backed by Iceberg. The users should be aware that the commits are atomic only on table level, and the commit is not atomic on query level. The commits to Iceberg tables happen 1-by-1, and if one of the commit fails the other writes will not be rolled back. Example:

Review comment:
       This seems like a much better description than mine. Updated the PR.
   
   Thanks,
   Peter






[GitHub] [iceberg] pvary commented on pull request #2394: Doc: Document non-atomicity of Hive multi-table inserts

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2394:
URL: https://github.com/apache/iceberg/pull/2394#issuecomment-814736727


   Thanks for the reviews and the merge @rdblue, @RussellSpitzer, @massdosage, @marton-bod!




[GitHub] [iceberg] pvary commented on a change in pull request #2394: Doc: Document non-atomicity of Hive multi-table inserts

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2394:
URL: https://github.com/apache/iceberg/pull/2394#discussion_r604822026



##########
File path: site/docs/hive.md
##########
@@ -121,3 +121,11 @@ Columns from the Hive SQL `SELECT` clause are projected down to the Iceberg read
 
 #### Hive Query Engines
 Both the Map Reduce and Tez query execution engines are supported.
+
+#### Hive Multi-table inserts
+It is possible to issue inserts targeting multiple Hive tables backed by Iceberg. The users should be aware that the commits are atomic only on table level, and the commit is not atomic on query level. The commits to Iceberg tables happen 1-by-1, and if one of the commit fails the other writes will not be rolled back. Example:

Review comment:
       Went with @RussellSpitzer's suggestion. To me it seems much cleaner than my version.



