Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/06/22 19:52:27 UTC

[GitHub] [iceberg] jackye1995 opened a new issue #2724: Spark SQL extension syntax for identifier field update

jackye1995 opened a new issue #2724:
URL: https://github.com/apache/iceberg/issues/2724


   Continuing the discussion in #2560.
   
   I have currently introduced the following syntax for identifier fields:
   
   ```sql
   ALTER TABLE table SET IDENTIFIER FIELDS col1, col2, col3
   ALTER TABLE table DROP IDENTIFIER FIELDS col1,col2
   ```
   
   On reflection, the above syntax tries to mimic the Hive standard, but we are not really using that standard in the Spark extensions anyway. It might be better to make it consistent with the sort order update syntax, with something like:
   
   ```sql
   ALTER TABLE table WRITE IDENTIFIED BY col1,col2;
   ```
   
   With this syntax we would not support add/drop, to avoid confusion.
   
   Any thoughts? @rdblue @aokolnychyi @openinx 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] jackye1995 commented on issue #2724: Spark SQL extension syntax for identifier field update

Posted by GitBox <gi...@apache.org>.

jackye1995 commented on issue #2724:
URL: https://github.com/apache/iceberg/issues/2724#issuecomment-866286830


   Also @yyanyy @RussellSpitzer 



[GitHub] [iceberg] yyanyy commented on issue #2724: Spark SQL extension syntax for identifier field update

Posted by GitBox <gi...@apache.org>.

yyanyy commented on issue #2724:
URL: https://github.com/apache/iceberg/issues/2724#issuecomment-867177322


   Sorry, I didn't follow the rest of the discussion in the PR, and it's been a while since I last looked. I thought UNSET/DROP removes all identifier fields from the table, but it now seems that we can selectively drop some fields and keep others. Is this the case? I'm not sure I like the idea of dropping some but not all fields, since the user has to know what the original list was.
   
   I think `WRITE IDENTIFIED BY col1,col2;` looks good for setting and replacing identifier fields, but I wonder what the command should be if we want to drop everything. Do we plan to make it `ALTER TABLE table WRITE IDENTIFIED BY;`? That would look a bit odd, as if the command hasn't finished, and it is potentially error-prone if someone accidentally presses `;` and Enter while still coming up with the list. `UNSET`/`DROP` requires a completely different keyword, which may reduce this risk.
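
To make the two semantics discussed above concrete, here is a toy Python model. It is only a sketch and not Iceberg's implementation: the function names `write_identified_by` and `drop_identifier_fields` are hypothetical, and raising an error when a dropped field is not currently an identifier field is an assumption made for illustration.

```python
from typing import List


def write_identified_by(current: List[str], new_fields: List[str]) -> List[str]:
    """Replace semantics: the result depends only on the new list.

    The caller does not need to know what `current` contains.
    An empty `new_fields` models the "drop everything" case.
    """
    return list(new_fields)


def drop_identifier_fields(current: List[str], to_drop: List[str]) -> List[str]:
    """Selective-drop semantics: the result depends on the current list.

    Assumption: dropping a field that is not an identifier field is an error.
    """
    missing = [f for f in to_drop if f not in current]
    if missing:
        raise ValueError(f"not identifier fields: {missing}")
    return [f for f in current if f not in to_drop]


current = ["col1", "col2", "col3"]

# Replace: the outcome is fully determined by the statement itself.
replaced = write_identified_by(current, ["col1", "col2"])

# Selective drop: the outcome depends on what was there before.
remaining = drop_identifier_fields(current, ["col1", "col2"])

# Clearing everything via replace needs an empty list, which is the
# odd-looking `WRITE IDENTIFIED BY;` case raised above.
cleared = write_identified_by(current, [])
```

In this model, `replaced` is `["col1", "col2"]` regardless of the starting state, while `remaining` is `["col3"]` only because the table started with three fields, which is the "user has to know the original list" concern.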

