Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/12/11 06:49:01 UTC

[GitHub] [iceberg] ajantha-bhat opened a new pull request #3718: Spark: Support passing specific list of snapshot id for ExpireSnapshots procedure

ajantha-bhat opened a new pull request #3718:
URL: https://github.com/apache/iceberg/pull/3718


   The ExpireSnapshots action supports expiring specific snapshot IDs, but the call procedure does not expose this option. Hence this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on pull request #3718: Spark: Support passing specific list of snapshot id for ExpireSnapshots procedure

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on pull request #3718:
URL: https://github.com/apache/iceberg/pull/3718#issuecomment-997145715


   I think I'm on the side of "the fewer options exposed, the better." Since SQL is the "easy" first use api I would try to hide features which may cause undesired effects. I'm not sure it really hinders anyone who specifically wants to expire certain snapshots, since they would already need to be very careful.
   
   For example, I assume Nessie would use the underlying API with a separate interface for Nessie users, since actually determining the list of snapshots that can be removed will be non-trivial.




[GitHub] [iceberg] rdblue commented on pull request #3718: Spark: Support passing specific list of snapshot id for ExpireSnapshots procedure

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3718:
URL: https://github.com/apache/iceberg/pull/3718#issuecomment-997101447


   I realize that this is already exposed through the API. But is there a need for SQL users to call this? Do you have people that need to do this or are you adding it because it is something exposed in Actions that is not present in the stored procedure?




[GitHub] [iceberg] jackye1995 commented on pull request #3718: Spark: Support passing specific list of snapshot id for ExpireSnapshots procedure

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on pull request #3718:
URL: https://github.com/apache/iceberg/pull/3718#issuecomment-999015655


   > Since SQL is the "easy" first use api I would try to hide features which may cause undesired effects.
   
   +1. Especially as we are adding the branching feature, expiring a specific snapshot by ID should be strongly discouraged.




[GitHub] [iceberg] rdblue closed pull request #3718: Spark: Support passing specific list of snapshot id for ExpireSnapshots procedure

Posted by GitBox <gi...@apache.org>.
rdblue closed pull request #3718:
URL: https://github.com/apache/iceberg/pull/3718


   




[GitHub] [iceberg] ajantha-bhat commented on pull request #3718: Spark: Support passing specific list of snapshot id for ExpireSnapshots procedure

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on pull request #3718:
URL: https://github.com/apache/iceberg/pull/3718#issuecomment-997128972


   @rdblue: Nessie is planning to use the expireSnapshots action with a specific list of snapshot IDs to expire, because if we don't set that list, the action will also expire the table's snapshots that are referenced from other branches.
   
   We don't have any plans for using SQL yet; it is just my own initiative to make the SQL procedure match the Actions, so that users can use it with all the functionality. I think these changes will help users.
   @rymurr, @RussellSpitzer, @jackye1995: What is your opinion on the changes in this PR?
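
The branch concern described above can be sketched with a toy model. This is an illustrative sketch only, not Iceberg or Nessie code: the function names, snapshot IDs, timestamps, and branch names are all hypothetical, and the age-only rule is a simplification of what the real action does. It contrasts an age-only expiry, which knows about a single current snapshot, with an explicit ID list computed by a caller that knows which snapshots other branches still point at.

```python
from typing import Dict, List


def expire_by_age(timestamps: Dict[int, int], current_id: int, cutoff_ms: int) -> List[int]:
    """Age-only policy: every snapshot older than the cutoff, except the current one."""
    return sorted(
        sid for sid, ts in timestamps.items()
        if ts < cutoff_ms and sid != current_id
    )


def expire_by_id_list(timestamps: Dict[int, int], branch_heads: Dict[str, int], cutoff_ms: int) -> List[int]:
    """Branch-aware policy: old snapshots that no branch still points at."""
    referenced = set(branch_heads.values())
    return sorted(
        sid for sid, ts in timestamps.items()
        if ts < cutoff_ms and sid not in referenced
    )


# Hypothetical table: snapshot ID -> commit timestamp in milliseconds.
timestamps = {1: 100, 2: 200, 3: 300, 4: 400}
# Hypothetical branches: "main" is at snapshot 4, "dev" still points at old snapshot 2.
branch_heads = {"main": 4, "dev": 2}

age_only = expire_by_age(timestamps, current_id=branch_heads["main"], cutoff_ms=350)
explicit = expire_by_id_list(timestamps, branch_heads, cutoff_ms=350)

print(age_only)  # [1, 2, 3] -> includes snapshot 2, which "dev" still references
print(explicit)  # [1, 3]    -> snapshot 2 is protected
```

In this model the explicit list is what a branch-aware caller would hand to the action's option for expiring specific snapshot IDs, which is the capability the PR proposed to mirror in the SQL procedure.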




[GitHub] [iceberg] rdblue commented on pull request #3718: Spark: Support passing specific list of snapshot id for ExpireSnapshots procedure

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3718:
URL: https://github.com/apache/iceberg/pull/3718#issuecomment-997485900


   Oops, I didn't intend to close this. I clicked the wrong button.




[GitHub] [iceberg] ajantha-bhat closed pull request #3718: Spark: Support passing specific list of snapshot id for ExpireSnapshots procedure

Posted by GitBox <gi...@apache.org>.
ajantha-bhat closed pull request #3718:
URL: https://github.com/apache/iceberg/pull/3718


   




[GitHub] [iceberg] rdblue commented on pull request #3718: Spark: Support passing specific list of snapshot id for ExpireSnapshots procedure

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3718:
URL: https://github.com/apache/iceberg/pull/3718#issuecomment-997485868


   > Since SQL is the "easy" first use api I would try to hide features which may cause undesired effects.
   
   Well said. That captures my reservations about this.




[GitHub] [iceberg] ajantha-bhat commented on pull request #3718: Spark: Support passing specific list of snapshot id for ExpireSnapshots procedure

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on pull request #3718:
URL: https://github.com/apache/iceberg/pull/3718#issuecomment-999282213


   @jackye1995, @RussellSpitzer, @rdblue: Thanks for your input. Since the majority of you (I think all of you) are not in favour of it, I am closing this PR.




[GitHub] [iceberg] rdblue commented on pull request #3718: Spark: Support passing specific list of snapshot id for ExpireSnapshots procedure

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3718:
URL: https://github.com/apache/iceberg/pull/3718#issuecomment-991949009


   @ajantha-bhat, what is the use case for exposing this?
   
   Expiring individual snapshots by ID is not a recommended operation because it can break certain operations that use table history. Unless there's a reason to expose this to SQL users, I would rather not.




[GitHub] [iceberg] ajantha-bhat commented on pull request #3718: Spark: Support passing specific list of snapshot id for ExpireSnapshots procedure

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on pull request #3718:
URL: https://github.com/apache/iceberg/pull/3718#issuecomment-992079054


   > @ajantha-bhat, what is the use case for exposing this?
   
   > Expiring individual snapshots by ID is not a recommended operation because it can break certain operations that use table history. Unless there's a reason to expose this to SQL users, I would rather not.
   
   @rdblue
   a) I am not newly exposing this feature to users; it is already exposed in the Spark Actions. I am just supporting it from SQL as well, to keep the two in sync.
   b) I see that expireSnapshots also contains code to handle the snapshot history log. So, can you elaborate on which operations it can break? If it really is an issue, do we need to handle it for the Spark actions as well?
   

