Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/29 09:49:41 UTC

[GitHub] [iceberg] brysd commented on issue #6290: Spark SQL support for partition spec?

brysd commented on issue #6290:
URL: https://github.com/apache/iceberg/issues/6290#issuecomment-1330360184

   Hi, thanks for your feedback.
   
   Maybe I'm getting the use case wrong, but our intention is to create Iceberg tables dynamically, where the 'user' of the application can configure the partition columns. That configuration can of course change, in which case we'd like to change the partition columns accordingly. The 'user' might also remove the partition columns from the configuration entirely, in which case the partition fields need to be dropped.
   
   So we would certainly use the `alter table ... replace partition field ... with ...`, `alter table ... drop partition field ...`, and `alter table ... add partition field ...` Spark SQL statements, for instance as in the sketch below. However, to generate these Spark SQL statements dynamically in code, we need to know what the current partition columns actually are.
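   
   For illustration, here is a minimal sketch of what we mean by generating these statements dynamically (the table name `A`, the column `ts`, and the helper itself are just placeholders; a Spark session with the Iceberg SQL extensions enabled is assumed):
   
   ```python
   # Hypothetical helper: build the right ALTER statement from the old and
   # new partition fields taken from the user's configuration.
   def alter_partitioning(spark, table, old_field, new_field):
       if old_field and new_field:
           spark.sql(f"ALTER TABLE {table} REPLACE PARTITION FIELD {old_field} WITH {new_field}")
       elif old_field:
           spark.sql(f"ALTER TABLE {table} DROP PARTITION FIELD {old_field}")
       elif new_field:
           spark.sql(f"ALTER TABLE {table} ADD PARTITION FIELD {new_field}")
   
   # e.g. the user changed the configured partitioning from months(ts) to days(ts):
   alter_partitioning(spark, "A", "months(ts)", "days(ts)")
   ```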
   
   For example, suppose we have a table A with a column ts, and we originally partitioned on months(ts). Someone then changes the configuration to days(ts). We now want to generate a Spark SQL statement like `alter table A replace partition field months(ts) with days(ts)`. This implies we need to know that the current partition field is months(ts) before we execute the alter statement, but our configuration no longer holds that value since it has been overwritten.
   So: can we somehow retrieve the current partition field definitions through Spark SQL or the Python API?
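   
   One option we considered (if we read the Spark integration right) is parsing the `# Partitioning` section that `DESCRIBE TABLE` prints for Iceberg tables in Spark 3, where each `Part N` row carries one transform:
   
   ```python
   # Collect the partition transforms from the DESCRIBE TABLE output;
   # rows in the "# Partitioning" section look like ("Part 0", "months(ts)", "").
   rows = spark.sql("DESCRIBE TABLE A").collect()
   fields = [r.data_type for r in rows if r.col_name.startswith("Part ")]
   print(fields)  # e.g. ['months(ts)']
   ```
   
   But that feels brittle, since it depends on the textual layout of the DESCRIBE output.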
   
   We know we can use `select * from A.partitions` to list the partition instances, but it is not clear how to 'derive' the actual partition fields from that. We can retrieve the spec_id through this metadata table, but how do we then find the transform and the table column on which partitioning was done, so we can dynamically create the Spark SQL alter statement?
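   
   We also had a look at PyIceberg; assuming we read its docs correctly, `table.spec()` would expose exactly this (the catalog and table names below are placeholders):
   
   ```python
   from pyiceberg.catalog import load_catalog
   
   catalog = load_catalog("default")       # placeholder catalog name
   table = catalog.load_table("db.A")      # placeholder namespace and table
   for field in table.spec().fields:
       # each partition field carries the source column id, the transform,
       # and the partition field name
       print(field.source_id, str(field.transform), field.name)
   ```
   
   If that is correct, it would give us the transform and source column we need to build the alter statement.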
   
   
   

