Posted to issues@iceberg.apache.org by "cccs-jc (via GitHub)" <gi...@apache.org> on 2023/04/12 17:45:50 UTC

[GitHub] [iceberg] cccs-jc opened a new issue, #7334: location parameter of remove_orphan_files procedure relative to table location

cccs-jc opened a new issue, #7334:
URL: https://github.com/apache/iceberg/issues/7334

   ### Feature Request / Improvement
   
   I'm writing to an Iceberg table using Spark Structured Streaming. I chose to put my checkpoint directory inside the target table's location:
   /iceberg/my_schema/data
   /iceberg/my_schema/metadata
   /iceberg/my_schema/checkpoint
   
   When I run the procedure to remove orphan files, Iceberg considers the files inside the checkpoint dir as orphans and wants to delete them.
   
   ```sql
   CALL my_catalog.system.remove_orphan_files(
       table => 'my_catalog.my_schema.telemetry_table',
       older_than => timestamp '2023-04-06 00:00:00',
       dry_run => true)
   ```
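
Why the checkpoint files get flagged can be illustrated with a simplified model: the procedure treats any file under the scanned location that is not referenced by the table metadata, and is older than `older_than`, as an orphan. The sketch below is only an illustration of that rule, not Iceberg's implementation; the function name `find_orphans` and the individual file names are hypothetical.

```python
from datetime import datetime


def find_orphans(files_under_location, referenced_files, older_than):
    """Return paths found under the location but absent from table metadata.

    files_under_location: dict mapping path -> last-modified datetime
    referenced_files: set of paths the table metadata points to
    older_than: only files modified before this datetime are candidates
    """
    return sorted(
        path
        for path, modified in files_under_location.items()
        if path not in referenced_files and modified < older_than
    )


# Hypothetical listing of the table location described in this issue.
listing = {
    "/iceberg/my_schema/data/part-0.parquet": datetime(2023, 4, 1),
    "/iceberg/my_schema/metadata/v1.metadata.json": datetime(2023, 4, 1),
    "/iceberg/my_schema/checkpoint/offsets/0": datetime(2023, 4, 1),
}

# Files the table metadata knows about (the checkpoint file is not one of them).
referenced = {
    "/iceberg/my_schema/data/part-0.parquet",
    "/iceberg/my_schema/metadata/v1.metadata.json",
}

orphans = find_orphans(listing, referenced, datetime(2023, 4, 6))
print(orphans)
```

Under this model the only candidate is the checkpoint file, which matches the behaviour reported above.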
   I then tried to specify the location I want Iceberg to clean, for example the data folder. However, it seems I have to give it a full path. Is there a way to refer to the table location, something like `{table_location}/data`?
   ```sql
   CALL my_catalog.system.remove_orphan_files(
       table => 'my_catalog.my_schema.telemetry_table',
       location => '{table_location}/data',
       older_than => timestamp '2023-04-06 00:00:00',
       dry_run => true)
   ```
   
   The workaround I used is to run `describe extended` to get the location of the table:
   ```python
   # Returns the DESCRIBE EXTENDED row whose col_name is 'Location'
   spark.sql(f"describe extended {table_name}").where("col_name = 'Location'").collect()
   ```
   I then use that location in the statements above.
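
Putting the workaround together, a minimal sketch of building the `CALL` statement from a looked-up table location might look like the following. The helper name `remove_orphan_files_call` and its `subdir` argument are hypothetical, and `/iceberg/my_schema` is assumed to be the table location based on the paths listed earlier; only the `table`, `location`, `older_than`, and `dry_run` procedure parameters come from the statements above.

```python
def remove_orphan_files_call(catalog, table, table_location, subdir,
                             older_than, dry_run=True):
    """Build a remove_orphan_files CALL restricted to one sub-folder.

    table_location is the full path obtained from the table description;
    subdir is the folder below it that should be scanned (for example "data").
    """
    # Join the two parts with exactly one slash between them.
    location = f"{table_location.rstrip('/')}/{subdir.strip('/')}"
    return (
        f"CALL {catalog}.system.remove_orphan_files(\n"
        f"    table => '{table}',\n"
        f"    location => '{location}',\n"
        f"    older_than => timestamp '{older_than}',\n"
        f"    dry_run => {str(dry_run).lower()})"
    )


stmt = remove_orphan_files_call(
    catalog="my_catalog",
    table="my_catalog.my_schema.telemetry_table",
    table_location="/iceberg/my_schema",  # assumed value of the Location row
    subdir="data",
    older_than="2023-04-06 00:00:00",
)
print(stmt)
```

The resulting string could then be passed to `spark.sql(...)`. Keeping `dry_run` true by default means nothing is deleted until the listed files have been reviewed.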
   
   
   
   
   ### Query engine
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] schaure commented on issue #7334: location parameter of remove_orphan_files procedure relative to table location

Posted by "schaure (via GitHub)" <gi...@apache.org>.
schaure commented on issue #7334:
URL: https://github.com/apache/iceberg/issues/7334#issuecomment-1735164301

   This gives an error at "describe extended" ^^

