Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/02/22 09:57:07 UTC

[GitHub] [iceberg] Initial-neko opened a new issue #4192: Remove Orphan Files error...

Initial-neko opened a new issue #4192:
URL: https://github.com/apache/iceberg/issues/4192


   When I call the Spark action
   `SparkActions
                   .get()
                   .deleteOrphanFiles(table)
                   .olderThan(System.currentTimeMillis() - 1000000)
                   .execute();`
   it throws
   
   `Exception in thread "main" java.lang.IllegalArgumentException: Cannot find the metadata table for xxxx of type ALL_MANIFESTS
   	at org.apache.iceberg.spark.actions.BaseSparkAction.loadMetadataTable(BaseSparkAction.java:191)
   	at org.apache.iceberg.spark.actions.BaseSparkAction.buildValidDataFileDF(BaseSparkAction.java:121)
   	at org.apache.iceberg.spark.actions.BaseDeleteOrphanFilesSparkAction.doExecute(BaseDeleteOrphanFilesSparkAction.java:154)
   	at org.apache.iceberg.spark.actions.BaseSparkAction.withJobGroupInfo(BaseSparkAction.java:101)
   	at org.apache.iceberg.spark.actions.BaseDeleteOrphanFilesSparkAction.execute(BaseDeleteOrphanFilesSparkAction.java:141)
   	at org.apache.iceberg.spark.actions.BaseDeleteOrphanFilesSparkAction.execute(BaseDeleteOrphanFilesSparkAction.java:76)
   	`
   
   So what exactly is a metadata table? Why can't I find it among my data files and metadata files?
   Thanks~
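
A side note on the `olderThan` argument in the snippet above: 1000000 milliseconds is only about 16 minutes. The Iceberg documentation warns that removing orphan files with a retention interval shorter than the time a write needs to complete can delete files that are still in flight, and it gives three days as the default interval. A minimal sketch of computing a more conservative cutoff, using only the JDK (the class and method names are hypothetical, chosen for illustration):

```java
import java.util.concurrent.TimeUnit;

public class OrphanFileCutoff {
  // Hypothetical helper: returns the epoch-millis timestamp to pass to
  // olderThan(...). Only files older than this instant would be candidates.
  public static long cutoffMillis(long nowMillis, long retentionDays) {
    return nowMillis - TimeUnit.DAYS.toMillis(retentionDays);
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();

    // The cutoff used in the issue: 1,000,000 ms before now.
    long cutoffFromIssue = now - 1_000_000L;
    System.out.println("Issue cutoff looks back (whole minutes): "
        + TimeUnit.MILLISECONDS.toMinutes(now - cutoffFromIssue));

    // A three-day cutoff, matching the documented default interval.
    long threeDayCutoff = cutoffMillis(now, 3);
    System.out.println("Three-day cutoff (epoch millis): " + threeDayCutoff);
  }
}
```

The resulting value would be passed as `.olderThan(threeDayCutoff)`; the right interval depends on how long writes to the table can take.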
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Initial-neko commented on issue #4192: Remove Orphan Files error...

Posted by GitBox <gi...@apache.org>.
Initial-neko commented on issue #4192:
URL: https://github.com/apache/iceberg/issues/4192#issuecomment-1048409274


   > The issue is probably that the table object you are using is not from the configured Spark Catalog. Try loading the table instance with Spark3Util.loadIcebergTable.
   > 
   > The error occurs because Spark needs the catalog name to match the one in the Spark conf. Otherwise it does not know where to find the table. A table instantiated through the Java API carries the default catalog name, which most likely does not match.
   > 
   > A metadata table is just a special view of a table that returns information such as which partitions or which files are in the table. All Iceberg tables have them available; see the docs for more info.
   
   Our tables are loaded through HiveCatalog. We do not see this problem with the expire-snapshots and rewrite operations. I will confirm it. Thanks for your reply~




[GitHub] [iceberg] RussellSpitzer commented on issue #4192: Remove Orphan Files error...

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on issue #4192:
URL: https://github.com/apache/iceberg/issues/4192#issuecomment-1047769838


   The issue is probably that the table object you are using is not from the configured Spark Catalog. Try loading the table instance with Spark3Util.loadIcebergTable.
   
   The error occurs because Spark needs the catalog name to match the one in the Spark conf. Otherwise it does not know where to find the table. A table instantiated through the Java API carries the default catalog name, which most likely does not match.




[GitHub] [iceberg] RussellSpitzer edited a comment on issue #4192: Remove Orphan Files error...

Posted by GitBox <gi...@apache.org>.
RussellSpitzer edited a comment on issue #4192:
URL: https://github.com/apache/iceberg/issues/4192#issuecomment-1047769838


   The issue is probably that the table object you are using is not from the configured Spark Catalog. Try loading the table instance with Spark3Util.loadIcebergTable.
   
   The error occurs because Spark needs the catalog name to match the one in the Spark conf. Otherwise it does not know where to find the table. A table instantiated through the Java API carries the default catalog name, which most likely does not match.
   
   A metadata table is just a special view of a table that returns information such as which partitions or which files are in the table. All Iceberg tables have them available; see the docs for more info.
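
To make the catalog-name point concrete: in Spark, an Iceberg metadata table is addressed by appending its type to the table name, for example `my_catalog.db.events.all_manifests`. The sketch below is a toy model of that lookup using only the JDK. It is not Iceberg's or Spark's actual resolution code, and the names `my_catalog`, `hive`, and `db.events` are assumptions made up for illustration.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Set;

public class MetadataTableNames {
  // Hypothetical helper: builds the dotted name of a metadata table,
  // e.g. "my_catalog.db.events" + ALL_MANIFESTS -> "my_catalog.db.events.all_manifests".
  public static String metadataTableName(String tableName, String metadataType) {
    return tableName + "." + metadataType.toLowerCase(Locale.ROOT);
  }

  // Simplified stand-in for Spark's lookup: the first name part must be a
  // catalog registered in the Spark conf (spark.sql.catalog.<name>).
  public static boolean resolvable(String qualifiedName, Set<String> configuredCatalogs) {
    String catalog = qualifiedName.split("\\.")[0];
    return configuredCatalogs.contains(catalog);
  }

  public static void main(String[] args) {
    // Assumption: the Spark conf registers exactly one catalog, "my_catalog".
    Set<String> configured = Collections.singleton("my_catalog");

    // Table handle obtained through the configured Spark catalog.
    String viaSparkCatalog = metadataTableName("my_catalog.db.events", "ALL_MANIFESTS");

    // Table handle created directly with the Java API; "hive" stands in for
    // whatever default catalog name that handle carries (assumed here).
    String viaJavaApi = metadataTableName("hive.db.events", "ALL_MANIFESTS");

    System.out.println(viaSparkCatalog + " resolvable: " + resolvable(viaSparkCatalog, configured));
    System.out.println(viaJavaApi + " resolvable: " + resolvable(viaJavaApi, configured));
  }
}
```

In this model the second name cannot be resolved because its leading catalog name is not registered, which mirrors the mismatch described above. The real action may try additional fallbacks before giving up.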

