Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/07/02 07:53:04 UTC
[GitHub] [iceberg] zhangdove opened a new issue #1160: NoSuchTableException: Table does not exist
zhangdove opened a new issue #1160:
URL: https://github.com/apache/iceberg/issues/1160
My test code calls `removeOrphanFiles`, and it throws `NoSuchTableException`.
```scala
import java.util.{ArrayList, List}

import org.apache.hadoop.conf.Configuration
import org.apache.iceberg.{Schema, Table}
import org.apache.iceberg.actions.Actions
import org.apache.iceberg.catalog.{Namespace, TableIdentifier}
import org.apache.iceberg.hadoop.HadoopCatalog
import org.apache.iceberg.types.Types
import org.apache.spark.sql.SparkSession

case class TwoColumnRecord(id: String, name: String)

def testCode(spark: SparkSession): Unit = {
  val schemaName = "testDb"
  val tableName = "testTb"
  val conf: Configuration = new Configuration(spark.sparkContext.hadoopConfiguration)
  val catalog: HadoopCatalog = new HadoopCatalog(conf, conf.get("fs.defaultFS") + "/iceberg/warehouse")

  // 1. Create the Iceberg table through HadoopCatalog
  val nameSpace = Namespace.of(schemaName)
  val tableIdentifier: TableIdentifier = TableIdentifier.of(nameSpace, tableName)
  val columns: List[Types.NestedField] = new ArrayList[Types.NestedField]
  columns.add(Types.NestedField.of(1, true, "id", Types.StringType.get, "id doc"))
  columns.add(Types.NestedField.of(2, true, "name", Types.StringType.get, "name doc"))
  val schema: Schema = new Schema(columns)
  val table: Table = catalog.createTable(tableIdentifier, schema)

  // 2. Create a DataFrame
  val df = spark.createDataFrame(Seq(TwoColumnRecord("1", "iceberg"), TwoColumnRecord("2", "spark"))).toDF()

  // 3. Write the data to the Iceberg table
  df.write.format("iceberg").mode("append").save(table.location())
  Thread.sleep(1000)

  // 4. Write plain Parquet files into the table's data path (these become orphan files)
  df.write.format("parquet").mode("append").save(table.location() + "/data/")

  // 5. Remove orphan files
  Thread.sleep(1000)
  val actions: Actions = Actions.forTable(table)
  val removeFileList = actions.removeOrphanFiles().olderThan(System.currentTimeMillis()).execute()
  // Throws NoSuchTableException here and exits
}
```
I expected the code to exit normally after deleting some orphan files. Instead, I get the following error:
```java
Exception in thread "main" org.apache.iceberg.exceptions.NoSuchTableException: Table does not exist: testDb.testTb
at org.apache.iceberg.BaseMetastoreCatalog.loadMetadataTable(BaseMetastoreCatalog.java:153)
at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:139)
at org.apache.iceberg.spark.source.IcebergSource.findTable(IcebergSource.java:148)
at org.apache.iceberg.spark.source.IcebergSource.getTableAndResolveHadoopConfiguration(IcebergSource.java:177)
at org.apache.iceberg.spark.source.IcebergSource.createReader(IcebergSource.java:80)
at org.apache.iceberg.spark.source.IcebergSource.createReader(IcebergSource.java:74)
at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$SourceHelpers.createReader(DataSourceV2Relation.scala:155)
at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$.create(DataSourceV2Relation.scala:172)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:204)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at org.apache.iceberg.actions.RemoveOrphanFilesAction.buildValidDataFileDF(RemoveOrphanFilesAction.java:161)
at org.apache.iceberg.actions.RemoveOrphanFilesAction.execute(RemoveOrphanFilesAction.java:139)
at com.dove.iceberg.IcebergIssues$.testCode(IcebergIssues.scala:64)
at com.dove.iceberg.IcebergIssues$.main(IcebergIssues.scala:29)
```
I checked the table's files on HDFS:
```bash
[root@hadoop39 ~]# hdfs dfs -ls /iceberg/warehouse/testDb/testTb/*/*
-rw-r--r-- 3 hdfs supergroup 645 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/data/00000-0-dba22cf3-b96c-467c-8bf3-01dd7d2f45c6-00000.parquet
-rw-r--r-- 3 hdfs supergroup 630 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/data/00001-1-1c2caf36-bc08-492d-8f1b-cc0fe83795ab-00000.parquet
-rw-r--r-- 3 hdfs supergroup 0 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/data/_SUCCESS
-rw-r--r-- 3 hdfs supergroup 607 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/data/part-00000-7c56f3e9-3b0f-48db-ad77-340ea302074c-c000.snappy.parquet
-rw-r--r-- 3 hdfs supergroup 589 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/data/part-00001-7c56f3e9-3b0f-48db-ad77-340ea302074c-c000.snappy.parquet
-rw-r--r-- 3 hdfs supergroup 4221 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/metadata/1c330540-e731-4804-b1d4-4ace0952ea0a-m0.avro
-rw-r--r-- 3 hdfs supergroup 2544 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/metadata/snap-7685756080210806989-1-1c330540-e731-4804-b1d4-4ace0952ea0a.avro
-rw-r--r-- 3 hdfs supergroup 762 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/metadata/v1.metadata.json
-rw-r--r-- 3 hdfs supergroup 1503 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/metadata/v2.metadata.json
-rw-r--r-- 3 hdfs supergroup 1 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/metadata/version-hint.text
```
The Iceberg table was created successfully and the data was written successfully.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] rdblue closed issue #1160: NoSuchTableException: Table does not exist
Posted by GitBox <gi...@apache.org>.
rdblue closed issue #1160:
URL: https://github.com/apache/iceberg/issues/1160
[GitHub] [iceberg] zhangdove commented on issue #1160: NoSuchTableException: Table does not exist
Posted by GitBox <gi...@apache.org>.
zhangdove commented on issue #1160:
URL: https://github.com/apache/iceberg/issues/1160#issuecomment-652859150
I have opened PR [1161](https://github.com/apache/iceberg/pull/1161).
Could someone review it when convenient?
[GitHub] [iceberg] rdblue commented on issue #1160: NoSuchTableException: Table does not exist
Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #1160:
URL: https://github.com/apache/iceberg/issues/1160#issuecomment-653621430
To summarize the fix from the PR, the issue is that a table loaded by HadoopCatalog reports a table name rather than a location. When we try to use the table name to load a metadata table from the Spark 2.4 IcebergSource, it uses the HiveCatalog instead of a HadoopCatalog (because the HiveCatalog is configured, HadoopCatalog is not). The fix is to convert tables loaded by a Hadoop catalog (name starting with `hadoop.`) to paths so that HadoopTables is used to load the metadata.
We should also consider adding configuration for the IcebergSource. Maybe we could configure it to use a HadoopCatalog.
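The name-to-path conversion described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: `LoadIdentifierSketch` and `loadIdentifier` are hypothetical names, and the only rule encoded is the one from the summary (names starting with `hadoop.` are replaced by the table location).

```scala
// Hypothetical helper, not Iceberg's actual code: picks the identifier
// that would be handed to the Spark 2.4 IcebergSource to load a table.
object LoadIdentifierSketch {
  // Assumption taken from the summary above: tables loaded by a
  // HadoopCatalog report a name that starts with "hadoop."
  val HadoopPrefix = "hadoop."

  // Returns the table location for HadoopCatalog tables, so that a
  // path-based loader (HadoopTables) is used; otherwise keeps the name.
  def loadIdentifier(tableName: String, location: String): String =
    if (tableName.startsWith(HadoopPrefix)) location else tableName
}
```

For example, `LoadIdentifierSketch.loadIdentifier("hadoop.testDb.testTb", "/iceberg/warehouse/testDb/testTb")` yields the location, while a name without the `hadoop.` prefix is returned unchanged and would still be resolved by the configured catalog.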