Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/09/07 03:36:46 UTC

[GitHub] [iceberg] shengkui opened a new issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

shengkui opened a new issue #3079:
URL: https://github.com/apache/iceberg/issues/3079


   I hit an exception "Mkdirs failed to create file" while using Iceberg (v0.12.0) + Flink (v1.12.5) + Hive Metastore (v3.0.0) + S3A (Ceph) storage.
   
   ```
   Flink SQL> CREATE CATALOG hive_catalog WITH (
   >   'type'='iceberg',
   >   'catalog-type'='hive',
   >   'uri'='thrift://172.21.92.171:32141',
   >   'clients'='5',
   >   'property-version'='1',
   >   'warehouse'='s3a://lsk/'
   > );
   2021-09-07 11:11:10,425 INFO  org.apache.hadoop.hive.conf.HiveConf                         [] - Found configuration file null
   2021-09-07 11:11:15,701 INFO  org.apache.hadoop.hive.metastore.HiveMetaStoreClient         [] - Trying to connect to metastore with URI thrift://172.21.92.171:32141
   2021-09-07 11:11:15,751 INFO  org.apache.hadoop.hive.metastore.HiveMetaStoreClient         [] - Opened a connection to metastore, current connections: 1
   2021-09-07 11:11:15,795 INFO  org.apache.hadoop.hive.metastore.HiveMetaStoreClient         [] - Connected to metastore.
   [INFO] Catalog has been created.
   
   Flink SQL> use catalog hive_catalog;
   
   Flink SQL> show databases;
   default
   
   Flink SQL> show tables;
   [INFO] Result was empty.
   
   Flink SQL> CREATE TABLE bench (
   >   id INT,
   >   val INT,
   >   num INT,
   >   location INT)
   > WITH (
   >     'connector' = 'iceberg',
   >     'path' = 's3a://lsk/demo',
   >     'fs.s3a.access.key' = '<masked>',
   >     'fs.s3a.secret.key' = '<masked>',
   >     'fs.s3a.endpoint' = 'http://172.21.92.176:7480',
   >     'fs.s3a.aws.credentials.provider' = 'org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider',
   >     'fs.s3a.impl' = 'org.apache.hadoop.fs.s3a.S3AFileSystem',
   >     'fs.s3a.path.style.access' = 'true',
   >     'hive.metastore.uris' = 'thrift://172.21.92.171:32141',
   >     'hive.metastore.warehouse.dir' = 's3a://lsk/'
   > );
   [ERROR] Could not execute SQL statement. Reason:
   java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata (exists=false, cwd=file:/home/shengkui.leng)
   
   Flink SQL>
   ```
   
   The log of flink sql-client:
   
   ```
   2021-09-07 11:11:15,701 INFO  org.apache.hadoop.hive.metastore.HiveMetaStoreClient         [] - Trying to connect to metastore with URI thrift://172.21.92.171:32141
   2021-09-07 11:11:15,751 INFO  org.apache.hadoop.hive.metastore.HiveMetaStoreClient         [] - Opened a connection to metastore, current connections: 1
   2021-09-07 11:11:15,795 INFO  org.apache.hadoop.hive.metastore.HiveMetaStoreClient         [] - Connected to metastore.
   2021-09-07 11:11:32,542 INFO  org.apache.flink.table.catalog.CatalogManager                [] - Set the current default catalog as [hive_catalog] and the current default database as [default].
   2021-09-07 11:12:23,598 WARN  org.apache.flink.table.client.cli.CliClient                  [] - Could not execute SQL statement.
   org.apache.flink.table.client.gateway.SqlExecutionException: Could not execute statement: CREATE TABLE bench (
     id INT,
     val INT,
     num INT,
     location INT)
   WITH (
       'connector' = 'iceberg',
       'path' = 's3a://lsk/demo',
        'fs.s3a.access.key' = '<masked>',
        'fs.s3a.secret.key' = '<masked>',
       'fs.s3a.endpoint' = 'http://172.21.92.176:7480',
       'fs.s3a.aws.credentials.provider' = 'org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider',
       'fs.s3a.impl' = 'org.apache.hadoop.fs.s3a.S3AFileSystem',
       'fs.s3a.path.style.access' = 'true',
       'hive.metastore.uris' = 'thrift://172.21.92.171:32141',
       'hive.metastore.warehouse.dir' = 's3a://lsk/'
   )
   	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeSql(LocalExecutor.java:317) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.cli.CliClient.callDdl(CliClient.java:739) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.cli.CliClient.callDdl(CliClient.java:734) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.cli.CliClient.callCommand(CliClient.java:330) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at java.util.Optional.ifPresent(Optional.java:159) [?:1.8.0_282]
   	at org.apache.flink.table.client.cli.CliClient.open(CliClient.java:214) [flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:144) [flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.SqlClient.start(SqlClient.java:115) [flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.SqlClient.main(SqlClient.java:201) [flink-sql-client_2.12-1.12.5.jar:1.12.5]
   Caused by: org.apache.flink.table.api.TableException: Could not execute CreateTable in path `hive_catalog`.`default`.`bench`
   	at org.apache.flink.table.catalog.CatalogManager.execute(CatalogManager.java:796) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.catalog.CatalogManager.createTable(CatalogManager.java:632) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeOperation(TableEnvironmentImpl.java:776) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:666) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeSql$1(LocalExecutor.java:315) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.gateway.local.ExecutionContext.wrapClassLoader(ExecutionContext.java:256) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeSql(LocalExecutor.java:315) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	... 8 more
   Caused by: org.apache.iceberg.exceptions.RuntimeIOException: Failed to create file: file:/user/hive/warehouse/bench/metadata/00000-0a8000ba-e302-4f3e-ba54-7126120b8e94.metadata.json
   	at org.apache.iceberg.hadoop.HadoopOutputFile.createOrOverwrite(HadoopOutputFile.java:87) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.TableMetadataParser.internalWrite(TableMetadataParser.java:119) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.TableMetadataParser.overwrite(TableMetadataParser.java:109) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.BaseMetastoreTableOperations.writeNewMetadata(BaseMetastoreTableOperations.java:154) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.hive.HiveTableOperations.doCommit(HiveTableOperations.java:206) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.BaseMetastoreTableOperations.commit(BaseMetastoreTableOperations.java:126) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.BaseMetastoreCatalog$BaseMetastoreCatalogTableBuilder.create(BaseMetastoreCatalog.java:216) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.CachingCatalog$CachingTableBuilder.lambda$create$0(CachingCatalog.java:212) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2344) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853) ~[?:1.8.0_282]
   	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2342) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2325) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.CachingCatalog$CachingTableBuilder.create(CachingCatalog.java:210) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.CachingCatalog.createTable(CachingCatalog.java:106) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.flink.FlinkCatalog.createTable(FlinkCatalog.java:379) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.flink.table.catalog.CatalogManager.lambda$createTable$10(CatalogManager.java:633) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.catalog.CatalogManager.execute(CatalogManager.java:790) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.catalog.CatalogManager.createTable(CatalogManager.java:632) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeOperation(TableEnvironmentImpl.java:776) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:666) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeSql$1(LocalExecutor.java:315) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.gateway.local.ExecutionContext.wrapClassLoader(ExecutionContext.java:256) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeSql(LocalExecutor.java:315) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	... 8 more
   Caused by: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata (exists=false, cwd=file:/home/shengkui.leng)
   	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:458) ~[flink-s3-fs-hadoop-1.12.5.jar:1.12.5]
   	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:443) ~[flink-s3-fs-hadoop-1.12.5.jar:1.12.5]
   	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1169) ~[flink-s3-fs-hadoop-1.12.5.jar:1.12.5]
   	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1149) ~[flink-s3-fs-hadoop-1.12.5.jar:1.12.5]
   	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1038) ~[flink-s3-fs-hadoop-1.12.5.jar:1.12.5]
   	at org.apache.iceberg.hadoop.HadoopOutputFile.createOrOverwrite(HadoopOutputFile.java:85) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.TableMetadataParser.internalWrite(TableMetadataParser.java:119) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.TableMetadataParser.overwrite(TableMetadataParser.java:109) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.BaseMetastoreTableOperations.writeNewMetadata(BaseMetastoreTableOperations.java:154) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.hive.HiveTableOperations.doCommit(HiveTableOperations.java:206) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.BaseMetastoreTableOperations.commit(BaseMetastoreTableOperations.java:126) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.BaseMetastoreCatalog$BaseMetastoreCatalogTableBuilder.create(BaseMetastoreCatalog.java:216) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.CachingCatalog$CachingTableBuilder.lambda$create$0(CachingCatalog.java:212) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2344) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853) ~[?:1.8.0_282]
   	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2342) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2325) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.CachingCatalog$CachingTableBuilder.create(CachingCatalog.java:210) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.CachingCatalog.createTable(CachingCatalog.java:106) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.iceberg.flink.FlinkCatalog.createTable(FlinkCatalog.java:379) ~[iceberg-flink-runtime-0.12.0.jar:?]
   	at org.apache.flink.table.catalog.CatalogManager.lambda$createTable$10(CatalogManager.java:633) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.catalog.CatalogManager.execute(CatalogManager.java:790) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.catalog.CatalogManager.createTable(CatalogManager.java:632) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeOperation(TableEnvironmentImpl.java:776) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:666) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeSql$1(LocalExecutor.java:315) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.gateway.local.ExecutionContext.wrapClassLoader(ExecutionContext.java:256) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeSql(LocalExecutor.java:315) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	... 8 more
   ```
   
   The metadata path in the error message is "/user/hive/warehouse/bench/metadata", but that's not my real path. I took a look at the Iceberg source code, and it's the default warehouse value. I haven't found any setting for this in the Iceberg documentation. Have I missed something?
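   
   For reference, here is a minimal sketch of creating the table through the catalog itself, without a 'connector' clause; in that flow the warehouse location should come from the catalog's 'warehouse' property rather than the local-filesystem default:
   
   ```sql
   -- Sketch, assuming the hive_catalog created above: a table created while
   -- that catalog is active should inherit its 'warehouse' ('s3a://lsk/')
   -- instead of falling back to file:/user/hive/warehouse.
   USE CATALOG hive_catalog;
   
   CREATE TABLE bench (
     id INT,
     val INT,
     num INT,
     location INT
   );
   ```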




[GitHub] [iceberg] shengkui edited a comment on issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

Posted by GitBox <gi...@apache.org>.
shengkui edited a comment on issue #3079:
URL: https://github.com/apache/iceberg/issues/3079#issuecomment-916599585


   @openinx I've tried following the document you provided, and it failed again. The following is taken from the Flink sql-client log:
   
   ```
   Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Got exception: org.apache.hadoop.fs.s3a.AWSBadRequestException PUT 0-byte object  on lsk.db/: com.amazonaws.services.s3.model.AmazonS3Exception: null (Service: Amazon S3; Status Code: 400; Error Code: InvalidRequest; Request ID: tx000000000000000265ae0-006139a08d-5efa-default; S3 Extended Request ID: 5efa-default-default), S3 Extended Request ID: 5efa-default-default:InvalidRequest: null (Service: Amazon S3; Status Code: 400; Error Code: InvalidRequest; Request ID: tx000000000000000265ae0-006139a08d-5efa-default; S3 Extended Request ID: 5efa-default-default)
   	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_database_result$create_database_resultStandardScheme.read(ThriftHiveMetastore.java:39343) ~[flink-sql-connector-hive-3.1.2_2.12-1.12.5.jar:1.12.5]
   	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_database_result$create_database_resultStandardScheme.read(ThriftHiveMetastore.java:39311) ~[flink-sql-connector-hive-3.1.2_2.12-1.12.5.jar:1.12.5]
   	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_database_result.read(ThriftHiveMetastore.java:39245) ~[flink-sql-connector-hive-3.1.2_2.12-1.12.5.jar:1.12.5]
   	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86) ~[flink-sql-connector-hive-3.1.2_2.12-1.12.5.jar:1.12.5]
   	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_database(ThriftHiveMetastore.java:1106) ~[flink-sql-connector-hive-3.1.2_2.12-1.12.5.jar:1.12.5]
   	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_database(ThriftHiveMetastore.java:1093) ~[flink-sql-connector-hive-3.1.2_2.12-1.12.5.jar:1.12.5]
   	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createDatabase(HiveMetaStoreClient.java:809) ~[flink-sql-connector-hive-3.1.2_2.12-1.12.5.jar:1.12.5]
   	at org.apache.iceberg.hive.HiveCatalog.lambda$createNamespace$7(HiveCatalog.java:243) ~[iceberg-flink-runtime-5f90476.jar:?]
   	at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:51) ~[iceberg-flink-runtime-5f90476.jar:?]
   	at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:76) ~[iceberg-flink-runtime-5f90476.jar:?]
   	at org.apache.iceberg.hive.HiveCatalog.createNamespace(HiveCatalog.java:242) ~[iceberg-flink-runtime-5f90476.jar:?]
   	at org.apache.iceberg.flink.FlinkCatalog.createDatabase(FlinkCatalog.java:203) ~[iceberg-flink-runtime-5f90476.jar:?]
   	at org.apache.iceberg.flink.FlinkCatalog.createDatabase(FlinkCatalog.java:196) ~[iceberg-flink-runtime-5f90476.jar:?]
   	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeOperation(TableEnvironmentImpl.java:968) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:666) ~[flink-table_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeSql$1(LocalExecutor.java:315) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.gateway.local.ExecutionContext.wrapClassLoader(ExecutionContext.java:256) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeSql(LocalExecutor.java:315) ~[flink-sql-client_2.12-1.12.5.jar:1.12.5]
   	... 8 more
   ```
   
   I've tried the Hadoop catalog with S3 (without the Hive Metastore), and it works well. But someone said that Iceberg needs the Hive Metastore for S3 storage (#1468). I know Iceberg has implemented a Glue catalog, but that's AWS-only. Is there any solution for using S3 without the Hive Metastore?
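   
   For reference, here is a minimal sketch of the Hadoop-catalog setup that worked for me; it assumes the s3a credentials and endpoint are configured in Hadoop's core-site.xml, and 's3a://lsk/warehouse' is a placeholder path on the bucket used above:
   
   ```sql
   -- Hadoop catalog sketch: table metadata is tracked directly on the
   -- object store, so no Hive Metastore is involved.
   CREATE CATALOG hadoop_catalog WITH (
     'type' = 'iceberg',
     'catalog-type' = 'hadoop',
     'warehouse' = 's3a://lsk/warehouse',
     'property-version' = '1'
   );
   
   USE CATALOG hadoop_catalog;
   ```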




[GitHub] [iceberg] shengkui commented on issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

Posted by GitBox <gi...@apache.org>.
shengkui commented on issue #3079:
URL: https://github.com/apache/iceberg/issues/3079#issuecomment-915089953


   @openinx thanks, I'll have a try.




[GitHub] [iceberg] shengkui commented on issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

Posted by GitBox <gi...@apache.org>.
shengkui commented on issue #3079:
URL: https://github.com/apache/iceberg/issues/3079#issuecomment-915022408


   @openinx I've tried the latest flink-iceberg-runtime jar (built from the master branch); it works when I use "file://" as the warehouse, but it doesn't work with "s3a://".
   
   I've put following JARs into flink/lib/ directory:
   
   - flink-iceberg-runtime.jar
   - flink-sql-connector-hive.jar 
    
   Is there any other JAR that should be placed under Flink's lib/ directory?
   
   Could you please give me some advice on configuring "s3a"? I'm not sure whether I've put the right parameters in the right place.
   




[GitHub] [iceberg] shengkui commented on issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

Posted by GitBox <gi...@apache.org>.
shengkui commented on issue #3079:
URL: https://github.com/apache/iceberg/issues/3079#issuecomment-946502324


   This issue was caused by the Hive Metastore version. After I changed Hive to 2.3.9, creating tables and inserting data works in the Flink SQL client.




[GitHub] [iceberg] openinx commented on issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

Posted by GitBox <gi...@apache.org>.
openinx commented on issue #3079:
URL: https://github.com/apache/iceberg/issues/3079#issuecomment-914113853


   @shengkui, it's insecure to expose your S3 access key and secret in this open issue, so I've masked them. As for the issue itself, I will also try to reproduce it on my local host.




[GitHub] [iceberg] openinx commented on issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

Posted by GitBox <gi...@apache.org>.
openinx commented on issue #3079:
URL: https://github.com/apache/iceberg/issues/3079#issuecomment-914247097


   @shengkui, I've proposed PR #3085 to add the Flink Iceberg connector documentation.




[GitHub] [iceberg] shengkui closed issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

Posted by GitBox <gi...@apache.org>.
shengkui closed issue #3079:
URL: https://github.com/apache/iceberg/issues/3079


   




[GitHub] [iceberg] openinx commented on issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

Posted by GitBox <gi...@apache.org>.
openinx commented on issue #3079:
URL: https://github.com/apache/iceberg/issues/3079#issuecomment-914159475


   @shengkui I think you need to build the latest flink-iceberg-runtime jar from the master branch, because PR #2666 has only been merged to master so far (it is not released in 0.12.0).
   
   I tried using the connector to execute the following SQL:
   
   ```shell
   ./bin/sql-client.sh embedded -j /Users/openinx/software/apache-iceberg/flink-runtime/build/libs/iceberg-flink-runtime-5f90476.jar shell
   
   
   Flink SQL> CREATE TABLE iceberg_table (
   >     id   BIGINT,
   >     data STRING
   > ) WITH (
   >     'connector'='iceberg',
   >     'catalog-name'='hive_prod',
   >     'uri'='thrift://localhost:9083',
   >     'warehouse'='file:///Users/openinx/test/iceberg-warehouse'
   > );
   [INFO] Table has been created.
   
   Flink SQL> INSERT INTO iceberg_table values (1, 'AAA'), (2, 'BBB'), (3, 'CCC');
   [INFO] Submitting SQL update statement to the cluster...
   [INFO] Table update statement has been successfully submitted to the cluster:
   Job ID: c9742d48cbd35502f9a3093d0d668543
   
   Flink SQL> select * from iceberg_table ;
   +----+------+
   | id | data |
   +----+------+
   |  1 |  AAA |
   |  2 |  BBB |
   |  3 |  CCC |
   +----+------+
   3 rows in set
   ```
   
   All seems OK.




[GitHub] [iceberg] openinx commented on issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

Posted by GitBox <gi...@apache.org>.
openinx commented on issue #3079:
URL: https://github.com/apache/iceberg/issues/3079#issuecomment-915044545


   @shengkui, I don't have a proper AWS S3 environment, but I have configured this Flink connector correctly against Alibaba's public object storage before (just using the open Hadoop distribution with the aliyun-oss HDFS implementation). The first thing you need to do is configure Hadoop correctly by setting the relevant key-values in core-site.xml, and verify it with the `hadoop fs` command (for example, `hadoop fs -ls s3a://lsk/`). Then make sure your Flink cluster and Hive Metastore are using the Hadoop classpath you configured above. In theory, you can then submit the Flink job correctly.
   
   You don't need to put any S3 settings in the Flink table properties. There's a [document](https://developer.aliyun.com/article/783957) (in Chinese) describing how to write data into Aliyun OSS; you may need to replace the OSS configuration with the corresponding S3 settings.
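   
   As a hedged sketch of that point, the table DDL can then drop all of the fs.s3a.* properties (the connector keys below are the same ones as in my earlier example; the uri and warehouse values are taken from this thread):
   
   ```sql
   -- No fs.s3a.* keys in the table properties: the S3 credentials and
   -- endpoint are expected to live in core-site.xml on the Hadoop classpath.
   CREATE TABLE bench (
     id INT,
     val INT,
     num INT,
     location INT
   ) WITH (
     'connector' = 'iceberg',
     'catalog-name' = 'hive_catalog',
     'uri' = 'thrift://172.21.92.171:32141',
     'warehouse' = 's3a://lsk/'
   );
   ```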




[GitHub] [iceberg] shengkui commented on issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

Posted by GitBox <gi...@apache.org>.
shengkui commented on issue #3079:
URL: https://github.com/apache/iceberg/issues/3079#issuecomment-914757553


   @openinx Thanks for your help, I'll try it.




[GitHub] [iceberg] shengkui commented on issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

Posted by GitBox <gi...@apache.org>.
shengkui commented on issue #3079:
URL: https://github.com/apache/iceberg/issues/3079#issuecomment-914138961


   > @shengkui, it's insecure to expose your S3 access key and secret in this open issue, so I've masked them. As for the issue itself, I will also try to reproduce it on my local host.
   
   Thanks






[GitHub] [iceberg] pvary commented on issue #3079: java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bench/metadata

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #3079:
URL: https://github.com/apache/iceberg/issues/3079#issuecomment-913982936


   There were changes around catalog configuration in 0.12. Maybe this affects the FlinkCatalog as well. I would check how the HiveCatalog should be parameterized in Flink use cases.
   
   Thanks, Peter 



