Posted to commits@hudi.apache.org by "duc-dn (via GitHub)" <gi...@apache.org> on 2023/03/23 03:11:29 UTC

[GitHub] [hudi] duc-dn opened a new issue, #8273: [SUPPORT] How to connect Hudi cli to MinIO

duc-dn opened a new issue, #8273:
URL: https://github.com/apache/hudi/issues/8273

   
   **Describe the problem you faced**
   
   I set up the Hudi CLI locally, but I cannot connect it to MinIO.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   - I am using Spark 3.1.1. I cloned the latest Hudi from GitHub and ran `mvn clean package -DskipTests -Dspark3.1 -Dscala-2.12`
   - After that, I set these environment variables in `.zshrc`:
   ```
   export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
   export PATH=$PATH:$JAVA_HOME/bin
   
   export SPARK_HOME=/home/****/Documents/spark
   export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
   
   export AWS_ENDPOINT=http://localhost:9000
   export AWS_ACCESSS_KEY_ID=minioadmin
   export AWS_SECRET_ACCESS_KEY=minioadmin
   export CLIENT_JAR=/home/*****/Desktop/client_jar/aws-java-sdk-bundle-1.11.271.jar:/home/*****/Desktop/client_jar/hadoop-aws-3.1.1.jar
   ```
   - I ran the Hudi CLI and tried to connect to the bucket on MinIO with `connect --path s3a://datalake/data`, but the connection failed
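
A note on the variable names above: the AWS SDK for Java's `EnvironmentVariableCredentialsProvider`, which appears in the stack trace below, reads the access key from `AWS_ACCESS_KEY_ID`, while the exports above spell it `AWS_ACCESSS_KEY_ID` (three S's). The thread does not confirm whether this contributed to the failure. A minimal sketch for checking the names in the current shell follows; `check_aws_env` is a hypothetical helper name, not part of Hudi or the AWS SDK.

```shell
#!/usr/bin/env bash
# check_aws_env (hypothetical helper): report whether the credential
# variables read by EnvironmentVariableCredentialsProvider are exported.
# Returns 0 when both are set and non-empty, 1 otherwise.
check_aws_env() {
  local missing=0 name
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    # printenv only sees exported variables, which is what a child
    # process such as the CLI's JVM would inherit.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name"
      missing=1
    else
      echo "set: $name"
    fi
  done
  return "$missing"
}
```

Running `check_aws_env` in the same shell that launches the CLI shows which of the two names that shell actually exports.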
   
   **Environment Description**
   
   * Hudi version : 0.13.0
   
   * Spark version : 3.1.1
   
   * Storage (HDFS/S3/GCS..) : MinIO
   
   **Stacktrace**
   
   ```
   Failed to get instance of org.apache.hadoop.fs.FileSystem
   Details of the error have been omitted. You can use the stacktrace command to print the full stacktrace.
   hudi->stacktrace 
   org.apache.hudi.exception.HoodieIOException: Failed to get instance of org.apache.hadoop.fs.FileSystem
   	at org.apache.hudi.common.fs.FSUtils.getFs(FSUtils.java:112)
   	at org.apache.hudi.common.table.HoodieTableMetaClient.getFs(HoodieTableMetaClient.java:305)
   	at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:136)
   	at org.apache.hudi.common.table.HoodieTableMetaClient.newMetaClient(HoodieTableMetaClient.java:689)
   	at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:81)
   	at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:770)
   	at org.apache.hudi.cli.HoodieCLI.refreshTableMetadata(HoodieCLI.java:89)
   	at org.apache.hudi.cli.HoodieCLI.connectTo(HoodieCLI.java:95)
   	at org.apache.hudi.cli.commands.TableCommand.connect(TableCommand.java:86)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.springframework.shell.command.invocation.InvocableShellMethod.doInvoke(InvocableShellMethod.java:306)
   	at org.springframework.shell.command.invocation.InvocableShellMethod.invoke(InvocableShellMethod.java:232)
   	at org.springframework.shell.command.CommandExecution$DefaultCommandExecution.evaluate(CommandExecution.java:158)
   	at org.springframework.shell.Shell.evaluate(Shell.java:208)
   	at org.springframework.shell.Shell.run(Shell.java:140)
   	at org.springframework.shell.jline.InteractiveShellRunner.run(InteractiveShellRunner.java:73)
   	at org.springframework.shell.DefaultShellApplicationRunner.run(DefaultShellApplicationRunner.java:65)
   	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:762)
   	at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:752)
   	at org.springframework.boot.SpringApplication.run(SpringApplication.java:315)
   	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1306)
   	at org.springframework.boot.SpringApplication.run(SpringApplication.java:1295)
   	at org.apache.hudi.cli.Main.main(Main.java:34)
   Caused by: java.io.InterruptedIOException: doesBucketExist on datalake: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Failed to connect to service endpoint: 
   	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:141)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:341)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:280)
   	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3247)
   	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121)
   	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3296)
   	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3264)
   	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:475)
   	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
   	at org.apache.hudi.common.fs.FSUtils.getFs(FSUtils.java:110)
   	... 25 more
   Caused by: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Failed to connect to service endpoint: 
   	at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:151)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1257)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:833)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:783)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
   	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
   	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
   	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5437)
   	at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:6408)
   	at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:6381)
   	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5422)
   	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5384)
   	at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1445)
   	at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1381)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:329)
   	... 33 more
   Caused by: com.amazonaws.SdkClientException: Failed to connect to service endpoint: 
   	at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100)
   	at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:70)
   	at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:75)
   	at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66)
   	at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsEndpoint(InstanceMetadataServiceCredentialsFetcher.java:58)
   	at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsResponse(InstanceMetadataServiceCredentialsFetcher.java:46)
   	at com.amazonaws.auth.BaseCredentialsFetcher.fetchCredentials(BaseCredentialsFetcher.java:112)
   	at com.amazonaws.auth.BaseCredentialsFetcher.getCredentials(BaseCredentialsFetcher.java:68)
   	at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:165)
   	at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:129)
   	... 50 more
   Caused by: java.net.SocketTimeoutException: connect timed out
   	at java.net.PlainSocketImpl.socketConnect(Native Method)
   	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
   	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
   	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
   	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
   	at java.net.Socket.connect(Socket.java:607)
   	at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
   	at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
   	at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
   	at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
   	at sun.net.www.http.HttpClient.New(HttpClient.java:339)
   	at sun.net.www.http.HttpClient.New(HttpClient.java:357)
   	at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1228)
   	at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1207)
   	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
   	at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
   	at com.amazonaws.internal.ConnectionUtils.connectToEndpoint(ConnectionUtils.java:52)
   	at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:80)
   	... 59 more
   
   ```
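
Reading the trace: S3A tried `BasicAWSCredentialsProvider`, `EnvironmentVariableCredentialsProvider`, and `InstanceProfileCredentialsProvider`, and none of them returned credentials. The final `connect timed out` comes from the last provider trying to reach the EC2 instance metadata service. One way to supply the settings without relying on environment variables is through the standard Hadoop S3A properties (`fs.s3a.endpoint`, `fs.s3a.access.key`, `fs.s3a.secret.key`). The sketch below writes them into a `core-site.xml`. It assumes, and this thread does not confirm, that the Hudi CLI launcher puts `HADOOP_CONF_DIR` on its classpath. `write_s3a_conf` is a hypothetical helper name, and the default endpoint and keys are the ones quoted in this issue.

```shell
#!/usr/bin/env bash
# write_s3a_conf (hypothetical helper): write a core-site.xml containing
# standard Hadoop S3A properties into the directory given as $1.
# Optional arguments: $2 endpoint, $3 access key, $4 secret key.
write_s3a_conf() {
  local conf_dir="$1"
  local endpoint="${2:-http://localhost:9000}"
  local access_key="${3:-minioadmin}"
  local secret_key="${4:-minioadmin}"
  mkdir -p "$conf_dir"
  cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property><name>fs.s3a.endpoint</name><value>${endpoint}</value></property>
  <property><name>fs.s3a.access.key</name><value>${access_key}</value></property>
  <property><name>fs.s3a.secret.key</name><value>${secret_key}</value></property>
  <property><name>fs.s3a.path.style.access</name><value>true</value></property>
  <property><name>fs.s3a.connection.ssl.enabled</name><value>false</value></property>
</configuration>
EOF
}

# Example use (the directory name is an assumption):
#   write_s3a_conf "$HOME/hudi-minio-conf"
#   export HADOOP_CONF_DIR="$HOME/hudi-minio-conf"
```

Path-style access and disabled SSL are included because the endpoint here is a plain-HTTP `localhost` address; a different MinIO deployment may need other values.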
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on issue #8273: [SUPPORT] How to connect Hudi cli to MinIO

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8273:
URL: https://github.com/apache/hudi/issues/8273#issuecomment-1482218848

   Yeah, maybe the AWS folks can help here. @umehrot2, do you have any idea how these AWS env variables can be handed over to the Hudi CLI correctly?




[GitHub] [hudi] duc-dn commented on issue #8273: [SUPPORT] How to connect Hudi cli to MinIO

Posted by "duc-dn (via GitHub)" <gi...@apache.org>.
duc-dn commented on issue #8273:
URL: https://github.com/apache/hudi/issues/8273#issuecomment-1486611749

   I resolved my problem. Here is the link: https://github.com/duc-dn/hudi-cli-with-minio




[GitHub] [hudi] duc-dn commented on issue #8273: [SUPPORT] How to connect Hudi cli to MinIO

Posted by "duc-dn (via GitHub)" <gi...@apache.org>.
duc-dn commented on issue #8273:
URL: https://github.com/apache/hudi/issues/8273#issuecomment-1480830698

   @nsivabalan, can you help me?




[GitHub] [hudi] duc-dn closed issue #8273: [SUPPORT] How to connect Hudi cli to MinIO

Posted by "duc-dn (via GitHub)" <gi...@apache.org>.
duc-dn closed issue #8273: [SUPPORT] How to connect Hudi cli to MinIO
URL: https://github.com/apache/hudi/issues/8273




[GitHub] [hudi] duc-dn commented on issue #8273: [SUPPORT] How to connect Hudi cli to MinIO

Posted by "duc-dn (via GitHub)" <gi...@apache.org>.
duc-dn commented on issue #8273:
URL: https://github.com/apache/hudi/issues/8273#issuecomment-1482204522

   Hi @danny0405 
   - Yes, I configured the MinIO environment correctly, but the connection still fails. It looks like the Hudi CLI does not pick up these environment variables:
   ```
   export AWS_ENDPOINT=http://localhost:9000
   export AWS_ACCESSS_KEY_ID=minioadmin
   export AWS_SECRET_ACCESS_KEY=minioadmin
   ```
   - I can access MinIO from Spark.
   Now I want to use the Hudi CLI to show and list the commits in my Hudi table.




[GitHub] [hudi] duc-dn commented on issue #8273: [SUPPORT] How to connect Hudi cli to MinIO

Posted by "duc-dn (via GitHub)" <gi...@apache.org>.
duc-dn commented on issue #8273:
URL: https://github.com/apache/hudi/issues/8273#issuecomment-1483723081

   @umehrot2 can you help me, please?




[GitHub] [hudi] danny0405 commented on issue #8273: [SUPPORT] How to connect Hudi cli to MinIO

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8273:
URL: https://github.com/apache/hudi/issues/8273#issuecomment-1482158363

   > Caused by: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Failed to connect to service endpoint: 
   
   Did you configure the MinIO environment correctly? Have you successfully accessed the MinIO-backed file system in any other way?
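
One quick way to answer the second question independently of Hadoop and Hudi is to probe MinIO's liveness endpoint, `/minio/health/live`, with `curl`. This only shows that the server is reachable at the given address; it says nothing about credentials. In the sketch below, `minio_health_url` and `check_minio` are hypothetical helper names, and the endpoint in the usage comment is the one from the issue.

```shell
#!/usr/bin/env bash
# minio_health_url (hypothetical helper): build the liveness URL for a
# MinIO endpoint, dropping one trailing slash if present.
minio_health_url() {
  printf '%s/minio/health/live\n' "${1%/}"
}

# check_minio (hypothetical helper): exit status 0 if the liveness
# endpoint answers with a success code, non-zero otherwise.
check_minio() {
  curl -fsS -o /dev/null "$(minio_health_url "$1")"
}

# Example use:
#   check_minio http://localhost:9000 && echo "MinIO is reachable"
```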

