Posted to commits@hudi.apache.org by "duc-dn (via GitHub)" <gi...@apache.org> on 2023/03/23 03:11:29 UTC
[GitHub] [hudi] duc-dn opened a new issue, #8273: [SUPPORT] How to connect Hudi cli to MinIO
duc-dn opened a new issue, #8273:
URL: https://github.com/apache/hudi/issues/8273
**Describe the problem you faced**
I set up the Hudi CLI locally, but it cannot connect to MinIO.
**To Reproduce**
Steps to reproduce the behavior:
- I am using Spark 3.1.1. I cloned the latest Hudi version from GitHub and ran `mvn clean package -DskipTests -Dspark3.1 -Dscala-2.12`.
- Then I set up the environment variables in .zshrc:
```
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin
export SPARK_HOME=/home/****/Documents/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
export AWS_ENDPOINT=http://localhost:9000
export AWS_ACCESSS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
export CLIENT_JAR=/home/*****/Desktop/client_jar/aws-java-sdk-bundle-1.11.271.jar:/home/*****/Desktop/client_jar/hadoop-aws-3.1.1.jar
```
- I run the Hudi CLI and execute the command to connect to the bucket on MinIO: `connect --path s3a://datalake/data`, but the connection fails.
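One thing worth checking against the exports above is the variable names themselves. The AWS SDK for Java v1 (the `aws-java-sdk-bundle` listed in `CLIENT_JAR`) reads credentials from `AWS_ACCESS_KEY_ID` (or `AWS_ACCESS_KEY`) and `AWS_SECRET_KEY` (or `AWS_SECRET_ACCESS_KEY`), whereas the export above is spelled `AWS_ACCESSS_KEY_ID`, with three S's. That may be one contributing factor rather than the whole story. The sketch below is a hypothetical helper (the function name and message format are illustrative, not part of Hudi or the SDK) that flags near-miss variable names in an environment mapping.

```python
import difflib
import os

# Names read by the AWS SDK for Java v1 EnvironmentVariableCredentialsProvider.
ACCESS_KEY_NAMES = ("AWS_ACCESS_KEY_ID", "AWS_ACCESS_KEY")
SECRET_KEY_NAMES = ("AWS_SECRET_KEY", "AWS_SECRET_ACCESS_KEY")


def check_aws_env(env):
    """Hypothetical helper: return a list of human-readable problems.

    `env` is any mapping of variable names to values, e.g. os.environ.
    """
    problems = []
    for label, names in (("access key", ACCESS_KEY_NAMES),
                         ("secret key", SECRET_KEY_NAMES)):
        if any(env.get(name) for name in names):
            continue  # at least one recognized name is set and non-empty
        # Look for variables that are almost, but not exactly, a recognized name.
        near = [key for key in env
                if key not in names
                and difflib.get_close_matches(key, names, n=1, cutoff=0.9)]
        hint = f" (did you mean {names[0]} instead of {near[0]}?)" if near else ""
        problems.append(f"no {label} variable set{hint}")
    return problems


if __name__ == "__main__":
    for problem in check_aws_env(os.environ):
        print(problem)
```

Run from the same shell that starts the CLI, it prints nothing when both a recognized access-key and secret-key variable are set.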
**Environment Description**
* Hudi version : 0.13.0
* Spark version : 3.1.1
* Storage (HDFS/S3/GCS..) : Minio
**Stacktrace**
```
Failed to get instance of org.apache.hadoop.fs.FileSystem
Details of the error have been omitted. You can use the stacktrace command to print the full stacktrace.
hudi->stacktrace
org.apache.hudi.exception.HoodieIOException: Failed to get instance of org.apache.hadoop.fs.FileSystem
at org.apache.hudi.common.fs.FSUtils.getFs(FSUtils.java:112)
at org.apache.hudi.common.table.HoodieTableMetaClient.getFs(HoodieTableMetaClient.java:305)
at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:136)
at org.apache.hudi.common.table.HoodieTableMetaClient.newMetaClient(HoodieTableMetaClient.java:689)
at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:81)
at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:770)
at org.apache.hudi.cli.HoodieCLI.refreshTableMetadata(HoodieCLI.java:89)
at org.apache.hudi.cli.HoodieCLI.connectTo(HoodieCLI.java:95)
at org.apache.hudi.cli.commands.TableCommand.connect(TableCommand.java:86)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.shell.command.invocation.InvocableShellMethod.doInvoke(InvocableShellMethod.java:306)
at org.springframework.shell.command.invocation.InvocableShellMethod.invoke(InvocableShellMethod.java:232)
at org.springframework.shell.command.CommandExecution$DefaultCommandExecution.evaluate(CommandExecution.java:158)
at org.springframework.shell.Shell.evaluate(Shell.java:208)
at org.springframework.shell.Shell.run(Shell.java:140)
at org.springframework.shell.jline.InteractiveShellRunner.run(InteractiveShellRunner.java:73)
at org.springframework.shell.DefaultShellApplicationRunner.run(DefaultShellApplicationRunner.java:65)
at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:762)
at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:752)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:315)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1306)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1295)
at org.apache.hudi.cli.Main.main(Main.java:34)
Caused by: java.io.InterruptedIOException: doesBucketExist on datalake: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Failed to connect to service endpoint:
at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:141)
at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:341)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:280)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3247)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3296)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3264)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:475)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
at org.apache.hudi.common.fs.FSUtils.getFs(FSUtils.java:110)
... 25 more
Caused by: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Failed to connect to service endpoint:
at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:151)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1257)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:833)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:783)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5437)
at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:6408)
at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:6381)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5422)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5384)
at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1445)
at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1381)
at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:329)
... 33 more
Caused by: com.amazonaws.SdkClientException: Failed to connect to service endpoint:
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100)
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:70)
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:75)
at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsEndpoint(InstanceMetadataServiceCredentialsFetcher.java:58)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsResponse(InstanceMetadataServiceCredentialsFetcher.java:46)
at com.amazonaws.auth.BaseCredentialsFetcher.fetchCredentials(BaseCredentialsFetcher.java:112)
at com.amazonaws.auth.BaseCredentialsFetcher.getCredentials(BaseCredentialsFetcher.java:68)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:165)
at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:129)
... 50 more
Caused by: java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1228)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1207)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
at com.amazonaws.internal.ConnectionUtils.connectToEndpoint(ConnectionUtils.java:52)
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:80)
... 59 more
```
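Reading the trace from the bottom up helps here. The innermost cause is a `SocketTimeoutException` raised while `InstanceProfileCredentialsProvider` was contacting the EC2 instance metadata service, which suggests that the earlier providers in the list (`BasicAWSCredentialsProvider`, `EnvironmentVariableCredentialsProvider`) found no credentials and the chain fell through to the last one. As a small illustration, the following sketch (a hypothetical helper, not part of Hudi) pulls the exception classes out of a pasted Java stack trace, outermost first.

```python
import re

# Matches the first line of a trace ("pkg.SomeException: msg") and every
# "Caused by: pkg.SomeException: msg" line; "at ..." frames are skipped.
_EXC = re.compile(
    r"^(?:Caused by: )?((?:[\w$]+\.)+[\w$]*(?:Exception|Error))\b",
    re.MULTILINE,
)


def cause_chain(trace):
    """Return exception class names from a Java stack trace, outermost first."""
    return _EXC.findall(trace)
```

Applied to the trace above, the last element of the returned list is the root cause to look at first.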
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] danny0405 commented on issue #8273: [SUPPORT] How to connect Hudi cli to MinIO
Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8273:
URL: https://github.com/apache/hudi/issues/8273#issuecomment-1482218848
Yeah, maybe the AWS folks can help here. @umehrot2, do you have any idea how these AWS env variables can be handed over to the Hudi CLI correctly?
[GitHub] [hudi] duc-dn commented on issue #8273: [SUPPORT] How to connect Hudi cli to MinIO
Posted by "duc-dn (via GitHub)" <gi...@apache.org>.
duc-dn commented on issue #8273:
URL: https://github.com/apache/hudi/issues/8273#issuecomment-1486611749
I resolved my problem. Here is the link: https://github.com/duc-dn/hudi-cli-with-minio
[GitHub] [hudi] duc-dn commented on issue #8273: [SUPPORT] How to connect Hudi cli to MinIO
Posted by "duc-dn (via GitHub)" <gi...@apache.org>.
duc-dn commented on issue #8273:
URL: https://github.com/apache/hudi/issues/8273#issuecomment-1480830698
@nsivabalan can you help me?
[GitHub] [hudi] duc-dn closed issue #8273: [SUPPORT] How to connect Hudi cli to MinIO
Posted by "duc-dn (via GitHub)" <gi...@apache.org>.
duc-dn closed issue #8273: [SUPPORT] How to connect Hudi cli to MinIO
URL: https://github.com/apache/hudi/issues/8273
[GitHub] [hudi] duc-dn commented on issue #8273: [SUPPORT] How to connect Hudi cli to MinIO
Posted by "duc-dn (via GitHub)" <gi...@apache.org>.
duc-dn commented on issue #8273:
URL: https://github.com/apache/hudi/issues/8273#issuecomment-1482204522
Hi @danny0405
- Yes, I configured the MinIO env correctly, but it still doesn't connect. I found that the Hudi CLI doesn't pick up these environment variables:
```
export AWS_ENDPOINT=http://localhost:9000
export AWS_ACCESSS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
```
- I can access MinIO from Spark.
Now I want to use Hudi to show and list the commits in my Hudi table.
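For context on why the exports above may not be enough: the S3A connector in `hadoop-aws` takes its endpoint from the Hadoop property `fs.s3a.endpoint`, not from an `AWS_ENDPOINT` environment variable, and it can also take credentials from `fs.s3a.access.key` and `fs.s3a.secret.key`. A minimal sketch, assuming the Hudi CLI loads a `core-site.xml` from its Hadoop configuration directory (an assumption about the local setup, and not necessarily how this issue was eventually resolved), generates such a file from the values quoted in this thread. The function name is illustrative.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def build_core_site(endpoint, access_key, secret_key):
    """Return core-site.xml text with S3A settings for a MinIO-style endpoint."""
    props = {
        "fs.s3a.endpoint": endpoint,
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
        # MinIO is usually addressed as host/bucket rather than bucket.host.
        "fs.s3a.path.style.access": "true",
        # Enable TLS only when the endpoint URL uses https.
        "fs.s3a.connection.ssl.enabled": str(endpoint.startswith("https://")).lower(),
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    raw = ET.tostring(root, encoding="unicode")
    # Pretty-print so the file is easy to review by hand.
    return minidom.parseString(raw).toprettyxml(indent="  ")


if __name__ == "__main__":
    # Values taken from the exports quoted in this thread.
    print(build_core_site("http://localhost:9000", "minioadmin", "minioadmin"))
```

Where the resulting file has to live for the CLI to see it depends on the installation; pointing `HADOOP_CONF_DIR` at its directory is one common convention, but treat that as something to verify locally.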
[GitHub] [hudi] duc-dn commented on issue #8273: [SUPPORT] How to connect Hudi cli to MinIO
Posted by "duc-dn (via GitHub)" <gi...@apache.org>.
duc-dn commented on issue #8273:
URL: https://github.com/apache/hudi/issues/8273#issuecomment-1483723081
@umehrot2 can you help me, please?
[GitHub] [hudi] danny0405 commented on issue #8273: [SUPPORT] How to connect Hudi cli to MinIO
Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8273:
URL: https://github.com/apache/hudi/issues/8273#issuecomment-1482158363
> Caused by: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Failed to connect to service endpoint:
Did you configure the MinIO env correctly? Have you successfully accessed the MinIO-backed fs in any other way?