Posted to commits@hudi.apache.org by "Khushbukela (via GitHub)" <gi...@apache.org> on 2023/03/15 07:47:17 UTC

[GitHub] [hudi] Khushbukela opened a new issue, #8191: Unable to execute HTTP request | connection timeout issues

Khushbukela opened a new issue, #8191:
URL: https://github.com/apache/hudi/issues/8191

   **Describe the problem you faced**
   We are using Hudi in a Spark streaming job. Jobs are failing due to an HTTP connection timeout.
   
   Hudi version: 0.12
   Table type: COW
   Ingestion mode: INSERT
   
   - The above problem occurs with Hudi 0.12 when the metadata table is enabled (`hoodie.metadata.enable=true`).
   - With the metadata table disabled, we do not see this connection-limit issue.
   
   - We are running multiple streaming queries/jobs in one Spark job. Setting the connection limit to 1000 sometimes helps and sometimes does not, and when it does not the job gets killed (see the sketch after this list):
   _spark.hadoop.fs.s3a.connection.maximum: "1000"_
   
   - Disabling the metadata table helps, but:
     - the time spent on parallel listing of paths is comparatively high (from ~2 s to ~1 min);
     - **Question**: will this parallel-listing time keep increasing as the data (size/number of files) grows?
     
    
   - We also tried Hudi 0.13:
      - 0.13 does not hit the connection-limit issue, but our LTS version is 0.12.2.
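
   For reference, here is a minimal Spark (Scala) sketch of the two workarounds discussed above: raising the S3A connection pool limit and disabling the metadata table on the writer. This is illustrative only; the table name, record key, and S3 path are assumptions, and the numeric values are the ones from this thread, not recommendations.

   ```scala
   import org.apache.spark.sql.SparkSession

   // Workaround 1: raise the S3A connection pool limit for the whole application
   // (per the report above, this sometimes helps and sometimes does not).
   val spark = SparkSession.builder()
     .appName("hudi-s3a-pool-sketch")
     .config("spark.hadoop.fs.s3a.connection.maximum", "1000")
     .getOrCreate()

   // Workaround 2: disable the metadata table on the writer, trading
   // connection-pool pressure for slower parallel file listing
   // (~2 s -> ~1 min in this report).
   val df = spark.range(10).selectExpr("id", "current_timestamp() as ts") // toy data
   df.write.format("hudi")
     .option("hoodie.table.name", "demo_tbl")                   // illustrative name
     .option("hoodie.datasource.write.recordkey.field", "id")
     .option("hoodie.datasource.write.precombine.field", "ts")
     .option("hoodie.datasource.write.operation", "insert")
     .option("hoodie.metadata.enable", "false")
     .mode("append")
     .save("s3a://your-bucket/tables/demo_tbl")                 // hypothetical path
   ```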
    
    
   Thanks for reading the issue. We need help determining whether there are other solutions to try, or which behavior is more recommended for production.
    
    
   
   
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Start a Spark streaming job writing a COW table with the metadata table enabled, running multiple streaming queries (50+) in the same application (see the sketch below).
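
   For concreteness, a minimal sketch of the reproduction shape under stated assumptions (the `rate` source, table names, and S3 paths are stand-ins, not the actual job):

   ```scala
   import org.apache.spark.sql.SparkSession
   import org.apache.spark.sql.streaming.Trigger

   val spark = SparkSession.builder().appName("hudi-multi-stream-repro").getOrCreate()

   // 50+ independent streaming queries writing COW tables from one application,
   // with the Hudi metadata table enabled, as described in the report.
   (1 to 50).foreach { i =>
     spark.readStream
       .format("rate")                                     // stand-in source
       .load()
       .writeStream
       .format("hudi")
       .option("hoodie.table.name", s"tbl_$i")             // illustrative name
       .option("hoodie.datasource.write.table.type", "COPY_ON_WRITE")
       .option("hoodie.datasource.write.operation", "insert")
       .option("hoodie.datasource.write.recordkey.field", "value")
       .option("hoodie.datasource.write.precombine.field", "timestamp")
       .option("hoodie.metadata.enable", "true")           // issue reproduces with this on
       .option("checkpointLocation", s"s3a://your-bucket/chk/tbl_$i") // hypothetical
       .trigger(Trigger.ProcessingTime("1 minute"))
       .start(s"s3a://your-bucket/tables/tbl_$i")          // hypothetical
   }
   spark.streams.awaitAnyTermination()
   ```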
   
   **Expected behavior**
   There appears to be a connection-leak issue in the metadata table with 0.12.2; the expected behavior is that jobs run without exhausting the S3 connection pool.
   
   **Environment Description**
   
   * Hudi version : 0.12.2 / 0.13
   
   * Spark version : 3.3.0
   
   * Hive version :
   
   * Hadoop version :
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : no
   
   
   
   **Stacktrace**
   
   ```
   Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
   
   2023-03-14 07:04:14  WARN o.a.s.storag.BlockManager.logWarning (Logging.scala:73) [task 0.3 in stage 1115.0 (TID 2558)]: Putting block rdd_2569_0 failed due to exception org.apache.hudi.exception.HoodieException: Exception when reading log file .
   2023-03-14 07:04:14  WARN o.a.s.storag.BlockManager.logWarning (Logging.scala:73) [task 0.3 in stage 1115.0 (TID 2558)]: Block rdd_2569_0 could not be removed as it was not found on disk or in memory
   2023-03-14 07:04:14 ERROR o.a.s.execut.Executor.logError (Logging.scala:98) [task 0.3 in stage 1115.0 (TID 2558)]: Exception in task 0.3 in stage 1115.0 (TID 2558)
   org.apache.hudi.exception.HoodieException: Exception when reading log file 
   
   
   
   Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1216) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1162) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5453) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5400) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1372) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$4(S3AFileSystem.java:1289) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:285) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1286) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2223) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2203) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2142) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:715) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:195) ~[__app__.jar:?]
   	at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:475) ~[__app__.jar:?]
   	at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114) ~[__app__.jar:?]
   	at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110) ~[__app__.jar:?]
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223) ~[__app__.jar:?]
   	... 29 more
   Caused by: com.amazonaws.thirdparty.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
   	at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:316) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:282) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at sun.reflect.GeneratedMethodAccessor257.invoke(Unknown Source) ~[?:?]
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_362]
   	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
   	at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.conn.$Proxy51.get(Unknown Source) ~[?:?]
   	at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1343) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1154) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5453) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5400) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1372) ~[aws-java-sdk-bundle-1.12.170.jar:?]
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$4(S3AFileSystem.java:1289) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:285) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1286) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2223) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2203) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2142) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:715) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
   	at org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:195) ~[__app__.jar:?]
   	at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:475) ~[__app__.jar:?]
   	at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114) ~[__app__.jar:?]
   	at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110) ~[__app__.jar:?]
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223) ~[__app__.jar:?]
   	... 29 more
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on issue #8191: [BUG] Unable to execute HTTP request | connection timeout issues

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1627995591

   Thanks, but for 0.14.0 we made many improvements to the MDT (metadata table); let's see whether the issue is resolved there.




[GitHub] [hudi] kandyrise commented on issue #8191: [BUG] Unable to execute HTTP request | connection timeout issues

Posted by "kandyrise (via GitHub)" <gi...@apache.org>.
kandyrise commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1717826763

   It is not being set. Let me try that.




[GitHub] [hudi] ad1happy2go commented on issue #8191: [BUG] Unable to execute HTTP request | connection timeout issues

Posted by "ad1happy2go (via GitHub)" <gi...@apache.org>.
ad1happy2go commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1723327547

   @kandyrise Were you able to fix it after setting that config?




[GitHub] [hudi] danny0405 commented on issue #8191: [BUG] Unable to execute HTTP request | connection timeout issues

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1659852025

   Did you use Spark or Flink?




[GitHub] [hudi] nsivabalan commented on issue #8191: Unable to execute HTTP request | connection timeout issues

Posted by "nsivabalan (via GitHub)" <gi...@apache.org>.
nsivabalan commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1472404160

   I see you are having the issue with 0.12.0. Did you test with 0.12.2? We fixed some connection leaks in both 0.12.2 and 0.13.0, and we are planning 0.12.3 shortly. So if you can confirm that 0.12.2 is already in good shape, we should be good; if not, we will have to ensure 0.12.3 has the fix that is already in 0.13.0.
   




[GitHub] [hudi] kandyrise commented on issue #8191: [BUG] Unable to execute HTTP request | connection timeout issues

Posted by "kandyrise (via GitHub)" <gi...@apache.org>.
kandyrise commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1716941236

   While reading one of the Hudi tables, we started observing the following warning and error:
   "... invalid or extra rollback command block in s3://...'
   
   After the warning, the repeated error appears as:
   "SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool"
   
   The full log is attached.
   
   The error arises every time the job runs. We are looking for the root cause and a possible solution.
   
   [1.ReadTimeout.To.HudiSupport.log](https://github.com/apache/hudi/files/12593508/1.ReadTimeout.To.HudiSupport.log)
   




[GitHub] [hudi] ad1happy2go commented on issue #8191: [BUG] Unable to execute HTTP request | connection timeout issues

Posted by "ad1happy2go (via GitHub)" <gi...@apache.org>.
ad1happy2go commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1569885591

   @Khushbukela Sorry for the delay on this ticket.
   
   Can you share how many files are in the table? Can you also paste the write and table configs you are using?




[GitHub] [hudi] jlloh commented on issue #8191: [BUG] Unable to execute HTTP request | connection timeout issues

Posted by "jlloh (via GitHub)" <gi...@apache.org>.
jlloh commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1627759070

   Seeing something similar on Flink 1.16 with Hudi 0.13.1: COW insert with metadata enabled. The problem seems to occur ~4 hours after the job has been running. The job uses inline clustering. After disabling metadata, the job is able to proceed.
   
   Configurations:
   ```
       "table.table": "COPY_ON_WRITE"
       "write.operation": "insert"
       "write.insert.cluster": "true"
       "hoodie.datasource.write.hive_style_partitioning": "true"
       "metadata.enabled": "true"
       "hoodie.datasource.write.hive_style_partitioning": "true"
       "hoodie.parquet.max.file.size": "104857600"
       "hoodie.parquet.small.file.limit": "20971520"
       "clustering.plan.strategy.small.file.limit": "100"
   ```
   Files:
   ~211 parquet files per partition across 4 hourly partitions when the issue started happening and the job failed to continue. The bucket assigner task is the one that hits this error. I have tried both hourly and daily partitions, but both jobs eventually fail and cannot recover with metadata enabled.
   
   Full stacktrace:
   ```
   org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve files in partition s3a://<bucket_name>/folder_name/local_year=2023/local_month=07/local_day=08 from metadata
   	at org.apache.hudi.metadata.BaseTableMetadata.getAllFilesInPartition(BaseTableMetadata.java:152)
   	at org.apache.hudi.metadata.HoodieMetadataFileSystemView.listPartition(HoodieMetadataFileSystemView.java:69)
   	at org.apache.hudi.common.table.view.AbstractTableFileSystemView.lambda$ensurePartitionLoadedCorrectly$16(AbstractTableFileSystemView.java:432)
   	at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
   	at org.apache.hudi.common.table.view.AbstractTableFileSystemView.ensurePartitionLoadedCorrectly(AbstractTableFileSystemView.java:423)
   	at org.apache.hudi.common.table.view.AbstractTableFileSystemView.getLatestBaseFilesBeforeOrOn(AbstractTableFileSystemView.java:660)
   	at org.apache.hudi.common.table.view.PriorityBasedFileSystemView.execute(PriorityBasedFileSystemView.java:104)
   	at org.apache.hudi.common.table.view.PriorityBasedFileSystemView.getLatestBaseFilesBeforeOrOn(PriorityBasedFileSystemView.java:145)
   	at org.apache.hudi.sink.partitioner.profile.WriteProfile.smallFilesProfile(WriteProfile.java:208)
   	at org.apache.hudi.sink.partitioner.profile.WriteProfile.getSmallFiles(WriteProfile.java:191)
   	at org.apache.hudi.sink.partitioner.BucketAssigner.getSmallFileAssign(BucketAssigner.java:179)
   	at org.apache.hudi.sink.partitioner.BucketAssigner.addInsert(BucketAssigner.java:137)
   	at org.apache.hudi.sink.partitioner.BucketAssignFunction.getNewRecordLocation(BucketAssignFunction.java:215)
   	at org.apache.hudi.sink.partitioner.BucketAssignFunction.processRecord(BucketAssignFunction.java:200)
   	at org.apache.hudi.sink.partitioner.BucketAssignFunction.processElement(BucketAssignFunction.java:162)
   	at org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:83)
   	at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:233)
   	at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
   	at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
   	at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
   	at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:542)
   	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
   	at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:831)
   	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:780)
   	at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
   	at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:914)
   	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
   	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: org.apache.hudi.exception.HoodieException: Exception when reading log file 
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternalV1(AbstractHoodieLogRecordReader.java:374)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:198)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:114)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
   	at org.apache.hudi.metadata.HoodieMetadataLogRecordReader$Builder.build(HoodieMetadataLogRecordReader.java:218)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:546)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:447)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.getOrCreateReaders(HoodieBackedTableMetadata.java:432)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeys$3(HoodieBackedTableMetadata.java:239)
   	at java.util.HashMap.forEach(HashMap.java:1290)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:237)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:152)
   	at org.apache.hudi.metadata.BaseTableMetadata.fetchAllFilesInPartition(BaseTableMetadata.java:339)
   	at org.apache.hudi.metadata.BaseTableMetadata.getAllFilesInPartition(BaseTableMetadata.java:150)
   	... 28 more
   Caused by: org.apache.hudi.exception.HoodieIOException: unable to initialize read with log file 
   	at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:113)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternalV1(AbstractHoodieLogRecordReader.java:247)
   	... 43 more
   Caused by: java.io.InterruptedIOException: getFileStatus on s3a://<redacted>/.hoodie/metadata/files/.files-0000_00000000000000.log.2_0-1-0: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
   	at org.apache.hadoop.fs.s3a.S3AUtils.translateInterruptedException(S3AUtils.java:352)
   	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:177)
   	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:151)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2278)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2226)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2160)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:727)
   	at org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:203)
   	at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:498)
   	at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:118)
   	at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110)
   	... 44 more
   Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1216)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1162)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695)
   	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
   	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539)
   	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5445)
   	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5392)
   	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1368)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$4(S3AFileSystem.java:1307)
   	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322)
   	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:285)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1304)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2264)
   	... 51 more
   Caused by: org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
   	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:316)
   	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:282)
   	at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
   	at com.amazonaws.http.conn.$Proxy56.get(Unknown Source)
   	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190)
   	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
   	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
   	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
   	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
   	at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1343)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1154)
   	... 66 more
   ```
   




[GitHub] [hudi] kandyrise commented on issue #8191: [BUG] Unable to execute HTTP request | connection timeout issues

Posted by "kandyrise (via GitHub)" <gi...@apache.org>.
kandyrise commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1724055581

   Yes, setting the config `fs.s3a.connection.maximum` fixed the error. Thanks!




[GitHub] [hudi] gauravkumar37 commented on issue #8191: Unable to execute HTTP request | connection timeout issues

Posted by "gauravkumar37 (via GitHub)" <gi...@apache.org>.
gauravkumar37 commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1472405998

   We tried 0.12.2 but it didn't help. However, 0.13 is good; we are not seeing such problems with it.




[GitHub] [hudi] zbbkeepgoing commented on issue #8191: [BUG] Unable to execute HTTP request | connection timeout issues

Posted by "zbbkeepgoing (via GitHub)" <gi...@apache.org>.
zbbkeepgoing commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1659855211

   > Did you use Spark or Flink ?
   
   Spark 3.3




[GitHub] [hudi] zbbkeepgoing commented on issue #8191: [BUG] Unable to execute HTTP request | connection timeout issues

Posted by "zbbkeepgoing (via GitHub)" <gi...@apache.org>.
zbbkeepgoing commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1659725409

   0.13.1 meets this issue too: a COW table with about 7 million rows, which we cluster with a target parquet size of 1024 MB.
   
   If we run a performance query test, some queries are blocked by `com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool`:
   
   ```
   23/08/01 06:28:11 INFO AmazonHttpClient: Unable to execute HTTP request: Timeout waiting for connection from pool
   org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
           at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226)
           at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195)
           at jdk.internal.reflect.GeneratedMethodAccessor151.invoke(Unknown Source)
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
           at java.base/java.lang.reflect.Method.invoke(Unknown Source)
           at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
           at com.amazonaws.http.conn.$Proxy42.getConnection(Unknown Source)
           at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:423)
           at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
           at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
           at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
           at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
           at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
           at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
           at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
           at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1050)
           at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
           at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:904)
           at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1553)
           at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.lambda$getFileStatus$17(HoodieWrapperFileSystem.java:410)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.executeFuncWithTimeMetrics(HoodieWrapperFileSystem.java:114)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.getFileStatus(HoodieWrapperFileSystem.java:404)
           at org.apache.hudi.exception.TableNotFoundException.checkTableValidity(TableNotFoundException.java:51)
           at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:137)
           at org.apache.hudi.common.table.HoodieTableMetaClient.newMetaClient(HoodieTableMetaClient.java:689)
           at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:81)
           at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:770)
           at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.<init>(AbstractHoodieLogRecordReader.java:165)
           at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:101)
           at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
           at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
           at org.apache.hudi.metadata.HoodieMetadataLogRecordReader$Builder.build(HoodieMetadataLogRecordReader.java:218)
           at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:546)
           at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:447)
           at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeyPrefixes$7539c171$1(HoodieBackedTableMetadata.java:193)
           at org.apache.hudi.common.function.FunctionWrapper.lambda$throwingMapWrapper$0(FunctionWrapper.java:38)
           at org.apache.hudi.common.data.HoodieListData.lambda$flatMap$0(HoodieListData.java:124)
           at java.base/java.util.stream.ReferencePipeline$7$1.accept(Unknown Source)
           at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
           at java.base/java.util.stream.AbstractPipeline.copyInto(Unknown Source)
           at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
           at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
           at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
           at java.base/java.util.stream.AbstractTask.compute(Unknown Source)
           at java.base/java.util.concurrent.CountedCompleter.exec(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
   23/08/01 06:28:11 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from s3a://public-xuanwu-assets/porsche/vehicle_bloom_mdt_clustering/.hoodie/metadata
   23/08/01 06:28:11 INFO AmazonHttpClient: Unable to execute HTTP request: Timeout waiting for connection from pool
   org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
           at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226)
           at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195)
           at jdk.internal.reflect.GeneratedMethodAccessor151.invoke(Unknown Source)
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
           at java.base/java.lang.reflect.Method.invoke(Unknown Source)
           at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
           at com.amazonaws.http.conn.$Proxy42.getConnection(Unknown Source)
           at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:423)
           at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
           at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
           at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
           at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
           at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
           at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
           at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
           at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1050)
           at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
           at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:904)
           at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1553)
           at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.lambda$getFileStatus$17(HoodieWrapperFileSystem.java:410)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.executeFuncWithTimeMetrics(HoodieWrapperFileSystem.java:114)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.getFileStatus(HoodieWrapperFileSystem.java:404)
           at org.apache.hudi.exception.TableNotFoundException.checkTableValidity(TableNotFoundException.java:51)
           at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:137)
           at org.apache.hudi.common.table.HoodieTableMetaClient.newMetaClient(HoodieTableMetaClient.java:689)
           at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:81)
           at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:770)
           at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.<init>(AbstractHoodieLogRecordReader.java:165)
           at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:101)
           at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
           at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
           at org.apache.hudi.metadata.HoodieMetadataLogRecordReader$Builder.build(HoodieMetadataLogRecordReader.java:218)
           at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:546)
           at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:447)
           at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeyPrefixes$7539c171$1(HoodieBackedTableMetadata.java:193)
           at org.apache.hudi.common.function.FunctionWrapper.lambda$throwingMapWrapper$0(FunctionWrapper.java:38)
           at org.apache.hudi.common.data.HoodieListData.lambda$flatMap$0(HoodieListData.java:124)
           at java.base/java.util.stream.ReferencePipeline$7$1.accept(Unknown Source)
           at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
           at java.base/java.util.stream.AbstractPipeline.copyInto(Unknown Source)
           at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
           at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
           at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
           at java.base/java.util.stream.AbstractTask.compute(Unknown Source)
           at java.base/java.util.concurrent.CountedCompleter.exec(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
   23/08/01 06:28:11 INFO AmazonHttpClient: Unable to execute HTTP request: Timeout waiting for connection from pool
   org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
           at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226)
           at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195)
           at jdk.internal.reflect.GeneratedMethodAccessor151.invoke(Unknown Source)
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
           at java.base/java.lang.reflect.Method.invoke(Unknown Source)
           at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
           at com.amazonaws.http.conn.$Proxy42.getConnection(Unknown Source)
           at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:423)
           at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
           at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
           at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
           at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
           at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
           at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
           at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
           at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1050)
           at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
           at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:904)
           at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1553)
           at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.lambda$getFileStatus$17(HoodieWrapperFileSystem.java:410)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.executeFuncWithTimeMetrics(HoodieWrapperFileSystem.java:114)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.getFileStatus(HoodieWrapperFileSystem.java:404)
           at org.apache.hudi.exception.TableNotFoundException.checkTableValidity(TableNotFoundException.java:51)
           at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:137)
           at org.apache.hudi.common.table.HoodieTableMetaClient.newMetaClient(HoodieTableMetaClient.java:689)
           at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:81)
           at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:770)
           at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.<init>(AbstractHoodieLogRecordReader.java:165)
           at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:101)
           at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
           at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
           at org.apache.hudi.metadata.HoodieMetadataLogRecordReader$Builder.build(HoodieMetadataLogRecordReader.java:218)
           at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:546)
           at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:447)
           at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeyPrefixes$7539c171$1(HoodieBackedTableMetadata.java:193)
           at org.apache.hudi.common.function.FunctionWrapper.lambda$throwingMapWrapper$0(FunctionWrapper.java:38)
           at org.apache.hudi.common.data.HoodieListData.lambda$flatMap$0(HoodieListData.java:124)
           at java.base/java.util.stream.ReferencePipeline$7$1.accept(Unknown Source)
           at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
           at java.base/java.util.stream.AbstractPipeline.copyInto(Unknown Source)
           at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
           at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
           at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
           at java.base/java.util.stream.AbstractTask.compute(Unknown Source)
           at java.base/java.util.concurrent.CountedCompleter.exec(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
   23/08/01 06:28:11 INFO AmazonHttpClient: Unable to execute HTTP request: Timeout waiting for connection from pool
   org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
           at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226)
           at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195)
           at jdk.internal.reflect.GeneratedMethodAccessor151.invoke(Unknown Source)
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
           at java.base/java.lang.reflect.Method.invoke(Unknown Source)
           at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
           at com.amazonaws.http.conn.$Proxy42.getConnection(Unknown Source)
           at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:423)
           at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
           at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
           at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
           at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
           at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
           at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
           at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
           at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1050)
           at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
           at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:904)
           at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1553)
           at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.lambda$getFileStatus$17(HoodieWrapperFileSystem.java:410)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.executeFuncWithTimeMetrics(HoodieWrapperFileSystem.java:114)
           at org.apache.hudi.common.fs.HoodieWrapperFileSystem.getFileStatus(HoodieWrapperFileSystem.java:404)
           at org.apache.hudi.exception.TableNotFoundException.checkTableValidity(TableNotFoundException.java:51)
           at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:137)
           at org.apache.hudi.common.table.HoodieTableMetaClient.newMetaClient(HoodieTableMetaClient.java:689)
           at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:81)
           at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:770)
           at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.<init>(AbstractHoodieLogRecordReader.java:165)
           at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:101)
           at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
           at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
           at org.apache.hudi.metadata.HoodieMetadataLogRecordReader$Builder.build(HoodieMetadataLogRecordReader.java:218)
           at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:546)
           at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:447)
           at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeyPrefixes$7539c171$1(HoodieBackedTableMetadata.java:193)
           at org.apache.hudi.common.function.FunctionWrapper.lambda$throwingMapWrapper$0(FunctionWrapper.java:38)
           at org.apache.hudi.common.data.HoodieListData.lambda$flatMap$0(HoodieListData.java:124)
           at java.base/java.util.stream.ReferencePipeline$7$1.accept(Unknown Source)
           at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
           at java.base/java.util.stream.AbstractPipeline.copyInto(Unknown Source)
           at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
           at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
           at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
           at java.base/java.util.stream.AbstractTask.compute(Unknown Source)
           at java.base/java.util.concurrent.CountedCompleter.exec(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
           at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
   ```




[GitHub] [hudi] nsivabalan commented on issue #8191: Unable to execute HTTP request | connection timeout issues

Posted by "nsivabalan (via GitHub)" <gi...@apache.org>.
nsivabalan commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1474218516

   Got it, thanks. We will watch out for the fixes around connection leaks that went into 0.13.0 and pick those into 0.12.3.
   Thanks for confirming.




[GitHub] [hudi] Khushbukela commented on issue #8191: Unable to execute HTTP request | connection timeout issues

Posted by "Khushbukela (via GitHub)" <gi...@apache.org>.
Khushbukela commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1477365707

   Hi @nsivabalan, we have started seeing similar behaviour with 0.13 as well.
   We are running 25-50 streaming queries per job.
   Let me know if you need any other information that can help debug this issue.




[GitHub] [hudi] ad1happy2go commented on issue #8191: [BUG] Unable to execute HTTP request | connection timeout issues

Posted by "ad1happy2go (via GitHub)" <gi...@apache.org>.
ad1happy2go commented on issue #8191:
URL: https://github.com/apache/hudi/issues/8191#issuecomment-1717607976

   @kandyrise Are you setting `fs.s3a.connection.maximum`?

