Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/29 05:58:19 UTC
[GitHub] [hudi] gubinjie opened a new issue, #6825: [SUPPORT]org.apache.hudi.exception.HoodieRemoteException: *****:37568 failed to respond
gubinjie opened a new issue, #6825:
URL: https://github.com/apache/hudi/issues/6825
Hudi: 0.10.1
Flink: 1.13.3
When using Flink SQL to write to Hudi, Hudi throws an exception and the task exits abnormally. I don't know what causes this. Is there a solution?
`2022-09-29 13:38:51,316 INFO org.apache.hudi.table.action.clean.CleanPlanner [] - Incremental Cleaning mode is enabled. Looking up partition-paths that have since changed since last cleaned at 20220929130332153. New Instant to retain : Option{val=[20220929130632162__commit__COMPLETED]}
2022-09-29 13:38:51,320 INFO org.apache.hudi.client.HoodieFlinkWriteClient [] - Cleaner has been spawned already. Waiting for it to finish
2022-09-29 13:38:51,320 INFO org.apache.hudi.client.AsyncCleanerService [] - Waiting for async cleaner to finish
2022-09-29 13:38:51,320 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Loaded instants upto : Option{val=[20220929133815434__rollback__COMPLETED]}
2022-09-29 13:38:51,322 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Loaded instants upto : Option{val=[20220929133815434__rollback__COMPLETED]}
2022-09-29 13:38:51,322 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - No compaction plan for checkpoint 206
2022-09-29 13:38:51,322 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Loaded instants upto : Option{val=[20220929133815434__rollback__COMPLETED]}
2022-09-29 13:38:51,322 INFO org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView [] - Sending request : (http://172.16.7.55:37568/v1/hoodie/view/refresh/?basepath=hdfs%3A%2F%2Fpaat-dev%2Fuser%2Fhudi%2Fwarehouse%2Fpaat_ods_hudi.db&lastinstantts=20220929133815434&timelinehash=349d0f08ef10a04fc26f6567771fd2ba51f211bcbfe748b039706ea116397cf2)
2022-09-29 13:38:51,326 INFO org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend [] - Closed RocksDB State Backend. Cleaning up RocksDB working directory /tmp/flink-io-90c278ae-7432-43cf-8ed9-0a2b1e517719/job_e91eef4dfcd5908c56bb1a4978a1f564_op_KeyedProcessOperator_86011fb501be3971273edc8de4ea854a__1_2__uuid_7e0588fb-a70f-47c5-b91e-a04ea88f4de0.
2022-09-29 13:38:51,327 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Loaded instants upto : Option{val=[20220929133815434__rollback__COMPLETED]}
2022-09-29 13:38:51,327 INFO org.apache.hudi.table.action.clean.CleanPlanner [] - Total Partitions to clean : 1, with policy KEEP_LATEST_COMMITS
2022-09-29 13:38:51,327 INFO org.apache.hudi.table.action.clean.CleanPlanner [] - Using cleanerParallelism: 1
2022-09-29 13:38:51,327 INFO org.apache.hudi.table.action.clean.CleanPlanner [] - Cleaning , retaining latest 10 commits.
2022-09-29 13:38:51,327 INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView [] - Building file system view for partition ()
2022-09-29 13:38:51,328 WARN org.apache.flink.runtime.taskmanager.Task [] - bucket_assigner (1/2)#5 (1316413b42cf9d8a6ea5be1da909a000) switched from RUNNING to FAILED with failure cause: org.apache.hudi.exception.HoodieRemoteException: 172.16.7.55:37568 failed to respond
at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.refresh(RemoteHoodieTableFileSystemView.java:420)
at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.sync(RemoteHoodieTableFileSystemView.java:484)
at org.apache.hudi.common.table.view.PriorityBasedFileSystemView.sync(PriorityBasedFileSystemView.java:257)
at org.apache.hudi.sink.partitioner.profile.WriteProfile.reload(WriteProfile.java:252)
at org.apache.hudi.sink.partitioner.BucketAssigner.reload(BucketAssigner.java:211)
at org.apache.hudi.sink.partitioner.BucketAssignFunction.notifyCheckpointComplete(BucketAssignFunction.java:234)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.notifyCheckpointComplete(AbstractUdfStreamOperator.java:130)
at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.notifyCheckpointComplete(StreamOperatorWrapper.java:99)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpointComplete(SubtaskCheckpointCoordinatorImpl.java:334)
at org.apache.flink.streaming.runtime.tasks.StreamTask.notifyCheckpointComplete(StreamTask.java:1171)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointCompleteAsync$10(StreamTask.java:1136)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointOperation$12(StreamTask.java:1159)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:344)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:330)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:202)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684)
at org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:639)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:623)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.http.NoHttpResponseException: 172.16.7.55:37568 failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at org.apache.http.client.fluent.Request.execute(Request.java:151)
at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.executeRequest(RemoteHoodieTableFileSystemView.java:176)
at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.refresh(RemoteHoodieTableFileSystemView.java:418)
... 23 more
2022-09-29 13:38:51,328 INFO org.apache.flink.runtime.taskmanager.Task [] - Freeing task resources for bucket_assigner (1/2)#5 (1316413b42cf9d8a6ea5be1da909a000).
2022-09-29 13:38:51,328 INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Loaded instants upto : Option{val=[20220929133815434__rollback__COMPLETED]}`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] codope commented on issue #6825: [SUPPORT]org.apache.hudi.exception.HoodieRemoteException: *****:37568 failed to respond
Posted by GitBox <gi...@apache.org>.
codope commented on issue #6825:
URL: https://github.com/apache/hudi/issues/6825#issuecomment-1263231014
Yes, from the stack trace it seems the timeline server crashed. To find the root cause, we will have to look into the driver's resource monitors. In the upcoming release, we have added a retry mechanism based on exponential backoff to avoid any throttling issues: https://github.com/apache/hudi/commit/c597eb5d206b528d201a47194e144b8aed732c0e
However, in the meantime, perhaps you could disable the timeline server and the remote filesystem view using
```
hoodie.embed.timeline.server=false
hoodie.filesystem.view.type=MEMORY
```
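For readers unfamiliar with the retry approach mentioned above, here is a rough sketch of exponential backoff in general. It is not the code from the linked commit: the function name `call_with_backoff`, its parameters, and the default values are all hypothetical, and Python is used only for brevity. The idea is to retry a failed request a bounded number of times, doubling the wait between attempts and adding a little random jitter.

```python
import random
import time


def call_with_backoff(request, max_retries=3, initial_delay=0.1, max_delay=2.0,
                      retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call request(); on a retryable error, wait and retry with a doubled delay.

    Hypothetical helper for illustration only, not Hudi's implementation.
    """
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return request()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error to the caller
            # Wait for the current delay (capped at max_delay) plus up to 10% jitter.
            sleep(min(delay, max_delay) * (1 + 0.1 * random.random()))
            delay *= 2
```

With the defaults above, a request that fails twice and then succeeds would be attempted three times, with waits of roughly 0.1 s and 0.2 s in between. A request that keeps failing raises the last error after the fourth attempt.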
[GitHub] [hudi] gubinjie commented on issue #6825: [SUPPORT]org.apache.hudi.exception.HoodieRemoteException: *****:37568 failed to respond
Posted by GitBox <gi...@apache.org>.
gubinjie commented on issue #6825:
URL: https://github.com/apache/hudi/issues/6825#issuecomment-1272269358
TH
[GitHub] [hudi] gubinjie closed issue #6825: [SUPPORT]org.apache.hudi.exception.HoodieRemoteException: *****:37568 failed to respond
Posted by GitBox <gi...@apache.org>.
gubinjie closed issue #6825: [SUPPORT]org.apache.hudi.exception.HoodieRemoteException: *****:37568 failed to respond
URL: https://github.com/apache/hudi/issues/6825
[GitHub] [hudi] nsivabalan commented on issue #6825: [SUPPORT]org.apache.hudi.exception.HoodieRemoteException: *****:37568 failed to respond
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #6825:
URL: https://github.com/apache/hudi/issues/6825#issuecomment-1263120280
I guess the timeline server crashed for some reason.
CC @yihua, any thoughts?