Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/06/10 06:13:18 UTC
[GitHub] [hudi] LinMingQiang opened a new issue, #5832: when task communicates with jm's timeline server throw HoodieRemoteException: IP : port failed to respond
LinMingQiang opened a new issue, #5832:
URL: https://github.com/apache/hudi/issues/5832
A Flink streaming job writes to Hudi. The task runs fine for about an hour after it starts, and communication is normal. After the task has been running for a while, an error is reported when BucketAssignFunction communicates with the JM's timeline server. The error only occurs under certain circumstances.
When the task restarts after the error is reported, the error happens again after the task has run for a while, and this eventually causes the task to fail.
In addition: I have modified the way NetworkUtils gets the IP, as suggested.
```
org.apache.hudi.exception.HoodieRemoteException: 10.18x.xx.xx:34805 failed to respond
at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.refresh(RemoteHoodieTableFileSystemView.java:420)
at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.sync(RemoteHoodieTableFileSystemView.java:484)
at org.apache.hudi.common.table.view.PriorityBasedFileSystemView.sync(PriorityBasedFileSystemView.java:257)
at org.apache.hudi.sink.partitioner.profile.WriteProfile.reload(WriteProfile.java:252)
at org.apache.hudi.sink.partitioner.BucketAssigner.reload(BucketAssigner.java:211)
at org.apache.hudi.sink.partitioner.BucketAssignFunction.notifyCheckpointComplete(BucketAssignFunction.java:234)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.notifyCheckpointComplete(AbstractUdfStreamOperator.java:130)
at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.notifyCheckpointComplete(StreamOperatorWrapper.java:99)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpointComplete(SubtaskCheckpointCoordinatorImpl.java:334)
at org.apache.flink.streaming.runtime.tasks.StreamTask.notifyCheckpointComplete(StreamTask.java:1171)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointCompleteAsync$10(StreamTask.java:1136)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointOperation$12(StreamTask.java:1159)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:344)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:330)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:202)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684)
at org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:639)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:623)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
Caused by: org.apache.http.NoHttpResponseException: 10.18x.xx.xx:34805 failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at org.apache.http.client.fluent.Request.execute(Request.java:151)
at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.executeRequest(RemoteHoodieTableFileSystemView.java:176)
at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.refresh(RemoteHoodieTableFileSystemView.java:418)
... 23 more
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] LinMingQiang commented on issue #5832: when task communicates with jm's timeline server throw HoodieRemoteException: IP : port failed to respond
Posted by GitBox <gi...@apache.org>.
LinMingQiang commented on issue #5832:
URL: https://github.com/apache/hudi/issues/5832#issuecomment-1157180620
I think it's a network problem. When I retry the request, it works.
I think we can add a retry mechanism to `executeRequest`.
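A minimal sketch of what such a retry could look like, assuming the failure surfaces as an `IOException` (`org.apache.http.NoHttpResponseException` in the trace above is a subclass of `IOException`). The class `RetrySketch`, the method `callWithRetry`, and the parameters `maxRetries` and `initialWaitMs` are hypothetical names used only for illustration. They are not Hudi APIs or config keys, and an actual patch may look different.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration only; not part of Hudi. */
final class RetrySketch {

  private RetrySketch() {}

  /**
   * Runs the request and retries it when it fails with an IOException.
   * At most maxRetries retries are made after the first attempt, and the
   * wait between attempts doubles after every failure.
   */
  static <T> T callWithRetry(Callable<T> request, int maxRetries, long initialWaitMs)
      throws Exception {
    long waitMs = initialWaitMs;
    for (int attempt = 0; ; attempt++) {
      try {
        return request.call();
      } catch (IOException e) {
        if (attempt >= maxRetries) {
          // Retries are exhausted: surface the last failure to the caller.
          throw e;
        }
        Thread.sleep(waitMs);
        waitMs *= 2;
      }
    }
  }
}
```

The HTTP call inside `executeRequest` could then be passed in as the `request` argument, so that a transient "failed to respond" is retried a few times before it is reported as a `HoodieRemoteException`. Exceptions other than `IOException` are deliberately not retried in this sketch.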
[GitHub] [hudi] LinMingQiang commented on issue #5832: when task communicates with jm's timeline server throw HoodieRemoteException: IP : port failed to respond
Posted by GitBox <gi...@apache.org>.
LinMingQiang commented on issue #5832:
URL: https://github.com/apache/hudi/issues/5832#issuecomment-1157187205
OK, I will submit a PR as soon as possible
[GitHub] [hudi] danny0405 commented on issue #5832: when task communicates with jm's timeline server throw HoodieRemoteException: IP : port failed to respond
Posted by GitBox <gi...@apache.org>.
danny0405 commented on issue #5832:
URL: https://github.com/apache/hudi/issues/5832#issuecomment-1157186639
> I think it's a network problem. When I retry the request, it works. I think we can add a retry mechanism to `executeRequest`.
That's a good idea. Feel free to file a JIRA issue and send a PR.
[GitHub] [hudi] codope commented on issue #5832: when task communicates with jm's timeline server throw HoodieRemoteException: IP : port failed to respond
Posted by GitBox <gi...@apache.org>.
codope commented on issue #5832:
URL: https://github.com/apache/hudi/issues/5832#issuecomment-1158971410
Closing the issue as we have a patch available
[GitHub] [hudi] LinMingQiang commented on issue #5832: when task communicates with jm's timeline server throw HoodieRemoteException: IP : port failed to respond
Posted by GitBox <gi...@apache.org>.
LinMingQiang commented on issue #5832:
URL: https://github.com/apache/hudi/issues/5832#issuecomment-1157166475
only 10
[GitHub] [hudi] codope closed issue #5832: when task communicates with jm's timeline server throw HoodieRemoteException: IP : port failed to respond
Posted by GitBox <gi...@apache.org>.
codope closed issue #5832: when task communicates with jm's timeline server throw HoodieRemoteException: IP : port failed to respond
URL: https://github.com/apache/hudi/issues/5832
[GitHub] [hudi] minihippo commented on issue #5832: when task communicates with jm's timeline server throw HoodieRemoteException: IP : port failed to respond
Posted by GitBox <gi...@apache.org>.
minihippo commented on issue #5832:
URL: https://github.com/apache/hudi/issues/5832#issuecomment-1156627623
How many partitions, and how many files per partition, does a Flink checkpoint touch? What is the `write.tasks` config set to?
[GitHub] [hudi] yuzhaojing commented on issue #5832: when task communicates with jm's timeline server throw HoodieRemoteException: IP : port failed to respond
Posted by GitBox <gi...@apache.org>.
yuzhaojing commented on issue #5832:
URL: https://github.com/apache/hudi/issues/5832#issuecomment-1155946799
@minihippo Can you follow up on this?
[GitHub] [hudi] danny0405 commented on issue #5832: when task communicates with jm's timeline server throw HoodieRemoteException: IP : port failed to respond
Posted by GitBox <gi...@apache.org>.
danny0405 commented on issue #5832:
URL: https://github.com/apache/hudi/issues/5832#issuecomment-1157178021
Did you use 0.11.0? Let's wait for 0.11.1. In 0.11.0 we introduced many unnecessary fs view refresh/sync calls on the write path; 0.11.1 fixes these regressions.