Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2022/04/18 02:34:00 UTC
[GitHub] [skywalking] yangyiweigege opened a new issue, #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
yangyiweigege opened a new issue, #8895:
URL: https://github.com/apache/skywalking/issues/8895
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar issues.
### Apache SkyWalking Component
OAP server (apache/skywalking)
### What happened
When I use SkyWalking in a test environment, trace data is written to ES with a long delay. Sometimes data reaches ES 30 minutes late.
### What you expected to happen
Traces are written to ES within 20 seconds.
### How to reproduce
Set the ES connectTimeout to 3000 seconds.
### Anything else
An exception in the flush task leaves all the scheduler threads waiting, so flushing stops:
<img width="1036" alt="6592F812-C690-479B-85EA-E36EE03EA8CA" src="https://user-images.githubusercontent.com/23202824/163744919-73362212-180a-4347-9e0b-da9fc5a13842.png">
So I added a try/catch locally, and the exception is:
![39920013-C59B-402B-A14E-D37AE76435D2](https://user-images.githubusercontent.com/23202824/163745218-df761845-cf7a-40f6-98a1-762135b11934.png)
How to deal with it:
1. Add a try/catch in `org.apache.skywalking.library.elasticsearch.bulk.BulkProcessor#flush`.
2. Increase the ES connection timeout and response timeout to avoid the exception.
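For context, here is a minimal sketch of why one uncaught exception can stall a periodic flush. It assumes the flush is driven by a `java.util.concurrent.ScheduledExecutorService`, whose documented behavior is to suppress all later executions once a run throws. `FlushSchedulerDemo` and `runsOfThrowingTask` are made-up names for illustration, not SkyWalking code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FlushSchedulerDemo {
    /**
     * Schedules a "flush" that always throws and returns how many times it ran
     * within roughly 300 ms. With guarded == false the exception escapes the task.
     */
    static int runsOfThrowingTask(boolean guarded) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();

        // Stand-in for the real flush: count the run, then fail.
        Runnable flush = () -> {
            runs.incrementAndGet();
            throw new IllegalStateException("simulated flush failure");
        };

        Runnable task;
        if (guarded) {
            // Catch everything so the scheduler never sees the failure.
            task = () -> {
                try {
                    flush.run();
                } catch (Throwable t) {
                    // A real implementation would log here.
                }
            };
        } else {
            task = flush;
        }

        scheduler.scheduleWithFixedDelay(task, 0, 20, TimeUnit.MILLISECONDS);
        Thread.sleep(300);
        scheduler.shutdownNow();
        return runs.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // The unguarded task runs once and is then never rescheduled.
        System.out.println("unguarded runs: " + runsOfThrowingTask(false));
        // The guarded task keeps running on every tick.
        System.out.println("guarded runs:   " + runsOfThrowingTask(true));
    }
}
```

The scheduler does not log the suppressed failure by itself, which would match the symptom of threads that sit waiting with no visible error.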
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [skywalking] yangyiweigege commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101227298
> This doesn’t make sense to me. You don’t have available/reachable ES server but you still set a very large connection timeout (3000 seconds == 50 minutes!!!). The default flush interval is less than 1 minute so you will get many flush tasks that are waiting for the connection to be established. This is nothing related to exception in the task, I believe all possible exceptions are caught in the flush method.
It is 3000 ms; I wrote the wrong number.
[GitHub] [skywalking] kezhenxu94 commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
kezhenxu94 commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101267426
> > The timeout threshold is not the community's concern. The current value is reasonable.
> >
> > I am just curious what is the exception.
>
> I only noticed that the flush schedule throws this exception, but I don't know what the concrete exception is. I will take time to identify the exception.
The exception "EmptyEndpointException" means there is no healthy ES server that can establish a connection.
[GitHub] [skywalking] yangyiweigege commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101260023
I only noticed that the flush schedule throws this exception, but I don't know what the concrete exception is. I will take time to identify the exception.
[GitHub] [skywalking] wu-sheng commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101230115
The timeout threshold is not the community's concern. The current value is reasonable.
I am just curious what is the exception.
[GitHub] [skywalking] yangyiweigege commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1102474571
> > > The timeout threshold is not the community's concern. The current value is reasonable.
> > >
> > > I am just curious what is the exception.
> >
> > I only noticed that the flush schedule throws this exception, but I don't know what the concrete exception is. I will take time to identify the exception.
>
> The exception "EmptyEndpointException" means there is no healthy ES server that can establish a connection
I have found the real reason why the scheduled task throws an exception; please see https://github.com/apache/skywalking/pull/8909
[GitHub] [skywalking] yangyiweigege commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101142344
> @yangyiweigege Do you have any update about what you are going to do?
I will replace the runnable in the flush scheduler with `RunnableWithExceptionProtection`, e.g.:
new RunnableWithExceptionProtection(this::flush,
    t -> log.error("flush data to ES failure.", t))
I will submit a PR later.
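To illustrate the idea, a wrapper of this kind could look roughly like the sketch below. `ProtectedRunnable` is a hypothetical stand-in written for this example; it is not the actual `RunnableWithExceptionProtection` class, whose exact signature is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ExceptionProtectionSketch {
    /**
     * Hypothetical stand-in for an exception-protecting wrapper: runs the
     * delegate and hands any Throwable to a callback instead of rethrowing it.
     */
    static final class ProtectedRunnable implements Runnable {
        private final Runnable delegate;
        private final Consumer<Throwable> onError;

        ProtectedRunnable(Runnable delegate, Consumer<Throwable> onError) {
            this.delegate = delegate;
            this.onError = onError;
        }

        @Override
        public void run() {
            try {
                delegate.run();
            } catch (Throwable t) {
                // Report the failure; the caller (e.g. a scheduler) sees a normal return.
                onError.accept(t);
            }
        }
    }

    public static void main(String[] args) {
        List<Throwable> seen = new ArrayList<>();
        // Simulated flush that always fails.
        Runnable flush = () -> {
            throw new IllegalStateException("flush data to ES failure.");
        };
        new ProtectedRunnable(flush, seen::add).run();
        System.out.println("captured: " + seen.get(0).getMessage());
    }
}
```

Because `run()` always returns normally, a scheduler that executes the wrapped task periodically has no reason to cancel later executions.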
[GitHub] [skywalking] kezhenxu94 commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
kezhenxu94 commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101278515
> > connection
>
> But actually, we have a healthy ES server... maybe it's only a timeout?
Might be. You can run `curl` from the same machine as your OAP against your ES server and check the elapsed time and status code, e.g. `curl http://<es>:<port>/_cluster/health`
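If `curl` is not available on that machine, the same check can be sketched in Java with the JDK's `java.net.http.HttpClient` (Java 11+). `EsHealthProbe`, the default address `http://localhost:9200`, and the 3 second timeout are assumptions for illustration; substitute your own ES address.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class EsHealthProbe {
    /** Outcome of one probe: HTTP status code and wall-clock time in milliseconds. */
    static final class ProbeResult {
        final int status;
        final long elapsedMillis;

        ProbeResult(int status, long elapsedMillis) {
            this.status = status;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Sends GET {baseUrl}/_cluster/health and measures how long the round trip takes. */
    static ProbeResult probe(String baseUrl, Duration timeout) throws Exception {
        HttpClient client = HttpClient.newBuilder().connectTimeout(timeout).build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/_cluster/health"))
            .timeout(timeout)
            .GET()
            .build();
        long start = System.nanoTime();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new ProbeResult(response.statusCode(), elapsedMillis);
    }

    public static void main(String[] args) {
        // Assumed default address; pass the real one as the first argument.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:9200";
        try {
            ProbeResult result = probe(baseUrl, Duration.ofSeconds(3));
            System.out.println("status=" + result.status + " elapsedMs=" + result.elapsedMillis);
        } catch (Exception e) {
            // Connection refused or timed out: report it instead of crashing.
            System.out.println("probe failed: " + e);
        }
    }
}
```

An elapsed time close to the configured timeout, or a failure instead of a status code, would point to a connectivity problem between the OAP host and ES.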
[GitHub] [skywalking] wu-sheng commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101036928
Assigned to you. Wait for your PR to discuss details on that.
[GitHub] [skywalking] yangyiweigege commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101260479
> The timeout threshold is not the community's concern. The current value is reasonable.
>
> I am just curious what is the exception.
I only noticed that the flush schedule throws this exception, but I don't know what the concrete exception is. I will take time to identify the exception.
[GitHub] [skywalking] wu-sheng commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101128788
@yangyiweigege Do you have any update about what you are going to do?
[GitHub] [skywalking] wu-sheng commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101065776
Try to use `RunnableWithExceptionProtection` to avoid try/catch.
[GitHub] [skywalking] kezhenxu94 commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
kezhenxu94 commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101225561
This doesn’t make sense to me. You don’t have an available/reachable ES server, but you still set a very large connection timeout (3000 seconds == 50 minutes!!!). The default flush interval is less than 1 minute, so you will get many flush tasks that are waiting for the connection to be established. This has nothing to do with an exception in the task; I believe all possible exceptions are caught in the flush method.
[GitHub] [skywalking] yangyiweigege commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101275932
> connection
But actually, we have a healthy ES server... maybe it's only a timeout?
[GitHub] [skywalking] wu-sheng closed issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
Posted by GitBox <gi...@apache.org>.
wu-sheng closed issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
URL: https://github.com/apache/skywalking/issues/8895