Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2022/04/18 02:34:00 UTC

[GitHub] [skywalking] yangyiweigege opened a new issue, #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

yangyiweigege opened a new issue, #8895:
URL: https://github.com/apache/skywalking/issues/8895

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Apache SkyWalking Component
   
   OAP server (apache/skywalking)
   
   ### What happened
   
   When I use SkyWalking in a test environment, I find that trace data is written to ES with a long delay. Sometimes the data reaches ES 30 minutes late.
   
   ### What you expected to happen
   
   Traces are written to ES within 20 seconds.
   
   ### How to reproduce
   
   Set the ES connectTimeout to 3000 seconds.
   
   ### Anything else
   
   The flush task throws an exception, which leaves all the scheduler threads waiting and doing no work:
   <img width="1036" alt="6592F812-C690-479B-85EA-E36EE03EA8CA" src="https://user-images.githubusercontent.com/23202824/163744919-73362212-180a-4347-9e0b-da9fc5a13842.png">
   
   So I added a try/catch locally, and the exception is:
   ![39920013-C59B-402B-A14E-D37AE76435D2](https://user-images.githubusercontent.com/23202824/163745218-df761845-cf7a-40f6-98a1-762135b11934.png)
   
   How to deal with it:
   1. Add try/catch code in org.apache.skywalking.library.elasticsearch.bulk.BulkProcessor#flush.
   2. Set longer ES connection and response timeouts to avoid the exception.
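   
   The symptom described above matches the documented contract of `java.util.concurrent.ScheduledExecutorService`: if one execution of a periodic task throws, all later executions are suppressed. The sketch below reproduces that in isolation. The class name `UnprotectedFlushDemo`, the 20 ms delay, and the simulated failure are assumptions made for illustration; this is not the actual `BulkProcessor` code, and the scheduling method the OAP really uses is not shown in this thread.
   
   ```java
   import java.util.concurrent.Executors;
   import java.util.concurrent.ScheduledExecutorService;
   import java.util.concurrent.TimeUnit;
   import java.util.concurrent.atomic.AtomicInteger;
   
   // Hypothetical demo class; not part of SkyWalking.
   class UnprotectedFlushDemo {
   
       // Schedules a task that fails on its second run and returns how many
       // times the task was started within the observation window.
       static int countRuns() {
           ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
           AtomicInteger runs = new AtomicInteger();
           try {
               scheduler.scheduleWithFixedDelay(() -> {
                   if (runs.incrementAndGet() == 2) {
                       // Stands in for a failure inside a real flush.
                       throw new IllegalStateException("simulated flush failure");
                   }
               }, 0, 20, TimeUnit.MILLISECONDS);
               // Long enough for many more runs if the task were still scheduled.
               Thread.sleep(500);
           } catch (InterruptedException e) {
               Thread.currentThread().interrupt();
           } finally {
               scheduler.shutdownNow();
           }
           return runs.get();
       }
   
       public static void main(String[] args) {
           System.out.println("runs = " + countRuns());
       }
   }
   ```
   
   The first run succeeds, the second throws, and nothing runs afterwards, so `countRuns()` returns 2 even though the scheduler itself was never shut down during the window. The exception is only visible through the returned `ScheduledFuture`, which is why nothing appears in the log unless the task catches it.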
   
   
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] yangyiweigege commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101227298

   > This doesn’t make sense to me. You don’t have available/reachable ES server but you still set a very large connection timeout (3000 seconds == 50 minutes!!!). The default flush interval is less than 1 minute so you will get many flush tasks that are waiting for the connection to be established. This is nothing related to exception in the task, I believe all possible exceptions are caught in the flush method.
   
   3000 ms. I wrote the wrong number.


[GitHub] [skywalking] kezhenxu94 commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
kezhenxu94 commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101267426

   > > The timeout threshold is not the community's concern. The current value is reasonable.
   > >
   > > I am just curious what is the exception.
   >
   > I only noticed that flush schedule throws this exception, but I don't know what the concrete exception is. I will take time to track down the exception.
   
   The exception "EmptyEndpointException" means there is no healthy ES server that can establish a connection.


[GitHub] [skywalking] yangyiweigege commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101260023

   I only noticed that the flush schedule throws this exception, but I don't know what the concrete exception is. I will take time to track down the exception.
   
   


[GitHub] [skywalking] wu-sheng commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101230115

   The timeout threshold is not the community's concern. The current value is reasonable.
   
   I am just curious what is the exception.


[GitHub] [skywalking] yangyiweigege commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1102474571

   > > > The timeout threshold is not the community's concern. The current value is reasonable.
   > > >
   > > > I am just curious what is the exception.
   > >
   > > I only noticed that flush schedule throws this exception, but I don't know what the concrete exception is. I will take time to track down the exception.
   >
   > The exception "EmptyEndpointException" means there is no healthy ES server that can establish a connection
   
   I have found the real reason why the scheduled task throws an exception; please see https://github.com/apache/skywalking/pull/8909


[GitHub] [skywalking] yangyiweigege commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101142344

   > @yangyiweigege Do you have any update about what you are going to do?
   
   I will replace the Runnable in the flush scheduler with RunnableWithExceptionProtection, e.g.:
                  new RunnableWithExceptionProtection(this::flush,
                          t -> log.error("flush data to ES failure.", t))
   and I will submit a PR later.
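   
   A minimal sketch of the wrapper approach, assuming a helper shaped like the `RunnableWithExceptionProtection(runnable, callback)` call above. `ProtectedRunnable` and `ProtectedFlushDemo` are hypothetical stand-ins written for this example, not the SkyWalking class itself, and the simulated failure and 20 ms delay are likewise assumptions. The point is only that a task which never lets an exception escape keeps being rescheduled.
   
   ```java
   import java.util.concurrent.CountDownLatch;
   import java.util.concurrent.Executors;
   import java.util.concurrent.ScheduledExecutorService;
   import java.util.concurrent.TimeUnit;
   import java.util.concurrent.atomic.AtomicInteger;
   import java.util.function.Consumer;
   
   // Hypothetical demo class; not part of SkyWalking.
   class ProtectedFlushDemo {
   
       // Runs a task that fails once (on its second run) behind the wrapper and
       // returns {number of runs started, number of errors reported}.
       static int[] countRunsAndErrors() {
           ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
           AtomicInteger runs = new AtomicInteger();
           AtomicInteger errors = new AtomicInteger();
           CountDownLatch fiveRuns = new CountDownLatch(5);
           Runnable flush = () -> {
               try {
                   if (runs.incrementAndGet() == 2) {
                       // Stands in for a failure inside a real flush.
                       throw new IllegalStateException("simulated flush failure");
                   }
               } finally {
                   fiveRuns.countDown();
               }
           };
           try {
               scheduler.scheduleWithFixedDelay(
                   new ProtectedRunnable(flush, t -> errors.incrementAndGet()),
                   0, 20, TimeUnit.MILLISECONDS);
               // Wait until the task has run five times, or give up after 5 seconds.
               fiveRuns.await(5, TimeUnit.SECONDS);
           } catch (InterruptedException e) {
               Thread.currentThread().interrupt();
           } finally {
               scheduler.shutdownNow();
           }
           return new int[] {runs.get(), errors.get()};
       }
   
       public static void main(String[] args) {
           int[] result = countRunsAndErrors();
           System.out.println("runs = " + result[0] + ", errors = " + result[1]);
       }
   }
   
   // Hypothetical wrapper: runs the delegate and hands any Throwable to a
   // callback instead of letting it propagate to the scheduler.
   class ProtectedRunnable implements Runnable {
       private final Runnable delegate;
       private final Consumer<Throwable> onError;
   
       ProtectedRunnable(Runnable delegate, Consumer<Throwable> onError) {
           this.delegate = delegate;
           this.onError = onError;
       }
   
       @Override
       public void run() {
           try {
               delegate.run();
           } catch (Throwable t) {
               onError.accept(t);
           }
       }
   }
   ```
   
   Here the task reaches at least five runs and the callback is invoked exactly once, for the failing second run. In real code the callback would log the throwable, as in the `log.error(...)` call above.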


[GitHub] [skywalking] kezhenxu94 commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
kezhenxu94 commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101278515

   > > connection
   >
   > But actually, we do have a healthy ES server... maybe it's only a timeout?
   
   Might be. You can run `curl` on the same machine as your OAP against your ES server and check the elapsed time and status code, for example `curl http://<es>:<port>/_cluster/health`


[GitHub] [skywalking] kezhenxu94 commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
kezhenxu94 commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101278427

   > > connection
   >
   > But actually, we do have a healthy ES server... maybe it's only a timeout?
   
   Might be. You can run `curl` on the same machine as your OAP against your ES server and check the elapsed time and status code, for example `curl http://<es>:<port>/_cluster/health`


[GitHub] [skywalking] wu-sheng commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101036928

   Assigned to you. Wait for your PR to discuss details on that.


[GitHub] [skywalking] kezhenxu94 commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
kezhenxu94 commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101278429

   > > connection
   >
   > But actually, we do have a healthy ES server... maybe it's only a timeout?
   
   Might be. You can run `curl` on the same machine as your OAP against your ES server and check the elapsed time and status code, for example `curl http://<es>:<port>/_cluster/health`


[GitHub] [skywalking] yangyiweigege commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101260479

   
   
   
   > The timeout threshold is not the community's concern. The current value is reasonable.
   > 
   > I am just curious what is the exception.
   
   I only noticed that the flush schedule throws this exception, but I don't know what the concrete exception is. I will take time to track down the exception.


[GitHub] [skywalking] wu-sheng commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101128788

   @yangyiweigege Do you have any update about what you are going to do?


[GitHub] [skywalking] wu-sheng commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101065776

   Try to use `RunnableWithExceptionProtection` to avoid try/catch.


[GitHub] [skywalking] kezhenxu94 commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
kezhenxu94 commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101225561

   This doesn’t make sense to me. You don’t have an available/reachable ES server, but you still set a very large connection timeout (3000 seconds == 50 minutes!!!). The default flush interval is less than 1 minute, so you will get many flush tasks waiting for the connection to be established. This is not related to an exception in the task; I believe all possible exceptions are caught in the flush method.
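   
   The arithmetic behind this objection can be checked with `java.time.Duration`. The sketch below compares the value as originally reported (3000 seconds) with the value the reporter says was actually meant (3000 ms). The 15-second flush interval is a hypothetical figure chosen for illustration; the thread only states that the default is under one minute, and `TimeoutUnitsDemo` is not SkyWalking code.
   
   ```java
   import java.time.Duration;
   
   // Hypothetical demo class; not part of SkyWalking.
   class TimeoutUnitsDemo {
   
       // Assumed flush interval, for illustration only.
       static final Duration FLUSH_INTERVAL = Duration.ofSeconds(15);
   
       // Number of whole flush intervals that fit inside one connect timeout.
       static long flushIntervalsWithin(Duration connectTimeout) {
           return connectTimeout.toMillis() / FLUSH_INTERVAL.toMillis();
       }
   
       public static void main(String[] args) {
           Duration asSeconds = Duration.ofSeconds(3000);
           Duration asMillis = Duration.ofMillis(3000);
           System.out.println(asSeconds.toMinutes() + " min, flush intervals: "
               + flushIntervalsWithin(asSeconds));
           System.out.println(asMillis.getSeconds() + " s, flush intervals: "
               + flushIntervalsWithin(asMillis));
       }
   }
   ```
   
   Under that assumed interval, a 3000-second timeout spans 200 flush intervals, while a 3000 ms timeout is shorter than a single one.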


[GitHub] [skywalking] yangyiweigege commented on issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
yangyiweigege commented on issue #8895:
URL: https://github.com/apache/skywalking/issues/8895#issuecomment-1101275932

   > connection
   
   But actually, we do have a healthy ES server... maybe it's only a timeout?


[GitHub] [skywalking] wu-sheng closed issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception

Posted by GitBox <gi...@apache.org>.
wu-sheng closed issue #8895: [Bug] ES flush thread will stop work when flush schedule task have exception
URL: https://github.com/apache/skywalking/issues/8895

