Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2019/05/14 02:38:11 UTC

[GitHub] [skywalking] chenmutime opened a new issue #2658: There are numbers of "TIMED WAITING" & "BLOCKED" threads in jstack log

URL: https://github.com/apache/skywalking/issues/2658
 
 
   [environment]
   One virtual machine (4 cores, 8 GB RAM) hosts both Elasticsearch and the SkyWalking server.
   Elasticsearch runs as a standalone single node with -Xmx4g.
   The SkyWalking server runs on the same VM with -Xmx2g.
   Two Spring Boot applications, deployed on another VM, report to the SkyWalking server.
   
   [symptom]
   1. After Elasticsearch and the SkyWalking server are restarted, the SkyWalking server receives traces from the applications continuously.
   2. This lasts 15-30 minutes. After that, no more traces reach the SkyWalking server.
   
   [effort taken]
   1. I checked elasticsearch.log and found no obvious error messages.
   2. I checked gc.log and found no GC overhead messages.
   3. I captured a thread dump with jstack. Excerpts are shown below:
   
   ```
   "DataCarrier.IndicatorPersistentWorker.all_p50_day.Consumser.0.Thread" #20 daemon prio=5 os_prio=0 tid=0x00007f3e9507f000 nid=0x2e99 waiting on condition [0x00007f3e58acd000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
   at java.lang.Thread.sleep(Native Method)
   at org.apache.skywalking.apm.commons.datacarrier.consumer.ConsumerThread.run(ConsumerThread.java:72)
   ```
   ```
   "DataCarrier.IndicatorPersistentWorker.endpoint_avg.Consumser.0.Thread" #43 daemon prio=5 os_prio=0 tid=0x00007f3e950e2800 nid=0x2eb0 waiting for monitor entry [0x00007f3dfb2f1000]
   java.lang.Thread.State: BLOCKED (on object monitor)
   at org.elasticsearch.action.bulk.BulkProcessor.internalAdd(BulkProcessor.java:286)
   - waiting to lock <0x00000000813c8440> (a org.elasticsearch.action.bulk.BulkProcessor)
   at org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:271)
   at org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:267)
   at org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:253)
   at org.apache.skywalking.oap.server.storage.plugin.elasticsearch.base.BatchProcessEsDAO.lambda$batchPersistence$0(BatchProcessEsDAO.java:64)
   at org.apache.skywalking.oap.server.storage.plugin.elasticsearch.base.BatchProcessEsDAO$$Lambda$278/81782502.accept(Unknown Source)
   at java.lang.Iterable.forEach(Iterable.java:75)
   at org.apache.skywalking.oap.server.storage.plugin.elasticsearch.base.BatchProcessEsDAO.batchPersistence(BatchProcessEsDAO.java:62)
   at org.apache.skywalking.oap.server.core.analysis.worker.PersistenceWorker.onWork(PersistenceWorker.java:51)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorPersistentWorker.onWork(IndicatorPersistentWorker.java:63)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorPersistentWorker$PersistentConsumer.consume(IndicatorPersistentWorker.java:153)
   at org.apache.skywalking.apm.commons.datacarrier.consumer.ConsumerThread.consume(ConsumerThread.java:101)
   at org.apache.skywalking.apm.commons.datacarrier.consumer.ConsumerThread.run(ConsumerThread.java:68)
   ```
   
   ```
   "DataCarrier.IndicatorAggregateWorker.instance_jvm_cpu.Consumser.0.Thread" #152 daemon prio=5 os_prio=0 tid=0x00007f3e951c1000 nid=0x2f1d sleeping[0x00007f3df4584000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
   at java.lang.Thread.sleep(Native Method)
   at org.apache.skywalking.apm.commons.datacarrier.buffer.Buffer.save(Buffer.java:64)
   at org.apache.skywalking.apm.commons.datacarrier.buffer.Channels.save(Channels.java:52)
   at org.apache.skywalking.apm.commons.datacarrier.DataCarrier.produce(DataCarrier.java:88)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorPersistentWorker.in(IndicatorPersistentWorker.java:68)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorTransWorker.in(IndicatorTransWorker.java:89)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorTransWorker.in(IndicatorTransWorker.java:32)
   at org.apache.skywalking.oap.server.core.remote.client.SelfRemoteClient.push(SelfRemoteClient.java:55)
   at org.apache.skywalking.oap.server.core.remote.RemoteSenderService.send(RemoteSenderService.java:51)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorRemoteWorker.in(IndicatorRemoteWorker.java:51)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorRemoteWorker.in(IndicatorRemoteWorker.java:33)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorAggregateWorker.lambda$sendToNext$0(IndicatorAggregateWorker.java:93)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorAggregateWorker$$Lambda$203/1364775767.accept(Unknown Source)
   at java.util.HashMap$Values.forEach(HashMap.java:981)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorAggregateWorker.sendToNext(IndicatorAggregateWorker.java:88)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorAggregateWorker.onWork(IndicatorAggregateWorker.java:73)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorAggregateWorker.access$100(IndicatorAggregateWorker.java:38)
   at org.apache.skywalking.oap.server.core.analysis.worker.IndicatorAggregateWorker$AggregatorConsumer.consume(IndicatorAggregateWorker.java:131)
   at org.apache.skywalking.apm.commons.datacarrier.consumer.ConsumerThread.consume(ConsumerThread.java:101)
   at org.apache.skywalking.apm.commons.datacarrier.consumer.ConsumerThread.run(ConsumerThread.java:68)
   ```
   
   4. I checked the Elasticsearch cluster stats and node stats.
   Heap usage is within the normal range, and the cluster status stays yellow.
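
The yellow status mentioned in step 4 can be checked against the cluster health API. The following is a minimal sketch, assuming Elasticsearch is reachable at `http://localhost:9200` without authentication (the address and the helper names `fetch_cluster_health` and `explain_health` are assumptions for illustration). On a single-node cluster, replica shards cannot be placed on the same node as their primary, so a persistent yellow status is expected there and is probably not the cause of the stall by itself.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(base_url="http://localhost:9200"):
    """Return the parsed JSON body of GET /_cluster/health.

    base_url is an assumption; adjust it to the actual Elasticsearch address.
    """
    with urlopen(base_url + "/_cluster/health", timeout=5) as resp:
        return json.load(resp)


def explain_health(health):
    """Give a rough, non-authoritative reading of a cluster health response."""
    status = health.get("status")
    nodes = health.get("number_of_nodes")
    unassigned = health.get("unassigned_shards", 0)

    if status == "green":
        return "green: all primary and replica shards are allocated"
    if status == "yellow" and nodes == 1 and unassigned > 0:
        # A replica is never allocated on the node that holds its primary,
        # so a one-node cluster with replicas configured stays yellow.
        return ("yellow on a single node: %d replica shard(s) cannot be "
                "allocated; this alone does not block indexing" % unassigned)
    if status == "yellow":
        return "yellow: %d replica shard(s) are unassigned" % unassigned
    if status == "red":
        return "red: at least one primary shard is unassigned"
    return "unknown status: %r" % (status,)
```

Calling `explain_health(fetch_cluster_health())` on the VM would print a one-line reading of the current status. If the reading is the single-node case, the investigation should probably move on to the write path, for example the blocked `BulkProcessor.add` calls in the thread dump.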
   
   [target]
   Could you give some hints on the following:
   - Is there recommended hardware for a SkyWalking deployment, for example 1C/2GB per traced application?
   - Are there any logs or troubleshooting methods I could use to look deeper into the problem?
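
One way to look deeper into a full jstack dump is to count threads per state and group the BLOCKED threads by the monitor they are waiting for. The sketch below assumes the standard HotSpot jstack text layout seen in the excerpts above; `summarize_dump` and `SAMPLE` are illustrative names, and `SAMPLE` is a shortened copy of two threads from this report. Lock-owner detection is a heuristic, because jstack can also print `- locked <...>` for threads that are parked in `Object.wait()`.

```python
import re
from collections import Counter, defaultdict

# Quoted thread name at the start of a jstack thread header line.
THREAD_HEADER = re.compile(r'^\s*"(?P<name>[^"]+)"')
# State line, e.g. "java.lang.Thread.State: BLOCKED (on object monitor)".
THREAD_STATE = re.compile(r'java\.lang\.Thread\.State:\s+(?P<state>[A-Z_]+)')
# "- waiting to lock <0x...>" is printed for threads blocked on a monitor.
WAITING_LOCK = re.compile(r'-\s+waiting to lock\s+<(?P<addr>0x[0-9a-fA-F]+)>')
# "- locked <0x...>" marks a monitor the thread has entered (heuristic owner).
LOCKED = re.compile(r'-\s+locked\s+<(?P<addr>0x[0-9a-fA-F]+)>')

# Shortened excerpt of two threads from the dump in this report.
SAMPLE = '''
"DataCarrier.IndicatorPersistentWorker.all_p50_day.Consumser.0.Thread" #20 daemon prio=5
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
"DataCarrier.IndicatorPersistentWorker.endpoint_avg.Consumser.0.Thread" #43 daemon prio=5
java.lang.Thread.State: BLOCKED (on object monitor)
at org.elasticsearch.action.bulk.BulkProcessor.internalAdd(BulkProcessor.java:286)
- waiting to lock <0x00000000813c8440> (a org.elasticsearch.action.bulk.BulkProcessor)
'''


def summarize_dump(dump_text):
    """Return (state counts, waiters per lock address, first owner per lock)."""
    states = Counter()
    waiters = defaultdict(list)
    owners = {}
    current = None  # name of the thread whose stack is being read

    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group("name")
            continue
        if current is None:
            continue  # skip preamble lines before the first thread
        state = THREAD_STATE.search(line)
        if state:
            states[state.group("state")] += 1
            continue
        waiting = WAITING_LOCK.search(line)
        if waiting:
            waiters[waiting.group("addr")].append(current)
            continue
        locked = LOCKED.search(line)
        if locked:
            # Keep the first thread seen holding this monitor.
            owners.setdefault(locked.group("addr"), current)

    return states, dict(waiters), owners
```

Running `summarize_dump` on the complete dump (for example the text saved from `jstack <pid>`) and looking up the address `0x00000000813c8440` in the returned owners would show which thread holds the `BulkProcessor` monitor, and that thread's stack is the one worth reading next. The shortened `SAMPLE` contains no `- locked` line, so it reports no owner.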
