
[GitHub] [druid] jasonk000 opened a new issue #11279: High CPU (or dropped events) in HttpPostEmitter with bursty events

URL: https://github.com/apache/druid/issues/11279


   When configured with a `minHttpTimeoutMillis`, the coordinator node's HttpPostEmitter thread is pegged at 100% CPU when large batches arrive followed by an idle period. Alternatively, when no minimum timeout is set, submissions to the backend fail due to very short timeout periods (2ms).
   
   ### Affected Version
   
   0.20.0
   
   ### Description
   
   HttpPostEmitter has heuristic logic to batch events and ship them in an efficient and timely fashion, while also throttling itself somewhat.
   
   When a large batch of data is received in too short a time frame, the heuristics arrive at a state in which the emitter gets stuck in a busy loop, unable to make forward progress. This persists until enough normal messages have accrued to reset the `lastBatchFillTimeMillis` duration.
   
   I can detect this as follows:
   - Configuration set as follows:
   ```
   # Emit metrics over http
   druid.emitter=http
   druid.emitter.http.recipientBaseUrl=http://localhost:8078/druidmetrics
   druid.emitter.http.minHttpTimeoutMillis=100
   ```
   - Debug logs on; look for `Failed to send events to url[http://localhost:8078/druidmetrics] with timeout less than minimum`.
   - CPU rises to 100%.
   - Monitor the logs of the downstream service (i.e. the receiver of the HTTP POSTs; a minimal receiver sketch follows this list). One POST arrives and then no more for some period of time.
   - Take a heap dump to inspect the service configuration - many buffers are present in the failed buffer queue.
   - Heap analysis also shows `lastBatchFillTimeMillis` = 2 and `approximateXxxToEmitCount` = 0.
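
   A minimal stand-in for the downstream receiver (a hypothetical helper - any POST logger works) makes the one-POST-then-silence pattern easy to spot:
   ```
   import com.sun.net.httpserver.HttpServer;
   import java.net.InetSocketAddress;

   // Hypothetical stand-in for the metrics receiver at localhost:8078.
   public class MetricsSink
   {
     public static void main(String[] args) throws Exception
     {
       HttpServer server = HttpServer.create(new InetSocketAddress(8078), 0);
       server.createContext("/druidmetrics", exchange -> {
         int bytes = exchange.getRequestBody().readAllBytes().length;
         System.out.printf("%tT received POST of %d bytes%n", System.currentTimeMillis(), bytes);
         exchange.sendResponseHeaders(200, -1); // 200 with no response body
         exchange.close();
       });
       server.start();
     }
   }
   ```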
   
   ### Current logic
   
   With reference to the logic [here](https://github.com/apache/druid/blob/8296123d895db7d06bc4517db5e767afb7862b83/core/src/main/java/org/apache/druid/java/util/emitter/core/HttpPostEmitter.java#L501-L521) and [here](https://github.com/apache/druid/blob/8296123d895db7d06bc4517db5e767afb7862b83/core/src/main/java/org/apache/druid/java/util/emitter/core/HttpPostEmitter.java#L708-L716):
   - As data arrives, it is batched. Once a batch is filled, a timer (`lastBatchFillTimeMillis`) is captured for later use, and the batch is submitted to the endpoint.
   - The overall flow is `while(true) { readFromQueueAndSubmitBatch(); }`, with some ability to park failed messages for resubmission.
   - In all cases, control flows through `sendWithRetries()` into `send()`.
   - In `send()`, [`computeTimeoutForSendRequestInMillis`](https://github.com/apache/druid/blob/8296123d895db7d06bc4517db5e767afb7862b83/core/src/main/java/org/apache/druid/java/util/emitter/core/HttpPostEmitter.java#L805-L810) calculates a timeout based on how long the last batch took to fill and how much is waiting in the queue.
   - `send()` then compares that timeout to the configuration item `minHttpTimeoutMillis`, and if it is lower, it stalls the submission. (A condensed sketch of this interaction follows this list.)
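
   Condensed as a sketch (an assumed simplification of the linked code - only the `2.0f` allowance ceiling comes from this report; the shrink schedule below is illustrative):
   ```
   // Hedged sketch of the flow above, not the verbatim Druid source.
   class TimeoutSketch
   {
     // Scale the last batch fill time by an allowance capped at 2.0f that
     // shrinks as the emit queue grows (the schedule below is illustrative).
     static long computeTimeoutForSendRequestInMillis(long lastBatchFillTimeMillis, int buffersToEmit)
     {
       float allowance = Math.min(2.0f, 20.0f / (10 + buffersToEmit));
       return (long) (lastBatchFillTimeMillis * allowance);
     }

     // send(): refuse to submit when the computed timeout is below the floor.
     static boolean trySend(long lastBatchFillTimeMillis, int buffersToEmit, long minHttpTimeoutMillis)
     {
       long timeoutMillis = computeTimeoutForSendRequestInMillis(lastBatchFillTimeMillis, buffersToEmit);
       if (timeoutMillis < minHttpTimeoutMillis) {
         return false; // batch is parked in the failed queue and retried on the next pass
       }
       // ... the HTTP POST with timeoutMillis would happen here ...
       return true;
     }
   }
   ```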
   
   ### Issue
   
   - When a large burst of data arrives at the emitter and fills more than a single buffer, `lastBatchFillTimeMillis` becomes very short (say, 2ms). The burst then stops, so no more messages arrive to completely fill a batch for "some time" (maybe tens of seconds).
   - As long as the queue depth is modest (< 25 in this case), the calculated timeout (lastBatchFillTime * allowance) will _always be < 100ms_, and so _the send operation will always be rejected_ and deferred. The allowance factor is at most `2.0f` and shrinks as the queue grows, which gives a best case of 2ms * 2.0f = 4ms, drastically less than the 100ms minimum.
   - Each message is immediately failed and eventually accumulates in the failed queue.
   - Once a "normal throughput" batch completes, the last batch fill time returns to a normal value (15818ms in the captured analysis).
   - On the next iteration, the failed submission queue attempts a send, and the calculated timeout is now greater than the 100ms minimum. Submission proceeds, very quickly all failed batches are submitted to the backend, and CPU drops back to normal (~0% for HttpPostEmitter). A toy simulation of the spin follows this list.
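
   To make the arithmetic concrete, a toy simulation (a hypothetical condensation, not the Druid source; assumes a 2ms fill time, a 100ms floor, and the best-case `2.0f` allowance):
   ```
   import java.util.ArrayDeque;
   import java.util.Deque;

   // Toy model of the spin: one parked batch, a 2ms lastBatchFillTime, and a
   // 100ms floor mean every pass rejects the send and immediately retries.
   public class BusyLoopDemo
   {
     public static void main(String[] args)
     {
       long lastBatchFillTimeMillis = 2;   // captured during the burst
       long minHttpTimeoutMillis = 100;    // configured floor
       Deque<byte[]> failedBuffers = new ArrayDeque<>();
       failedBuffers.add(new byte[0]);     // one parked batch is enough

       long rejected = 0;
       while (!failedBuffers.isEmpty() && rejected < 1_000_000) {
         long timeoutMillis = (long) (lastBatchFillTimeMillis * 2.0f); // best case: 4ms
         if (timeoutMillis < minHttpTimeoutMillis) {
           rejected++;                     // no send, no sleep, no backoff
           continue;
         }
         failedBuffers.poll();             // never reached with these numbers
       }
       System.out.println("sends rejected before bailing out: " + rejected);
     }
   }
   ```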
   
   ### Attempts to work around
   - Setting `.minHttpTimeoutMillis` to 1-4ms results in a high rate of failure under normal operations.
   - Setting `.maxBatchSize` smaller does help reduce it, however the lowest limit is strictly enforced [here](https://github.com/apache/druid/blob/8296123d895db7d06bc4517db5e767afb7862b83/core/src/main/java/org/apache/druid/java/util/emitter/core/HttpPostEmitter.java#L157-L164): `maxBatchSize must be greater than MAX_EVENT_SIZE[1,047,552] + overhead`.
   - Setting `.flushCount` to a lower number (e.g. `50` instead of the default `500`) means the busy loop resolves a little quicker (30-60s), but a buffer-filling event becomes more likely (the knobs tried here are consolidated into a config sketch after the stack trace below).
   - Clearing `minHttpTimeoutMillis` (default `0`) shows a different error under the same circumstances, fundamentally due to the same cause (the calculated timeout heuristic is based on the previous batch fill timer, and that batch fill was short) - `Request timeout .. after 2ms`:
   ```
   @4000000060a6f7350c20a31c Caused by: java.util.concurrent.TimeoutException: Request timeout to localhost/127.0.0.1:8078 after 2 ms
   @4000000060a6f7350c20a31c 	at org.asynchttpclient.netty.timeout.TimeoutTimerTask.expire(TimeoutTimerTask.java:43) ~[async-http-client-2.5.3.jar:?]
   @4000000060a6f7350c20aed4 	at org.asynchttpclient.netty.timeout.RequestTimeoutTimerTask.run(RequestTimeoutTimerTask.java:50) ~[async-http-client-2.5.3.jar:?]
   @4000000060a6f7350c20aed4 	at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:672) ~[netty-common-4.1.48.Final.jar:4.1.48.Final]
   @4000000060a6f7350c20b2bc 	at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:747) ~[netty-common-4.1.48.Final.jar:4.1.48.Final]
   @4000000060a6f7350c20b2bc 	at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:472) ~[netty-common-4.1.48.Final.jar:4.1.48.Final]
   @4000000060a6f7350c20b2bc 	at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_282]
   ```
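
   For reference, the knobs tried above, consolidated into one config sketch (values illustrative; none of them fully resolves the issue):
   ```
   # 1-4ms avoids the spin but fails sends under normal operation:
   druid.emitter.http.minHttpTimeoutMillis=4
   # Floor is MAX_EVENT_SIZE (1,047,552 bytes) plus overhead:
   druid.emitter.http.maxBatchSize=1100000
   # Busy loop resolves sooner, but a buffer-filling event becomes more likely:
   druid.emitter.http.flushCount=50
   ```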
   

