Posted to commits@druid.apache.org by "jasonk000 (via GitHub)" <gi...@apache.org> on 2023/02/03 17:48:09 UTC

[GitHub] [druid] jasonk000 commented on pull request #12303: Kafka & Kinesis stream ingest parsing in parallel

jasonk000 commented on PR #12303:
URL: https://github.com/apache/druid/pull/12303#issuecomment-1416201807

   > Do you have a performance report that shows how this change improves the throughput by setting different values of thread count under same incoming message rate?
   
   Yes. With `parsingThreadCount=1` this patch has a penalty (about 84% of pre-patch throughput), but with 2 or more threads throughput improves, reaching about 153% at 8 threads.
   
   ![image](https://user-images.githubusercontent.com/3196528/216670466-73788f13-1719-4f7e-9bc9-7fe45c2c2e0b.png)
   
   
   parsingThreadCount | ingested rows per 4 min | throughput relative to pre-patch
   -- | -- | --
   1 | 52515 | 84%
   2 | 64926 | 104%
   3 | 74105 | 119%
   4 | 80810 | 130%
   5 | 86054 | 138%
   6 | 89136 | 143%
   7 | 94213 | 151%
   8 | 95680 | 153%
   pre-patch | 62372 | 100%
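
The percentages in the last column can be reproduced from the row counts, assuming each value is that run's row count divided by the pre-patch run's and rounded to a whole percent. A minimal Python sketch (the names `PRE_PATCH_ROWS`, `ROWS_BY_THREAD_COUNT`, and `relative_throughput` are illustrative and not part of the PR):

```python
# Row counts copied from the table above (rows ingested in 4 minutes).
PRE_PATCH_ROWS = 62372
ROWS_BY_THREAD_COUNT = {
    1: 52515,
    2: 64926,
    3: 74105,
    4: 80810,
    5: 86054,
    6: 89136,
    7: 94213,
    8: 95680,
}


def relative_throughput(rows, baseline=PRE_PATCH_ROWS):
    """Return throughput as a whole-number percentage of the baseline run."""
    return round(100 * rows / baseline)


if __name__ == "__main__":
    for threads, rows in ROWS_BY_THREAD_COUNT.items():
        pct = relative_throughput(rows)
        print(f"parsingThreadCount={threads}: {pct}% of pre-patch")
```

Under that assumption, the break-even point sits between 1 and 2 parsing threads, which matches the statement that the patch only pays off from `parsingThreadCount=2` upward.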
   
   With this change, the performance cost moves mostly to (1) the Kafka ingestion flow and (2) index row generation.
   
   Focusing only on the task-runner thread:
   
   Before:
   ![image](https://user-images.githubusercontent.com/3196528/216671908-50852ad4-32f5-4767-9a64-2ace84c67eac.png)
   
   After (notice the purple `parseWithInputFormat` is moved to this thread):
   ![image](https://user-images.githubusercontent.com/3196528/216671932-66b41c80-b9de-4368-bd0f-300ba652c2bb.png)
   
   So the next bottleneck becomes the remainder of that loop, and any future improvements to the index generator will scale up with N threads.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

