Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/05/18 22:44:42 UTC

[GitHub] [druid] jasonk000 commented on issue #11140: Druid Router UI throwing 504 when there are too many tasks

jasonk000 commented on issue #11140:
URL: https://github.com/apache/druid/issues/11140#issuecomment-843615701


   I have done some profiling on our stack; my analysis follows. We are configured with `HeapMemoryTaskStorage`.
   
   By issuing repeated SQL requests against the broker (for example with `ab`), we can see the workload increase on the `overlord`. A CPU profile of the overlord host, focused on the CPU spent in the `/tasks` endpoint, shows that over 50% of the CPU load is in `HeapMemoryTaskStorage::getTasks`, and only a small percentage of the time is in serialization.
   
   In the before/after profiles below, note the percentage of time (width of bar) taken by `getCompletedTaskInfo...` (highlighted in a magenta-ish colour), and that the bulk of the time is in `sortedCopy`.
   
   Before
   
   ![image](https://user-images.githubusercontent.com/3196528/118732397-258d0e00-b7ef-11eb-9234-e7f05a40d874.png)
   
   After the changes, `getCompletedTaskInfo...` is significantly reduced as a percentage of the overall CPU time, so much so that serialization now takes far more time than the query.
   
   ![image](https://user-images.githubusercontent.com/3196528/118732369-13ab6b00-b7ef-11eb-995b-0be5c9900d12.png)
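
To illustrate why a per-request `sortedCopy` can dominate a profile like the "before" one, here is a minimal sketch in plain Java. It is not Druid's code and not necessarily the change that was made: `RecentCompletedTasks`, `TaskEntry`, and both method names are hypothetical. The first method sorts a copy of every stored entry before filtering, which costs O(n log n) on each request; the second filters first and keeps only the newest `limit` entries in a bounded heap, which costs O(n log limit).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative only: none of these names come from the Druid code base.
final class RecentCompletedTasks {

    // Hypothetical stand-in for a stored task entry.
    static final class TaskEntry {
        final String id;
        final long createdMillis;
        final boolean complete;

        TaskEntry(String id, long createdMillis, boolean complete) {
            this.id = id;
            this.createdMillis = createdMillis;
            this.complete = complete;
        }
    }

    static final Comparator<TaskEntry> NEWEST_FIRST =
        Comparator.comparingLong((TaskEntry t) -> t.createdMillis).reversed();

    // Sorts a copy of every entry on each call, then filters and truncates.
    static List<TaskEntry> sortThenFilter(List<TaskEntry> all, int limit) {
        List<TaskEntry> sorted = new ArrayList<>(all);
        sorted.sort(NEWEST_FIRST);
        List<TaskEntry> out = new ArrayList<>();
        for (TaskEntry t : sorted) {
            if (out.size() >= limit) {
                break;
            }
            if (t.complete) {
                out.add(t);
            }
        }
        return out;
    }

    // Filters first and keeps at most `limit` entries in a min-heap.
    static List<TaskEntry> filterThenBoundedHeap(List<TaskEntry> all, int limit) {
        List<TaskEntry> out = new ArrayList<>();
        if (limit <= 0) {
            return out;
        }
        // Oldest-first ordering: the head is the oldest entry kept so far.
        PriorityQueue<TaskEntry> heap = new PriorityQueue<>(NEWEST_FIRST.reversed());
        for (TaskEntry t : all) {
            if (!t.complete) {
                continue;
            }
            if (heap.size() < limit) {
                heap.add(t);
            } else if (t.createdMillis > heap.peek().createdMillis) {
                heap.poll(); // evict the oldest kept entry
                heap.add(t);
            }
        }
        out.addAll(heap);
        out.sort(NEWEST_FIRST); // only `limit` entries are sorted here
        return out;
    }

    public static void main(String[] args) {
        List<TaskEntry> all = new ArrayList<>();
        all.add(new TaskEntry("a", 100L, true));
        all.add(new TaskEntry("b", 300L, false));
        all.add(new TaskEntry("c", 200L, true));
        all.add(new TaskEntry("d", 400L, true));
        for (TaskEntry t : filterThenBoundedHeap(all, 2)) {
            System.out.println(t.id + " " + t.createdMillis);
        }
    }
}
```

Both methods return the same entries when creation times are distinct; with ties, the order among equal timestamps may differ. Whether a bounded heap, a pre-sorted structure, or simply filtering before sorting is the better fit depends on how often tasks are added relative to how often they are listed.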
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


