Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/06/08 08:51:41 UTC

[GitHub] [druid] zhangyouxun opened a new issue #11341: In the same query, Historical performance is better than Middle Manager, and how to tune Middle Manager

zhangyouxun opened a new issue #11341:
URL: https://github.com/apache/druid/issues/11341


   Here is the same SQL query run over two time ranges. The Historical answers it much faster than the Middle Manager:
   ```
   -- Query served by the Middle Manager (MM), which takes 800 ms.
   select TIME_FLOOR(__time, 'PT5M'), sum("requestCount")
   from request_info
   WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR
   GROUP BY 1
   
   -- Query served by the Historical, which takes 100 ms.
   select TIME_FLOOR(__time, 'PT5M'), sum("requestCount")
   from request_info
   WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '4' HOUR AND __time <= CURRENT_TIMESTAMP - INTERVAL '3' HOUR
   GROUP BY 1
   ```
   
   Historical config
   ```
   druid.processing.buffer.sizeBytes=500MiB
   druid.processing.numMergeBuffers=4
   druid.processing.numThreads=15
   druid.processing.tmpDir=var/druid/processing
   druid.historical.cache.useCache=true
   druid.historical.cache.populateCache=true
   druid.cache.type=caffeine
   druid.cache.sizeInBytes=256MiB
   ```
   
   Middle Manager config
   ```
   druid.indexer.runner.javaOpts=-Xms5g -Xmx5g  -XX:+UseG1GC -XX:MaxDirectMemorySize=10g
   druid.indexer.fork.property.druid.processing.numMergeBuffers=2
   druid.indexer.fork.property.druid.processing.buffer.sizeBytes=268435456
   druid.indexer.fork.property.druid.processing.numThreads=2
   druid.realtime.cache.useCache=true
   druid.realtime.cache.populateCache=true
   ```
   
   Are there any parameters that need to be adjusted on the Middle Manager?
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zhangyouxun commented on issue #11341: In the same query, Historical performance is better than Middle Manager, and how to tune Middle Manager

Posted by GitBox <gi...@apache.org>.
zhangyouxun commented on issue #11341:
URL: https://github.com/apache/druid/issues/11341#issuecomment-857318940


   > How many rows are there in the 1-hour interval?
   
   There are 180 million rows in the 1-hour interval.
   
   > For your task configuration, up to 10g of direct memory is available to the JVM, but only (2 numMergeBuffers + 2 numThreads + 1) * 256MiB = 1280MiB is configured for Druid to use.
   
   We will reduce the direct memory to 5g later.
   
   




[GitHub] [druid] FrankChen021 commented on issue #11341: In the same query, Historical performance is better than Middle Manager, and how to tune Middle Manager

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on issue #11341:
URL: https://github.com/apache/druid/issues/11341#issuecomment-857512939


   > Does it mean that the numThreads per peon cannot exceed 2?
   
   It depends on how much memory your machine has. Since `-Xmx` is set to 5GiB, I assume that there is at least 30 * 5 = 150GiB of memory available on your machine.
   
   The direct memory Druid needs can be calculated with the following formula:
   ```
   Direct Memory: (druid.processing.numThreads + druid.processing.numMergeBuffers + 1) * druid.processing.buffer.sizeBytes
   ```
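
As a rough illustration of the formula above, the Python sketch below plugs in the Middle Manager values from the original report (2 processing threads, 2 merge buffers, 256 MiB buffers). The function name `direct_memory_bytes` is made up for this example and is not part of Druid.

```python
MIB = 1024 * 1024


def direct_memory_bytes(num_threads: int, num_merge_buffers: int, buffer_size_bytes: int) -> int:
    """(numThreads + numMergeBuffers + 1) * buffer.sizeBytes, as in the formula above."""
    return (num_threads + num_merge_buffers + 1) * buffer_size_bytes


# Values taken from the reported peon configuration (268435456 bytes = 256 MiB).
needed = direct_memory_bytes(num_threads=2, num_merge_buffers=2, buffer_size_bytes=268435456)
print(needed // MIB, "MiB")  # prints: 1280 MiB
```

This matches the 1280MiB figure quoted elsewhere in the thread, which is well below the 10g allowed by `-XX:MaxDirectMemorySize`.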
   
   In this case, the SQL is translated into a timeseries query, so you could increase `numThreads` and `sizeBytes` to see whether it helps.
   If group-by queries are heavily used, you could also increase `numMergeBuffers`.
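
For readers unfamiliar with native queries, here is a hedged sketch of what a timeseries query equivalent to the SQL in this issue could look like, built as a Python dictionary and printed as JSON. The interval is a placeholder, the `longSum` aggregator assumes that `requestCount` is a long column, and the output name `total` is arbitrary; none of these details come from the thread.

```python
import json

# Hypothetical native-query equivalent of the SQL in the issue.
# Assumptions: "requestCount" is a long column (hence longSum), the interval
# is a placeholder, and the output name "total" is arbitrary.
timeseries_query = {
    "queryType": "timeseries",
    "dataSource": "request_info",
    "intervals": ["2021-06-08T07:00:00Z/2021-06-08T08:00:00Z"],
    # TIME_FLOOR(__time, 'PT5M') corresponds to a 5-minute period granularity.
    "granularity": {"type": "period", "period": "PT5M", "timeZone": "UTC"},
    "aggregations": [
        {"type": "longSum", "name": "total", "fieldName": "requestCount"}
    ],
}

print(json.dumps(timeseries_query, indent=2))
```

The query groups only by time bucket and has no other grouping dimensions.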




[GitHub] [druid] FrankChen021 commented on issue #11341: In the same query, Historical performance is better than Middle Manager, and how to tune Middle Manager

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on issue #11341:
URL: https://github.com/apache/druid/issues/11341#issuecomment-857680500


   Yes, of course. You still have 100GiB of memory left for direct memory, which is about 3GiB for each task. For example, `numThreads` could be set to 7, which consumes (7 + 2 + 1) * 256MiB = 2.5GiB of direct memory in total.
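
The same arithmetic can be run in reverse to find the largest `numThreads` that fits a per-task direct memory budget. The sketch below assumes the 3GiB-per-task figure from the reply above, 2 merge buffers, and 256 MiB buffers; `max_num_threads` is a hypothetical helper, and it only bounds memory, not CPU.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def max_num_threads(budget_bytes: int, num_merge_buffers: int, buffer_size_bytes: int) -> int:
    """Largest numThreads with (numThreads + numMergeBuffers + 1) * sizeBytes <= budget."""
    return budget_bytes // buffer_size_bytes - num_merge_buffers - 1


# Memory-only upper bound for a 3 GiB budget per peon.
print(max_num_threads(3 * GIB, num_merge_buffers=2, buffer_size_bytes=256 * MIB))  # prints: 9

# The suggested numThreads = 7 stays under that bound and needs 2560 MiB (2.5 GiB).
print((7 + 2 + 1) * 256, "MiB")
```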




[GitHub] [druid] zhangyouxun commented on issue #11341: In the same query, Historical performance is better than Middle Manager, and how to tune Middle Manager

Posted by GitBox <gi...@apache.org>.
zhangyouxun commented on issue #11341:
URL: https://github.com/apache/druid/issues/11341#issuecomment-865640821


   > Have you resolved this problem?
   
   We adjusted the settings to numThreads = 5, numMergeBuffers = 2, and sizeBytes = 512M, with 5g of total direct memory, but queries against real-time node data did not improve significantly.
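
For reference, the adjusted settings can be written out as Middle Manager properties and checked against the direct memory limit. This is only a sketch: it assumes that 512M means 512 MiB and 5g means 5 GiB, and `peon_processing_properties` is a hypothetical helper that reuses the property names from the Middle Manager config in the original report.

```python
MIB = 1024 * 1024


def peon_processing_properties(num_threads: int, num_merge_buffers: int, buffer_size_bytes: int) -> list:
    """Render the peon processing settings as Middle Manager fork properties."""
    prefix = "druid.indexer.fork.property.druid.processing."
    return [
        f"{prefix}numMergeBuffers={num_merge_buffers}",
        f"{prefix}buffer.sizeBytes={buffer_size_bytes}",
        f"{prefix}numThreads={num_threads}",
    ]


settings = dict(num_threads=5, num_merge_buffers=2, buffer_size_bytes=512 * MIB)
required = (settings["num_threads"] + settings["num_merge_buffers"] + 1) * settings["buffer_size_bytes"]

# The requirement must fit under -XX:MaxDirectMemorySize (assumed here to be 5 GiB).
assert required <= 5 * 1024 * MIB

print("\n".join(peon_processing_properties(**settings)))
print(required // MIB, "MiB required")  # prints: 4096 MiB required
```

Under these assumptions the new settings need 4 GiB of the 5 GiB limit, so the configuration is at least consistent with the formula.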




[GitHub] [druid] zhangyouxun edited a comment on issue #11341: In the same query, Historical performance is better than Middle Manager, and how to tune Middle Manager

Posted by GitBox <gi...@apache.org>.
zhangyouxun edited a comment on issue #11341:
URL: https://github.com/apache/druid/issues/11341#issuecomment-857318940


   > How many rows are there in the 1-hour interval?
   
   There are 180 million rows in the 1-hour interval.
   
   > For your task configuration, up to 10g of direct memory is available to the JVM, but only (2 numMergeBuffers + 2 numThreads + 1) * 256MiB = 1280MiB is configured for Druid to use.
   
   The physical machine has 56 CPUs, and the Middle Manager runs at most 30 peons.
   Does it mean that the numThreads per peon cannot exceed 2?
   If direct memory is 5g, how should numMergeBuffers and sizeBytes be balanced?
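
The CPU side of this question can be put into numbers. The sketch below only counts how many processing threads would exist if all 30 peons ran with a given `numThreads` on the 56-core machine. It does not model any limit enforced by Druid, and `threads_per_core` is a hypothetical helper.

```python
def threads_per_core(cores: int, peons: int, num_threads: int) -> float:
    """Processing threads per CPU core if every peon uses num_threads."""
    return peons * num_threads / cores


CORES = 56   # CPUs on the physical machine
PEONS = 30   # maximum number of peons on the Middle Manager

for num_threads in (1, 2, 5, 7):
    ratio = threads_per_core(CORES, PEONS, num_threads)
    print(f"numThreads={num_threads}: {PEONS * num_threads} threads, {ratio:.2f} per core")
```

A ratio above 1 means the threads would compete for cores if every peon were busy with queries at the same moment. How often that happens depends on the workload.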
   
   




[GitHub] [druid] FrankChen021 commented on issue #11341: In the same query, Historical performance is better than Middle Manager, and how to tune Middle Manager

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on issue #11341:
URL: https://github.com/apache/druid/issues/11341#issuecomment-864920368


   Have you resolved this problem?




[GitHub] [druid] FrankChen021 commented on issue #11341: In the same query, Historical performance is better than Middle Manager, and how to tune Middle Manager

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on issue #11341:
URL: https://github.com/apache/druid/issues/11341#issuecomment-856867239


   How many rows are there in the 1-hour interval?
   
   For your task configuration, up to 10g of direct memory is available to the JVM, but only (2 numMergeBuffers + 2 numThreads + 1) * 256MiB = 1280MiB is configured for Druid to use.




[GitHub] [druid] zhangyouxun commented on issue #11341: In the same query, Historical performance is better than Middle Manager, and how to tune Middle Manager

Posted by GitBox <gi...@apache.org>.
zhangyouxun commented on issue #11341:
URL: https://github.com/apache/druid/issues/11341#issuecomment-857607114


   The MM physical machine has 250g of memory and 56 CPU cores. If each machine is configured with `druid.worker.capacity` set to 30, can the number of threads per peon be more than 2?



