Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2022/01/11 15:20:56 UTC

[GitHub] [druid] rohangarg commented on pull request #12139: Limit the subquery results by memory usage (estimated)

rohangarg commented on pull request #12139:
URL: https://github.com/apache/druid/pull/12139#issuecomment-1010068322


   Some thoughts:
   1. For performance, I'd suggest also benchmarking the complete `estimateResultRowSize` function as a function of (numRows, numCols, sizeOfCols) to measure its impact in isolation. Currently we have a 100k limit on subquery rows, so in every successful case we'd only be measuring the size of up to 100k rows by default. The benchmark may also help in choosing the default parameters we might have to set (such as the 'n' for sampling, if needed). Beyond that, we could consider different strategies for fixed-width and variable-width columns, or even caching the estimated size per subquery to help when the same subquery runs concurrently.
   2. Should we name the config `maxSubqueryResultMemory` to make it clearer and more tightly scoped?
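
To make the benchmarking suggestion in point 1 concrete, here is a rough sketch of timing a size estimator across (numRows, numCols, sizeOfCols), with and without sampling every n-th row. Everything in it is an assumption for illustration: `RowSizeEstimateSketch`, `estimateRowBytes`, `estimateTotalBytes`, and `makeRows` are hypothetical names, the per-value cost model (UTF-8 bytes for strings, 8 bytes for any other non-null value) is made up, and none of it is the PR's actual `estimateResultRowSize`. A real measurement would likely use a proper harness such as JMH rather than `System.nanoTime`.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the estimator from the PR.
final class RowSizeEstimateSketch {

    // Assumed cost model: UTF-8 length for strings, 8 bytes for any other non-null value.
    static long estimateRowBytes(Object[] row) {
        long bytes = 0;
        for (Object value : row) {
            if (value instanceof String) {
                bytes += ((String) value).getBytes(StandardCharsets.UTF_8).length;
            } else if (value != null) {
                bytes += Long.BYTES;
            }
        }
        return bytes;
    }

    // Measures every n-th row and scales the sampled total up to the full row count.
    static long estimateTotalBytes(List<Object[]> rows, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        long sampledBytes = 0;
        int sampledRows = 0;
        for (int i = 0; i < rows.size(); i += n) {
            sampledBytes += estimateRowBytes(rows.get(i));
            sampledRows++;
        }
        return sampledRows == 0 ? 0 : sampledBytes * rows.size() / sampledRows;
    }

    // Builds numRows rows of numCols string columns, each sizeOfCols ASCII characters wide.
    static List<Object[]> makeRows(int numRows, int numCols, int sizeOfCols) {
        char[] chars = new char[sizeOfCols];
        Arrays.fill(chars, 'x');
        String value = new String(chars);
        List<Object[]> rows = new ArrayList<>(numRows);
        for (int r = 0; r < numRows; r++) {
            Object[] row = new Object[numCols];
            Arrays.fill(row, value);
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        int[] rowCounts = {1_000, 100_000};
        int[] colCounts = {4, 16};
        int[] colSizes = {8, 64};
        int[] sampleSteps = {1, 10}; // n = 1 means no sampling
        for (int numRows : rowCounts) {
            for (int numCols : colCounts) {
                for (int sizeOfCols : colSizes) {
                    List<Object[]> rows = makeRows(numRows, numCols, sizeOfCols);
                    for (int n : sampleSteps) {
                        long start = System.nanoTime();
                        long estimate = estimateTotalBytes(rows, n);
                        long micros = (System.nanoTime() - start) / 1_000;
                        System.out.printf("rows=%d cols=%d width=%d n=%d estimate=%d bytes time=%d us%n",
                            numRows, numCols, sizeOfCols, n, estimate, micros);
                    }
                }
            }
        }
    }
}
```

With uniform rows like these, the sampled estimate matches the full scan exactly; with variable-width columns it would only approximate it, which is part of what such a benchmark could help quantify when picking 'n'.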


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
