Posted to dev@drill.apache.org by paul-rogers <gi...@git.apache.org> on 2017/10/01 01:33:23 UTC

[GitHub] drill issue #958: DRILL-5808: Reduce memory allocator strictness for "manage...

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/958
  
    @Ben-Zvi, you are right that, in the worst case, this change will allow operators to exceed the memory allotment. But that is actually the purpose.
    
    As we know, it is *very* difficult to get memory management just right at present due to the wildly varying memory layouts for vectors, power-of-two rounding of buffer sizes, unexpected doubling of vectors, and lack of control over the size of incoming batches. We'd love to fix these, but doing so will take time.
    
    In the meantime, we have the choice of failing queries because the calculations are off by a bit, or being more flexible and letting queries succeed at the risk of running out of memory. The change here does log each "excess" allocation so we can find them and fix any remaining issues. Also, in a test environment, strict limits are enforced to find bugs.
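    
    To make the trade-off concrete, here is a minimal sketch of the policy described above. It is not Drill's actual allocator code: the class `LenientLimitSketch`, its fields, and the `strict` flag are hypothetical names chosen for illustration. The only behavior taken from the discussion is that an over-limit request is logged and allowed in the lenient mode, and rejected in the strict (test) mode.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these names are NOT Drill's allocator API.
class LenientLimitSketch {
    private final long limit;      // the operator's memory allotment, in bytes
    private final boolean strict;  // assumed flag: true in a test environment
    private long allocated;        // bytes currently accounted for

    // Stand-in for a real logger: one entry per "excess" allocation.
    final List<String> excessLog = new ArrayList<>();

    LenientLimitSketch(long limit, boolean strict) {
        this.limit = limit;
        this.strict = strict;
    }

    // Account for a request of `size` bytes against the allotment.
    void allocate(long size) {
        long newTotal = allocated + size;
        if (newTotal > limit) {
            if (strict) {
                // Strict mode: fail fast so the sizing bug is found in testing.
                throw new IllegalStateException(
                    "Allocation of " + size + " bytes exceeds limit " + limit);
            }
            // Lenient mode: let the query proceed, but record the excess
            // so the offending calculation can be found and fixed later.
            excessLog.add("Excess allocation: requested " + size
                + ", total " + newTotal + ", limit " + limit);
        }
        allocated = newTotal;
    }

    long allocated() {
        return allocated;
    }
}
```

    With a limit of 100 bytes, an 80-byte request followed by a 40-byte request succeeds in the lenient mode and leaves one log entry, while the same sequence in the strict mode throws on the second request. The sketch places no upper bound on the lenient mode, which is exactly the risk being weighed here.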
    
    All of this is set against the backdrop of the exchange operators, hash join, and other operators that have an unlimited appetite for memory. Until we rein in those operators, it seems silly to kill user queries because the operators that *do* manage memory make a small mistake here or there.
    
    Once all operators are reined in, and Drill's internal memory allocation is under better control, we can back out this change and be much more strict about enforcing memory limits.
    
    Bottom line: should we fail user queries because of remaining rough spots in the "managed" operators? Or, should we allow user queries to succeed at a very small additional risk of running out of memory?


---