Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/03/31 06:34:09 UTC

[GitHub] [spark] tanelk commented on pull request #30965: [SPARK-33935][SQL] Fix CBO cost function

tanelk commented on pull request #30965:
URL: https://github.com/apache/spark/pull/30965#issuecomment-810810870


   @wzhfy and @cloud-fan 
   
   I'm not a fan of adding up the relative costs.
   
   A simple example, where the weight is 0.5:
   If this plan's size (bytes) is 2x larger, then no matter how many times more rows the other plan has, the other plan will always be considered better: `0.5*2 + 0.5*0.00000000000001 > 1`.
   This is basically the same situation as before, where one cost overwhelms the other.
   
   Perhaps this would be the best of both worlds:
   `(this.card / other.card) ^ cardWeight * (this.size / other.size) ^ (1 - cardWeight) < 1`.
   In short - multiply the relative costs instead of adding them.
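
To make the difference concrete, here is a minimal Python sketch of the two comparisons, not the actual Spark code. The function names `additive_better` and `multiplicative_better` and the sample numbers are hypothetical; only the two formulas come from the discussion above.

```python
def additive_better(this_card, this_size, other_card, other_size, weight):
    """Weighted sum of relative costs; True means 'this' plan is considered cheaper."""
    relative = (weight * (this_card / other_card)
                + (1 - weight) * (this_size / other_size))
    return relative < 1


def multiplicative_better(this_card, this_size, other_card, other_size, weight):
    """Weighted product (geometric mean) of relative costs; True means 'this' is cheaper."""
    relative = ((this_card / other_card) ** weight
                * (this_size / other_size) ** (1 - weight))
    return relative < 1


# Hypothetical plans: 'this' is 2x larger in bytes but has a million times fewer rows.
this_card, this_size = 1_000, 2_000_000
other_card, other_size = 1_000_000_000, 1_000_000
weight = 0.5

# The size term alone already reaches 1, so the row-count advantage is ignored.
sum_result = additive_better(this_card, this_size, other_card, other_size, weight)

# The product lets the large row-count advantage outweigh the 2x size penalty.
product_result = multiplicative_better(this_card, this_size, other_card, other_size, weight)

print(sum_result, product_result)  # False True
```

With these numbers the additive form rejects this plan while the multiplicative form prefers it, and swapping the two plans flips the multiplicative result, so the comparison stays consistent in both directions.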
   
   
   

