
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/13 18:58:48 UTC

[GitHub] [spark] skambha opened a new pull request #24593: [SPARK-27692][SQL] Add new optimizer rule to evaluate the deterministic scala udf only once if all inputs are literals

skambha opened a new pull request #24593: [SPARK-27692][SQL] Add new optimizer rule to evaluate the deterministic scala udf only once if all inputs are literals
URL: https://github.com/apache/spark/pull/24593
 
 
   Description: 
   A deterministic UDF is a UDF that, given a specific input, always returns the same output no matter how many times it is executed.
   
   When all inputs to a UDF are literals and the UDF is deterministic, we can evaluate the UDF once during optimization and reuse its output, instead of evaluating the UDF again for every row in the query.
   
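   This is valid only if the UDF is deterministic and all of its inputs are literals. Otherwise the optimization must not be applied: a non-deterministic UDF may return a different value for each row, and non-literal inputs are not known at optimization time.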
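
The condition above can be sketched on a toy expression tree. This is only an illustration of the idea, not the rule added in this pull request: `Expr`, `Lit`, `ColumnRef`, `Udf`, and `FoldLiteralUdf` are hypothetical names made up for the example and are not Spark Catalyst classes.

```scala
// Toy expression tree. These types are illustrative assumptions,
// NOT Spark's Catalyst classes.
sealed trait Expr
case class Lit(value: Any) extends Expr
case class ColumnRef(name: String) extends Expr
case class Udf(
    name: String,
    f: Seq[Any] => Any,
    deterministic: Boolean,
    args: Seq[Expr]) extends Expr

object FoldLiteralUdf {
  // Rewrite bottom-up: a deterministic UDF whose arguments are all
  // literals is evaluated once and replaced by a literal holding the
  // result. Any other expression is returned unchanged.
  def apply(e: Expr): Expr = e match {
    case u: Udf =>
      // Fold nested UDF calls first so their results can feed the parent.
      val folded = u.copy(args = u.args.map(apply))
      val literalArgs = folded.args.collect { case Lit(v) => v }
      val allLiteral = literalArgs.size == folded.args.size
      if (folded.deterministic && allLiteral) Lit(folded.f(literalArgs))
      else folded
    case other => other
  }
}
```

In this sketch, a deterministic `Udf` applied to `Seq(Lit(1))` is replaced by a single `Lit` holding the result, so the function runs once rather than once per row. A `Udf` marked `deterministic = false`, or one that takes a `ColumnRef`, is left as it was.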
   
   Changes: 
   - Add a new optimizer rule that evaluates a ScalaUDF once if it is deterministic and all of its inputs are literals.
   
   Testing: 
   - Added new unit tests
   
   Credits: 
   Thanks to [Guy Khazma](https://github.com/guykhazma) from the IBM Haifa Research Team for the idea and the original implementation. 
