Posted to issues@spark.apache.org by "Liang-Chi Hsieh (JIRA)" <ji...@apache.org> on 2017/02/20 09:00:53 UTC

[jira] [Created] (SPARK-19665) Improve constraint propagation

Liang-Chi Hsieh created SPARK-19665:
---------------------------------------

             Summary: Improve constraint propagation
                 Key: SPARK-19665
                 URL: https://issues.apache.org/jira/browse/SPARK-19665
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.1.0
            Reporter: Liang-Chi Hsieh


If the projection contains aliased expressions, we propagate constraints by fully expanding the original constraints with every alias.

This expansion takes a lot of computation time as the number of aliases increases.
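To illustrate how quickly the expanded set can grow, here is a minimal toy model in Python. It is not Spark's Catalyst code: the function name expand_constraints, the tuple representation of a constraint, and the alias names are all assumptions made for illustration. Each attribute in a constraint may be replaced by any of its aliases, so one binary constraint yields one variant per combination.

```python
from itertools import product


def expand_constraints(constraints, aliases):
    """Toy model (hypothetical, not Spark's implementation).

    constraints: iterable of (lhs, op, rhs) tuples of attribute names.
    aliases: dict mapping an attribute name to a list of its alias names.
    Returns the set of all variants obtained by substituting aliases.
    """
    expanded = set()
    for lhs, op, rhs in constraints:
        # Each side can stay as-is or be replaced by any of its aliases.
        lhs_names = [lhs] + aliases.get(lhs, [])
        rhs_names = [rhs] + aliases.get(rhs, [])
        for left, right in product(lhs_names, rhs_names):
            expanded.add((left, op, right))
    return expanded


if __name__ == "__main__":
    constraints = [("a", ">", "b")]
    for n in (1, 2, 4, 8):
        aliases = {
            "a": [f"a{i}" for i in range(n)],
            "b": [f"b{i}" for i in range(n)],
        }
        # With n aliases per side, one constraint becomes (n + 1) ** 2 variants.
        print(n, len(expand_constraints(constraints, aliases)))
```

In this model a single constraint over two attributes with n aliases each grows to (n + 1) ** 2 variants, and the work repeats for every constraint in the plan.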

Another issue is that we don't actually need the additional constraints most of the time. For example, suppose there is a constraint "a > b", and "a" is aliased to "c" and "d". When we use this constraint in filtering, we don't need all of the constraints "a > b", "c > b", "d > b". We only need "a > b", because if it is false, all the other constraints are guaranteed to be false too.
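The redundancy in the example above can be checked with a small Python sketch. The sample rows and the helper names project_with_aliases and keep_rows are hypothetical. The sketch only models "c" and "d" as copies of "a" and does not use any Spark API.

```python
def project_with_aliases(rows):
    """Model a projection of `a AS c, a AS d`: c and d are copies of a."""
    return [dict(row, c=row["a"], d=row["a"]) for row in rows]


def keep_rows(rows, constraints):
    """Keep rows where every (left, right) pair satisfies left > right."""
    return [
        row for row in rows
        if all(row[left] > row[right] for left, right in constraints)
    ]


# Hypothetical sample data.
rows = project_with_aliases([
    {"a": 5, "b": 3},
    {"a": 1, "b": 2},
    {"a": 2, "b": 2},
])

original_only = [("a", "b")]
fully_expanded = [("a", "b"), ("c", "b"), ("d", "b")]

kept_original = keep_rows(rows, original_only)
kept_expanded = keep_rows(rows, fully_expanded)

# Filtering on "a > b" alone selects exactly the same rows.
print(kept_original == kept_expanded)
```

Because "c" and "d" always carry the same value as "a", the two extra constraints never change which rows pass the filter.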

Fully expanding all constraints every time makes iterative ML algorithms, where an ML pipeline has many stages, run very slowly.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
