Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2017/02/03 07:28:51 UTC

[jira] [Commented] (SPARK-19443) The function to generate constraints takes too long when the query plan grows continuously

    [ https://issues.apache.org/jira/browse/SPARK-19443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851181#comment-15851181 ] 

Apache Spark commented on SPARK-19443:
--------------------------------------

User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/16785

> The function to generate constraints takes too long when the query plan grows continuously
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-19443
>                 URL: https://issues.apache.org/jira/browse/SPARK-19443
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Liang-Chi Hsieh
>
> This issue was originally reported and discussed at http://apache-spark-developers-list.1001551.n3.nabble.com/SQL-ML-Pipeline-performance-regression-between-1-6-and-2-x-tc20803.html
> When running an ML `Pipeline` with many stages, the `Dataset` is updated iteratively, and fit and transform are observed to take longer to finish as the query plan grows continuously.
> Specifically, the time spent preparing the optimized plan on the current branch (74294 ms) is much higher than on 1.6 (292 ms). Most of that time is spent generating the query plan's constraints during a few optimization rules.



