Posted to issues@spark.apache.org by "Vladimir Feinberg (JIRA)" <ji...@apache.org> on 2016/08/05 17:14:20 UTC

[jira] [Created] (SPARK-16920) Investigate and fix issues introduced in SPARK-15858

Vladimir Feinberg created SPARK-16920:
-----------------------------------------

             Summary: Investigate and fix issues introduced in SPARK-15858
                 Key: SPARK-16920
                 URL: https://issues.apache.org/jira/browse/SPARK-16920
             Project: Spark
          Issue Type: Bug
          Components: MLlib
            Reporter: Vladimir Feinberg


There were several issues with the PR that resolved SPARK-15858. My comments are available here:

https://github.com/apache/spark/commit/393db655c3c43155305fbba1b2f8c48a95f18d93

The two most important issues are:

1. The PR did not add a stress test demonstrating that it resolved the issue it targeted (though I have no doubt that the optimization itself is correct).
2. The PR made prediction time quadratic in the number of trees; it was previously linear. We need to investigate whether this causes problems for large numbers of trees (say, 1000), add an appropriate test, and then fix the regression.
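
To illustrate the second point, here is a minimal sketch of the kind of test that could catch such a regression. It does not reproduce the Spark code: CountingTree, staged_predictions_quadratic, and staged_predictions_linear are hypothetical names, and it assumes the quadratic cost comes from re-summing the first i trees from scratch for every iteration i, rather than keeping a running sum.

```python
from typing import List, Sequence


class CountingTree:
    """Hypothetical stand-in for a fitted tree; counts how often it is evaluated."""

    def __init__(self, weight: float) -> None:
        self.weight = weight
        self.calls = 0

    def __call__(self, x: float) -> float:
        self.calls += 1
        return self.weight * x


def staged_predictions_quadratic(trees: Sequence[CountingTree], x: float) -> List[float]:
    # Assumed regression: the prefix sum is recomputed for every iteration,
    # costing 1 + 2 + ... + n = n * (n + 1) / 2 tree evaluations.
    return [sum(t(x) for t in trees[: i + 1]) for i in range(len(trees))]


def staged_predictions_linear(trees: Sequence[CountingTree], x: float) -> List[float]:
    # Running sum: each tree is evaluated exactly once, n evaluations in total.
    staged = []
    running = 0.0
    for t in trees:
        running += t(x)
        staged.append(running)
    return staged


def total_calls(trees: Sequence[CountingTree]) -> int:
    return sum(t.calls for t in trees)


def make_trees(n: int) -> List[CountingTree]:
    # A weight of 0.5 keeps every partial sum exactly representable as a float.
    return [CountingTree(0.5) for _ in range(n)]


if __name__ == "__main__":
    n = 1000
    slow, fast = make_trees(n), make_trees(n)
    assert staged_predictions_quadratic(slow, 2.0) == staged_predictions_linear(fast, 2.0)
    print("quadratic evaluations:", total_calls(slow))
    print("linear evaluations:", total_calls(fast))
```

Counting tree evaluations instead of measuring wall-clock time keeps such a test deterministic; a real test would apply the same idea to the actual prediction path.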



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
