Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/04/23 14:04:38 UTC
[jira] [Resolved] (SPARK-7091) Too slow when use GradientBoostedTrees to classify train data set.
[ https://issues.apache.org/jira/browse/SPARK-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-7091.
------------------------------
Resolution: Invalid
Please ask questions at user@spark.apache.org
> Too slow when use GradientBoostedTrees to classify train data set.
> ------------------------------------------------------------------
>
> Key: SPARK-7091
> URL: https://issues.apache.org/jira/browse/SPARK-7091
> Project: Spark
> Issue Type: Question
> Components: MLlib
> Affects Versions: 1.3.1
> Reporter: lee.xiaobo.2006
>
> One stage takes far too long. The training data set has shape 1M x 40K (rows x features). Can anyone help?
> Description: collectAsMap at DecisionTree.scala:642 | Submitted: 2015/04/23 18:12:37 | Duration: 38 min | Stages (succeeded/total): 2/2 (1 skipped) | Tasks (succeeded/total): 228/228 (4 skipped)
> The call stack is:
> org.apache.spark.rdd.RDD.mapPartitions(RDD.scala:641)
> org.apache.spark.mllib.tree.DecisionTree$.findBestSplits(DecisionTree.scala:613)
> org.apache.spark.mllib.tree.RandomForest.run(RandomForest.scala:234)
> org.apache.spark.mllib.tree.DecisionTree.run(DecisionTree.scala:60)
> org.apache.spark.mllib.tree.GradientBoostedTrees$.org$apache$spark$mllib$tree$GradientBoostedTrees$$boost(GradientBoostedTrees.scala:194)
> org.apache.spark.mllib.tree.GradientBoostedTrees.run(GradientBoostedTrees.scala:67)
> org.apache.spark.mllib.tree.GradientBoostedTrees$.train(GradientBoostedTrees.scala:135)
> org.apache.spark.mllib.api.python.PythonMLLibAPI.trainGradientBoostedTreesModel(PythonMLLibAPI.scala:644)
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> java.lang.reflect.Method.invoke(Method.java:606)
> py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> py4j.Gateway.invoke(Gateway.java:259)
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> py4j.commands.CallCommand.execute(CallCommand.java:79)
> py4j.GatewayConnection.run(GatewayConnection.java:207)
> java.lang.Thread.run(Thread.java:724)
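
For a sense of scale, the reported shape can be turned into a rough size estimate. The sketch below is only a back-of-the-envelope calculation, not a diagnosis of this job. It assumes that "1M*40K" means 1,000,000 rows by 40,000 features, that the data is dense with 8-byte floating-point values, and that each feature is discretized into 32 bins. None of these assumptions is stated in the report, and the function names are hypothetical.

```python
# Back-of-the-envelope sizing for the reported 1M x 40K training set.
# Assumptions (not stated in the report): dense data, 8-byte values,
# 32 bins per feature.

def dense_size_bytes(rows: int, cols: int, bytes_per_value: int = 8) -> int:
    """Bytes needed to hold a dense rows x cols matrix."""
    return rows * cols * bytes_per_value


def bins_per_node(cols: int, max_bins: int = 32) -> int:
    """Number of (feature, bin) pairs considered for one tree node."""
    return cols * max_bins


ROWS = 1_000_000  # "1M" in the report, read as 10**6
COLS = 40_000     # "40K" in the report, read as 40 * 10**3

total_bytes = dense_size_bytes(ROWS, COLS)
node_bins = bins_per_node(COLS)

print(f"dense size: {total_bytes / 1e9:.0f} GB")
print(f"(feature, bin) pairs per node: {node_bins:,}")
```

Under these assumptions the dense matrix alone is about 320 GB, and each node has over a million (feature, bin) pairs to evaluate. A sparse representation or fewer features would change both figures substantially.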