Posted to issues@spark.apache.org by "Wenmin Wu (JIRA)" <ji...@apache.org> on 2015/09/16 04:24:45 UTC

[jira] [Updated] (SPARK-10629) Gradient boosted trees: mapPartitions input size increasing

     [ https://issues.apache.org/jira/browse/SPARK-10629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenmin Wu updated SPARK-10629:
------------------------------
    Description: 
First of all, I think my problem is quite different from https://issues.apache.org/jira/browse/SPARK-10433, which points out that the input size increases at each iteration.

My problem is that the mapPartitions input size increases within a single iteration. My training samples have 2958359 features in total. Within one iteration, three collectAsMap operations are called. Here is a summary of each call.

stage ID 4 mapPartitions at DecisionTree.scala:613 

> Gradient boosted trees: mapPartitions input size increasing 
> ------------------------------------------------------------
>
>                 Key: SPARK-10629
>                 URL: https://issues.apache.org/jira/browse/SPARK-10629
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 1.4.1
>            Reporter: Wenmin Wu
>
> First of all, I think my problem is quite different from https://issues.apache.org/jira/browse/SPARK-10433, which points out that the input size increases at each iteration.
> My problem is that the mapPartitions input size increases within a single iteration. My training samples have 2958359 features in total. Within one iteration, three collectAsMap operations are called. Here is a summary of each call.
> stage ID 4 mapPartitions at DecisionTree.scala:613 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
