Posted to issues@spark.apache.org by "Xiangrui Meng (JIRA)" <ji...@apache.org> on 2016/03/22 23:45:25 UTC

[jira] [Updated] (SPARK-14084) Parallel training jobs in model selection

     [ https://issues.apache.org/jira/browse/SPARK-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-14084:
----------------------------------
    Description: In CrossValidator and TrainValidationSplit, we run training jobs one by one. If users have a big cluster, they might see speed-ups if we parallelize the job submission on the driver. The trade-off is that we might need to make multiple copies of the training data, which could be expensive. It is worth testing and figuring out the best way to implement it.  (was: In CrossValidator and TrainValidationSplit, we run training jobs one by one. If users have a big cluster, they might see speed-ups if we parallelize the jobs. The trade-off is that we might need to make multiple copies of the training data, which could be expensive. It is worth testing and figuring out the best way to implement it.)

> Parallel training jobs in model selection
> -----------------------------------------
>
>                 Key: SPARK-14084
>                 URL: https://issues.apache.org/jira/browse/SPARK-14084
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.0.0
>            Reporter: Xiangrui Meng
>
> In CrossValidator and TrainValidationSplit, we run training jobs one by one. If users have a big cluster, they might see speed-ups if we parallelize the job submission on the driver. The trade-off is that we might need to make multiple copies of the training data, which could be expensive. It is worth testing and figuring out the best way to implement it.
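
To make the idea concrete, here is a rough sketch of what "parallelize the job submission on the driver" could look like. It is not Spark code and not a proposed implementation: train_and_score is a hypothetical stand-in for one training job plus evaluation, and select_model and its parallelism argument are names made up for this illustration. The sketch assumes all jobs can share one reference to the training data, so it says nothing about the data-copy trade-off mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def train_and_score(params, train, valid):
    """Hypothetical stand-in for one training job followed by evaluation."""
    time.sleep(0.05)  # pretend the driver is waiting on the cluster
    # Toy metric (lower is better), smallest when reg == 0.1.
    return abs(params["reg"] - 0.1)


def select_model(param_grid, train, valid, parallelism=1):
    """Score every parameter setting, submitting up to `parallelism` jobs at once."""
    # The driver-side threads only submit jobs and wait for results;
    # in a real cluster the heavy work would run on the executors.
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # map() returns results in the same order as param_grid.
        scores = list(pool.map(lambda p: train_and_score(p, train, valid), param_grid))
    best = min(range(len(param_grid)), key=scores.__getitem__)
    return param_grid[best], scores


if __name__ == "__main__":
    grid = [{"reg": r} for r in (0.01, 0.1, 1.0)]
    print(select_model(grid, train=None, valid=None, parallelism=3))
```

With parallelism=1 this reduces to the current one-by-one behaviour, which makes the two modes easy to compare when testing for speed-ups.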


