Posted to issues@spark.apache.org by "Danilo Bustos Pérez (JIRA)" <ji...@apache.org> on 2016/06/25 07:56:41 UTC

[jira] [Created] (SPARK-16206) Defining our own folds using CrossValidator

Danilo Bustos Pérez created SPARK-16206:
-------------------------------------------

             Summary: Defining our own folds using CrossValidator
                 Key: SPARK-16206
                 URL: https://issues.apache.org/jira/browse/SPARK-16206
             Project: Spark
          Issue Type: Wish
          Components: ML
    Affects Versions: 1.6.2
            Reporter: Danilo Bustos Pérez
            Priority: Trivial


I have been using cross-validation to train a Naive Bayes model, and I noticed that CrossValidator uses the kFold method to randomly sample the data when it creates the folds. This method returns an Array[(RDD[T], RDD[T])] of tuples, which I believe are the (training, testing) pairs, one for each combination of folds.
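
To make the shape of that return value concrete, here is a minimal sketch in plain Python, with no Spark dependency. The function name k_fold and its parameters are hypothetical and only illustrate the idea; this is not Spark's implementation, and plain lists stand in for RDDs. Each row is randomly assigned to one fold, and each fold in turn serves as the test set while the remaining rows form the training set.

```python
import random


def k_fold(rows, num_folds, seed=0):
    """Return a list of (training, testing) pairs, one pair per fold.

    Hypothetical helper for illustration only; not part of Spark.
    """
    rng = random.Random(seed)
    # Randomly assign every row to exactly one fold.
    assignments = [rng.randrange(num_folds) for _ in rows]
    pairs = []
    for fold in range(num_folds):
        # Rows assigned to this fold are held out for testing.
        testing = [r for r, a in zip(rows, assignments) if a == fold]
        # All remaining rows are used for training.
        training = [r for r, a in zip(rows, assignments) if a != fold]
        pairs.append((training, testing))
    return pairs


pairs = k_fold(list(range(10)), num_folds=3)
```

Because the assignment is random, the caller has no control over which rows end up together in a fold, which is the limitation this issue is about.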

My question is whether there is any specific reason why the API does not allow you to define your own array of folds. I think supporting this capability would be a good idea; it would help a lot.
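
As an illustration of the requested capability, the following plain-Python sketch builds the (training, testing) pairs from fold ids supplied by the caller instead of from random sampling. The name folds_from_assignment and the fold_ids argument are assumptions made for this example; nothing like them exists in the CrossValidator API described here.

```python
def folds_from_assignment(rows, fold_ids):
    """Build (training, testing) pairs from caller-supplied fold ids.

    Hypothetical helper for illustration only; not part of Spark.
    """
    if len(rows) != len(fold_ids):
        raise ValueError("rows and fold_ids must have the same length")
    pairs = []
    # One pair per distinct fold id, in ascending order of id.
    for fold in sorted(set(fold_ids)):
        testing = [r for r, f in zip(rows, fold_ids) if f == fold]
        training = [r for r, f in zip(rows, fold_ids) if f != fold]
        pairs.append((training, testing))
    return pairs


rows = ["a", "b", "c", "d", "e", "f"]
fold_ids = [0, 0, 1, 1, 2, 2]  # the caller decides which rows share a fold
custom_pairs = folds_from_assignment(rows, fold_ids)
```

With this kind of input, rows that must stay together (for example, records from the same user or time window) can be kept in the same fold.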

Please refer to: http://stackoverflow.com/questions/37868984/why-we-can-not-define-our-own-folds-when-we-are-using-crossvalidator



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org