Posted to issues@spark.apache.org by "Egor Pakhomov (JIRA)" <ji...@apache.org> on 2014/09/12 15:59:33 UTC
[jira] [Commented] (SPARK-3509) Method for generating random LabeledPoints for testing
[ https://issues.apache.org/jira/browse/SPARK-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131554#comment-14131554 ]
Egor Pakhomov commented on SPARK-3509:
--------------------------------------
So far I have only bad code for my own use, and I need something better. The bad code:
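import scala.util.Random
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

// Assumes a SparkContext named sc is in scope (for example in spark-shell).
def randomRegressionLabeledFeatureSet(size: Int, featureNumber: Int) = {
  // bad code. task for better code - SPARK-3509
  // One seed for the whole data set, so every point uses the same label formula.
  val seed = Random.nextLong()
  sc.parallelize(1 to size, 10).map(i => {
    val features = (1 to featureNumber).map(_ => Random.nextDouble()).toArray
    var seedCopy = seed
    // Fold the features into a label, multiplying or adding depending on
    // what is left of the seed at each step.
    val result = features.reduceLeft((a, b) => {
      if (seedCopy % 3 == 0) {
        seedCopy = seedCopy / 3
        a * b
      } else {
        seedCopy = seedCopy / 2
        a + b
      }
    })
    new LabeledPoint(result, Vectors.dense(features))
  })
}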
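
For comparison, one possible cleaner shape is sketched below. It is only an illustration, not the implementation proposed for SPARK-3509: it swaps the seed-driven multiply/add formula for a plain linear model with Gaussian noise, and the names RandomLabeledPoints, linearPoint, randomRegressionData, noiseStd, numPartitions and the default seed are all hypothetical. It assumes the caller passes in a SparkContext.

```scala
import scala.util.Random

import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Hypothetical helper object, not part of MLlib.
object RandomLabeledPoints {

  // One point: uniform features in [0, 1), label = weights . features + noise.
  def linearPoint(rng: Random, weights: Array[Double], noiseStd: Double): LabeledPoint = {
    val features = Array.fill(weights.length)(rng.nextDouble())
    val signal = features.zip(weights).map { case (x, w) => x * w }.sum
    LabeledPoint(signal + noiseStd * rng.nextGaussian(), Vectors.dense(features))
  }

  def randomRegressionData(
      sc: SparkContext,
      size: Int,
      featureNumber: Int,
      noiseStd: Double = 0.1,
      numPartitions: Int = 10,
      seed: Long = 42L): RDD[LabeledPoint] = {
    // Draw the true weights once on the driver so all partitions share one model.
    val weightRng = new Random(seed)
    val weights = Array.fill(featureNumber)(weightRng.nextDouble() * 2 - 1)

    sc.parallelize(1 to size, numPartitions).mapPartitionsWithIndex { (index, rows) =>
      // One generator per partition, seeded from the partition index, so a
      // recomputed partition yields the same points.
      val rng = new Random(seed + index + 1)
      rows.map(_ => linearPoint(rng, weights, noiseStd))
    }
  }
}
```

Calling RandomLabeledPoints.randomRegressionData(sc, 1000, 5) would then give 1000 points with 5 features each. Seeding per partition, rather than calling the global Random inside map, is what makes the data repeatable across runs and task retries.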
> Method for generating random LabeledPoints for testing
> ------------------------------------------------------
>
> Key: SPARK-3509
> URL: https://issues.apache.org/jira/browse/SPARK-3509
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Affects Versions: 1.2.0
> Reporter: Egor Pakhomov
> Priority: Minor
> Fix For: 1.2.0
>
>
> During testing I need random LabeledPoints with some correlation behind them.