Posted to issues@spark.apache.org by "bofei.xiao (JIRA)" <ji...@apache.org> on 2015/06/05 03:41:38 UTC

[jira] [Updated] (SPARK-8096) how to convert dataframe field to label and features

     [ https://issues.apache.org/jira/browse/SPARK-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

bofei.xiao updated SPARK-8096:
------------------------------
    Description: 
How can I convert a DataFrame to RDD[LabeledPoint]?
The DataFrame has the fields target, age, sex, height.
I want to use target as the label and "age, sex, height" as the features vector.

I faced this problem in the following circumstance:
--------------------------------------------------
Given a CSV file data.csv:
target,age,sex,height
1,18,1,170
0,25,1,165
.....

Now I want to build a decision tree model.
Step 1: load the CSV data as a DataFrame
val data = sqlContext.load("com.databricks.spark.csv", Map("path" -> "data.csv", "header" -> "true"))

Step 2: build a decision tree model
But DecisionTree needs an RDD[LabeledPoint] as input.
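
One way this conversion might be done, as a minimal sketch rather than a confirmed answer: select the four columns in a fixed order, take the DataFrame's underlying RDD[Row], and map each row to a LabeledPoint. The object and method names (DataFrameToLabeledPoint, asDouble, toLabeledPoint, toLabeledPoints) are hypothetical. The sketch assumes every column is non-null and parses as a number; spark-csv loads columns as strings by default, so each value is parsed explicitly.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row}

// Hypothetical helper object; not part of Spark.
object DataFrameToLabeledPoint {

  // Read column i as a Double whether it was loaded as a string or a number.
  // Assumption: the value is non-null and numeric.
  private def asDouble(row: Row, i: Int): Double =
    row.get(i).toString.toDouble

  // Expects the column order (target, age, sex, height).
  def toLabeledPoint(row: Row): LabeledPoint =
    LabeledPoint(
      asDouble(row, 0), // label: target
      Vectors.dense(asDouble(row, 1), asDouble(row, 2), asDouble(row, 3)))

  // select(...) fixes the column order before the positional reads above.
  def toLabeledPoints(data: DataFrame): RDD[LabeledPoint] =
    data.select("target", "age", "sex", "height").rdd.map(toLabeledPoint)
}
```

The resulting RDD could then be passed to MLlib's DecisionTree.trainClassifier together with the usual tree parameters (numClasses, categoricalFeaturesInfo, impurity, maxDepth, maxBins).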

thanks!

  was:
Given a CSV file data.csv:
target,age,sex,height
1,18,1,170
0,25,1,165
.....

Now I want to build a decision tree model.
Step 1: load the CSV data as a DataFrame
val data = sqlContext.load("com.databricks.spark.csv", Map("path" -> "data.csv", "header" -> "true"))

Step 2: build a decision tree model
But DecisionTree needs an RDD[LabeledPoint] as input.

Q: How can I convert the DataFrame to RDD[LabeledPoint]?

thanks!

        Summary: how to convert dataframe field to label and features  (was: use csv data to build a classification model,how to convert dataframe field to label and features)

> how to convert dataframe field to label and features
> ----------------------------------------------------
>
>                 Key: SPARK-8096
>                 URL: https://issues.apache.org/jira/browse/SPARK-8096
>             Project: Spark
>          Issue Type: Bug
>            Reporter: bofei.xiao


