Posted to user@spark.apache.org by "颜发才 (Yan Facai)" <ya...@gmail.com> on 2016/09/06 09:56:46 UTC

How to convert String to Vector ?

Hi,
I have a CSV file like:
uid    mid     features          label
123    5231    [0, 1, 3, ...]    True

Both the "features" and "label" columns are used for GBTClassifier.

However, when I read the file:
Dataset<Row> samples = sparkSession.read().csv(file);
the type of the "features" column in samples is String.

My question is:
How can I map samples.select("features") to a Vector or another appropriate type,
so that I can use it for training, like this:
        GBTClassifier gbdt = new GBTClassifier()
                .setLabelCol("label")
                .setFeaturesCol("features")
                .setMaxIter(2)
                .setMaxDepth(7);


Thanks.

Re: clear steps for installation of spark, cassandra and cassandra connector to run on spyder 2.3.7 using python 3.5 and anaconda 2.4 ipython 4.0

Posted by ayan guha <gu...@gmail.com>.
Spark has pretty extensive documentation; that should be your starting
point. I do not use Cassandra much, but the Cassandra connector should be a
Spark package, so look for it on the Spark Packages website.

If I may say so, all the docs should be one or two Google searches away :)
On 6 Sep 2016 20:34, "muhammet pakyürek" <mp...@hotmail.com> wrote:

> Could you send me documents and links covering all of the above
> installation requirements for Spark, Cassandra and the Cassandra connector,
> to run on Spyder 2.3.7 using Python 3.5, Anaconda 2.4 and IPython 4.0?

clear steps for installation of spark, cassandra and cassandra connector to run on spyder 2.3.7 using python 3.5 and anaconda 2.4 ipython 4.0

Posted by muhammet pakyürek <mp...@hotmail.com>.

Could you send me documents and links covering all of the above installation requirements for Spark, Cassandra and the Cassandra connector, to run on Spyder 2.3.7 using Python 3.5, Anaconda 2.4 and IPython 4.0?




Re: How to convert String to Vector ?

Posted by Leonard Cohen <34...@qq.com>.
hi,


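In Scala, strip the brackets, then split on commas:
map(feature => feature.stripPrefix("[").stripSuffix("]").split(',').map(_.trim.toDouble))
In Python:
[float(x) for x in string.strip('[]').split(',')]
or:
ast.literal_eval(string)   # needs "import ast"; safer than eval(string) on untrusted input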




http://stackoverflow.com/questions/31376574/spark-rddstring-string-into-rddmapstring-string
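
Since the original question uses Java, the split-and-convert idea above might look like the following sketch. FeatureParser and its parse method are hypothetical names, not part of Spark, and the input is assumed to be a bracketed, comma-separated list of numbers such as "[0, 1, 3]". The helper only produces a double[]; in a Spark job that array could then be passed to Vectors.dense(double[]) from org.apache.spark.ml.linalg to build the feature vector.

```java
import java.util.Arrays;

/** Hypothetical helper (not a Spark API): parses text like "[0, 1, 3]" into a double[]. */
final class FeatureParser {
    private FeatureParser() {}

    static double[] parse(String raw) {
        String s = raw.trim();
        // Drop the surrounding brackets if both are present.
        if (s.startsWith("[") && s.endsWith("]")) {
            s = s.substring(1, s.length() - 1);
        }
        s = s.trim();
        if (s.isEmpty()) {
            return new double[0];
        }
        // Split on commas and convert each token.
        // Double.parseDouble throws NumberFormatException on malformed input.
        return Arrays.stream(s.split(","))
                .map(String::trim)
                .mapToDouble(Double::parseDouble)
                .toArray();
    }
}
```

For example, FeatureParser.parse("[0, 1, 3]") yields an array holding 0.0, 1.0 and 3.0. Malformed tokens fail fast with an exception rather than being skipped silently, which is usually preferable when preparing training data.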



