Posted to user@spark.apache.org by Gourav Sengupta <go...@gmail.com> on 2016/07/22 20:14:20 UTC
Distributed Matrices - spark mllib
Hi,
I have a sparse matrix and I want to add up the values of a particular row,
which is identified by a particular number.
from pyspark.mllib.linalg.distributed import CoordinateMatrix, MatrixEntry
mat = CoordinateMatrix(all_scores_df.select('ID_1','ID_2','value').rdd.map(lambda row: MatrixEntry(*row)))
This gives me the number of rows and columns. But I am not able to extract
the values, and it always reports back the error:
AttributeError: 'NoneType' object has no attribute 'setCallSite'
Thanks and Regards,
Gourav Sengupta
Re: Distributed Matrices - spark mllib
Posted by Yanbo Liang <yb...@gmail.com>.
Hi Gourav,
I cannot reproduce your problem. The following code snippet works well on
my local machine; you can try to verify it in your environment. Or could
you provide more information so that others can reproduce your problem?
from pyspark.mllib.linalg.distributed import CoordinateMatrix, MatrixEntry
l = [(1, 1, 10), (2, 2, 20), (3, 3, 30)]
df = sqlContext.createDataFrame(l, ['row', 'column', 'value'])
rdd = df.select('row', 'column', 'value').rdd.map(lambda row: MatrixEntry(*row))
mat = CoordinateMatrix(rdd)
mat.entries.collect()
Thanks
Yanbo
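Once the entries can be collected, the original goal of adding up the values
of one row might be sketched as below. This is only an illustration under
some assumptions: "add the value of a particular row" is read as summing all
values whose row index matches, the Entry namedtuple is a hypothetical
stand-in for MatrixEntry (which exposes the same i, j and value attributes)
so the sketch runs without Spark, and row_sum is a made-up helper name.

```python
from collections import namedtuple

# Hypothetical stand-in for pyspark.mllib.linalg.distributed.MatrixEntry,
# which exposes the same i, j and value attributes.
Entry = namedtuple("Entry", ["i", "j", "value"])


def row_sum(entries, row_id):
    """Sum the values of all entries whose row index equals row_id."""
    return sum(e.value for e in entries if e.i == row_id)


# Made-up sample data: (row, column, value) triples of a sparse matrix.
entries = [
    Entry(1, 1, 10.0),
    Entry(1, 3, 5.0),
    Entry(2, 2, 20.0),
    Entry(3, 3, 30.0),
]

print(row_sum(entries, 1))

# With a real CoordinateMatrix `mat`, the same filter-then-sum idea would
# run on the distributed entries instead of a local list:
#   mat.entries.filter(lambda e: e.i == 1).map(lambda e: e.value).sum()
```

Filtering on the RDD before summing keeps the work distributed, so only the
final number comes back to the driver rather than the whole matrix.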