Posted to user@spark.apache.org by Gourav Sengupta <go...@gmail.com> on 2016/07/22 20:14:20 UTC

Distributed Matrices - spark mllib

Hi,

I have a sparse matrix and I want to add up the values of a particular row,
which is identified by a particular number.

from pyspark.mllib.linalg.distributed import CoordinateMatrix, MatrixEntry
mat = CoordinateMatrix(
    all_scores_df.select('ID_1', 'ID_2', 'value').rdd.map(lambda row: MatrixEntry(*row)))


This gives me the number of rows and columns, but I am not able to extract
the values; it always reports the error:

AttributeError: 'NoneType' object has no attribute 'setCallSite'


Thanks and Regards,

Gourav Sengupta

Re: Distributed Matrices - spark mllib

Posted by Yanbo Liang <yb...@gmail.com>.
Hi Gourav,

I cannot reproduce your problem. The following code snippet works well on
my local machine; you can try it to verify your environment. Otherwise, could
you provide more information so that others can reproduce the problem?

from pyspark.mllib.linalg.distributed import CoordinateMatrix, MatrixEntry
l = [(1, 1, 10), (2, 2, 20), (3, 3, 30)]
df = sqlContext.createDataFrame(l, ['row', 'column', 'value'])
rdd = df.select('row', 'column', 'value').rdd.map(lambda row: MatrixEntry(*row))
mat = CoordinateMatrix(rdd)
mat.entries.collect()
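
For the row total asked about in the original question, a minimal sketch
follows. It assumes the entries have already been collected to the driver, as
mat.entries.collect() does above. The names Entry and row_sum are hypothetical
and introduced only for illustration; Entry merely stands in for MatrixEntry
so that the snippet runs without a Spark session.

```python
from collections import namedtuple

# Hypothetical stand-in for pyspark's MatrixEntry so the sketch runs without
# a Spark session; real entries expose the same i, j and value attributes.
Entry = namedtuple("Entry", ["i", "j", "value"])


def row_sum(entries, row_id):
    """Sum the values of all entries whose row index equals row_id."""
    return sum(e.value for e in entries if e.i == row_id)


# Illustrative data: the three entries from the snippet above, plus one
# extra entry in row 2 so that the sum covers more than one value.
collected = [
    Entry(1, 1, 10.0),
    Entry(2, 2, 20.0),
    Entry(2, 3, 5.0),
    Entry(3, 3, 30.0),
]

print(row_sum(collected, 2))  # 25.0
print(row_sum(collected, 7))  # 0, because no entries are stored for row 7

# Assumption, not run here: on a live CoordinateMatrix the same filter can
# stay on the cluster instead of collecting everything to the driver:
#   mat.entries.filter(lambda e: e.i == 2).map(lambda e: e.value).sum()
```

Collecting first is only reasonable for small matrices; for larger ones the
RDD-side filter in the final comment is likely the better fit.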

Thanks
Yanbo


