Posted to user@spark.apache.org by Xi Shen <da...@gmail.com> on 2015/03/13 09:49:39 UTC

How to do sparse vector product in Spark?

Hi,

I have two RDD[Vector]s. Each Vector is sparse and of the form:

    (id, value)

"id" indicates the position of the value in the vector space. I want to
apply dot product on two of such RDD[Vector] and get a scale value. The
none exist values are treated as zero.

Any convenient tool to do this in Spark?
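
For concreteness, here is a rough sketch of the operation I have in mind,
assuming each vector is really stored as an RDD of (id, value) pairs and
absent ids mean zero (the names and types here are made up):

    import org.apache.spark.rdd.RDD

    // Dot product of two sparse vectors stored as (id, value) pair RDDs.
    // An inner join keeps exactly the ids present in both vectors, since
    // ids missing from either side would contribute zero anyway.
    def sparseDot(a: RDD[(Long, Double)], b: RDD[(Long, Double)]): Double =
      a.join(b)
        .map { case (_, (x, y))) => x * y } // multiply the paired values
        .sum()                              // reduce to a single scalar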


Thanks,
David

Re: How to do sparse vector product in Spark?

Posted by Sean Owen <so...@cloudera.com>.
In Java/Scala-land, the intent is to use Breeze for this. "Vector" in
Spark is an opaque wrapper around the Breeze representation, which
contains a bunch of methods like this.
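
For example, a minimal sketch of going through Breeze by hand (Spark's own
Vector-to-Breeze converter is private, so this rebuilds the Breeze vectors
from the public indices/values arrays; the sample values are made up):

    import breeze.linalg.{SparseVector => BSV}
    import org.apache.spark.mllib.linalg.{SparseVector => SSV}

    // Rebuild a Breeze sparse vector from the public indices/values of a
    // Spark MLlib SparseVector, then let Breeze compute the dot product.
    def asBreeze(v: SSV): BSV[Double] = new BSV(v.indices, v.values, v.size)

    val a = new SSV(5, Array(0, 3), Array(1.0, 2.0))
    val b = new SSV(5, Array(3, 4), Array(4.0, 5.0))
    val dot = asBreeze(a).dot(asBreeze(b)) // only id 3 overlaps: 2.0 * 4.0 = 8.0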

On Fri, Mar 13, 2015 at 3:28 PM, Daniel, Ronald (ELS-SDG)
<R....@elsevier.com> wrote:
>> Any convenient tool to do this [sparse vector product] in Spark?
>
> Unfortunately, it seems that there are very few operations defined for
> sparse vectors. I needed to add some, and ended up converting them to
> (dense) numpy vectors and doing the addition on those.
>
> Best regards,
> Ron



RE: How to do sparse vector product in Spark?

Posted by "Daniel, Ronald (ELS-SDG)" <R....@elsevier.com>.
> Any convenient tool to do this [sparse vector product] in Spark?

Unfortunately, it seems that there are very few operations defined for sparse vectors. I needed to add some, and ended up converting them to (dense) numpy vectors and doing the addition on those.
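
In Scala terms the same densify-then-operate workaround would look roughly
like this (a hypothetical translation of the numpy approach, not code from
this thread):

    import org.apache.spark.mllib.linalg.{Vector, Vectors}

    // Densify both sparse vectors with toArray, then add elementwise,
    // mirroring the convert-to-dense-numpy trick described above.
    def addDense(a: Vector, b: Vector): Vector = {
      require(a.size == b.size, "vectors must have the same dimension")
      val sum = a.toArray   // dense copy of a, mutated in place below
      val other = b.toArray
      var i = 0
      while (i < sum.length) {
        sum(i) += other(i)
        i += 1
      }
      Vectors.dense(sum)
    }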

Best regards,
Ron

