You are viewing a plain text version of this content. The canonical link for it is here.

Posted to mapreduce-user@hadoop.apache.org by John Norstad <j-...@northwestern.edu> on 2009/12/08 17:44:43 UTC

A MapReduce Algorithm for Matrix Multiplication

As an exercise while learning MapReduce, I developed an algorithm for matrix multiplication and wrote it up on my web site. If you're interested, it's at:

   http://homepage.mac.com/j.norstad/matrix-multiply

I present the algorithm in English, as pseudo-code, and as Java source code for Hadoop.

-- 
John Norstad
Academic and Research Technologies
Northwestern University

Re: Fwd: A MapReduce Algorithm for Matrix Multiplication

Posted by chwaqas254 <wa...@gmail.com>.

Hi all,

The implementation given in the following link only works while working with
standalone mode and not while  in pseudodistributed mode or distributed
mode:
http://homepage.mac.com/j.norstad/matrix-multiply

To be percise for sparse matrices it works fine but with dense matrices
strategy 1 and 4 does not work. Does anyone have solution for this or at
lease suggest that what can be the problem and how to solve it.Thanks

Regards,
wl

--
View this message in context: http://lucene.472066.n3.nabble.com/Fwd-A-MapReduce-Algorithm-for-Matrix-Multiplication-tp639798p3585667.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Fwd: A MapReduce Algorithm for Matrix Multiplication

Posted by "Edward J. Yoon" <ed...@apache.org>.

Hi mahout,

OK, I'm also somewhat skeptical about the M/R based algebraic
computations after testing them on the large cluster (100 ~ 1000
nodes). M/R baed M/L algorithms? IMHO, It's about the same. :)

Therefore, currently Hama team is working to implement the BSP package
on hadoop, and another computing model/framework based on its
mechanism.

The BSP package of Hama will be introduced to hadoop communities in this week.

Any questions are welcome.

On Thu, Dec 10, 2009 at 3:46 AM, Ted Dunning <te...@gmail.com> wrote:
> I haven't heard of any progress from HAMA and I am not optimistic about
> their proposed approach.
>
> On Wed, Dec 9, 2009 at 10:13 AM, Avram Aelony <aa...@mac.com> wrote:
>
>>
>> This is also a project called HAMA  <http://wiki.apache.org/hama/>.
>> Perhaps your efforts can help further (or coordinate with) HAMA?
>>
>>
>

-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Fwd: A MapReduce Algorithm for Matrix Multiplication

Posted by "Edward J. Yoon" <ed...@udanax.org>.

Hi mahout,

I'm also somewhat skeptical about the M/R based algebraic
computations after testing them on the large cluster (100 ~ 1000
nodes). M/R baed M/L algorithms? IMHO, It's about the same. :)

Therefore, currently Hama team is working to implement the BSP package
on hadoop, and another computing model/framework based on its
mechanism.

The BSP package of Hama will be introduced to hadoop communities in this week.

Any questions are welcome.

On Thu, Dec 10, 2009 at 3:46 AM, Ted Dunning <te...@gmail.com> wrote:
> I haven't heard of any progress from HAMA and I am not optimistic about
> their proposed approach.
>
> On Wed, Dec 9, 2009 at 10:13 AM, Avram Aelony <aa...@mac.com> wrote:
>
>>
>> This is also a project called HAMA  <http://wiki.apache.org/hama/>.
>> Perhaps your efforts can help further (or coordinate with) HAMA?
>>
>>
>

-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Fwd: A MapReduce Algorithm for Matrix Multiplication

Posted by Ted Dunning <te...@gmail.com>.

I haven't heard of any progress from HAMA and I am not optimistic about
their proposed approach.

On Wed, Dec 9, 2009 at 10:13 AM, Avram Aelony <aa...@mac.com> wrote:

>
> This is also a project called HAMA  <http://wiki.apache.org/hama/>.
> Perhaps your efforts can help further (or coordinate with) HAMA?
>
>

Re: Fwd: A MapReduce Algorithm for Matrix Multiplication

Posted by Avram Aelony <aa...@mac.com>.

This is also a project called HAMA  <http://wiki.apache.org/hama/>.  
Perhaps your efforts can help further (or coordinate with) HAMA?

Best regards,
Avram


 
On Wednesday, December 09, 2009, at 10:05AM, "Ted Dunning" <te...@gmail.com> wrote:
>See here:
>
>http://arxiv.org/abs/0909.4061v1
>
>On Tue, Dec 8, 2009 at 6:47 PM, Zhengguo 'Mike' SUN
><zh...@yahoo.com>wrote:
>
>> Hi Jake,
>>
>> I am implementing the classical multiplicative update rule of NMF. The
>> matrix to be factorized is really big and sparse. Are you suggesting that I
>> can use some specialised algorithms for sparse matrix instead of the
>> standard multiplication algorithm? But what algorithms are you referring to?
>> Could you please provide some pointers?
>>
>> Thanks
>>
>>
>>
>>
>> ________________________________
>> From: Jake Mannix <ja...@gmail.com>
>> To: mahout-user@lucene.apache.org
>> Sent: Tue, December 8, 2009 8:53:39 PM
>> Subject: Re: Fwd: A MapReduce Algorithm for Matrix Multiplication
>>
>> Hi Mike,
>>
>>   In the NMF work you're doing, the matrices you're multiplying are very
>> thin, right?  They're a handful of dense vectors on the left, an equal
>> number of dense vectors on the right, and you're only computing the values
>> in the outer product which coincide with the nonzero entries of the
>> (sparse)
>> input matrix?  In which case you're not really doing the full matrix
>> multiplication, and doing this a slightly specialized way should provide
>> some serious speedup.
>>
>>   Unless you're not dealing with the "really big, but sparse" case, and
>> minimizing error only over the nonzero inputs?
>>
>>   -jake
>>
>> On Tue, Dec 8, 2009 at 5:43 PM, Zhengguo 'Mike' SUN
>> <zh...@yahoo.com>wrote:
>>
>> > Hi Ted, John,
>> >
>> > I am also interested in matrix multiplication. Actually, I am
>> implementing
>> > Non-negative Matrix Factorization, which basically is iterations of
>> matrix
>> > multiplication. What I did is exactly as what Ted said: doing an outer
>> > product using one column from left matrix and one row from right matrix
>> and
>> > summing up all the results. Idealy, this could be done using only one
>> job:
>> > mappers doing multiplication and reducers doing summing. But the thing is
>> > that how do you match a column from left matrix and a row from right
>> matrix.
>> > Assuming that two input matrices are give in column major order and row
>> > major order, it seems to me that you have to implement a customized
>> > InputFormat to do the matching. Or maybe you could use one matrix as
>> input
>> > and read the other inside mappers, but this also has problem when you
>> have a
>> > lot of mappers. The straightforward way seems to be using two jobs. The
>> > mappers of the first job emit the id and the content of the row or column
>> it
>> >  reads and the reducers do the multiplication. The second job is to do
>> the
>> > summing. This is kind of similar to what John did. But it is inefficient
>> if
>> > you have many iterations. Any comments?
>> >
>> >
>> >
>> >
>> > ________________________________
>> > From: Ted Dunning <te...@gmail.com>
>> > To: j-norstad@northwestern.edu; mahout-user <
>> mahout-user@lucene.apache.org
>> > >
>> > Sent: Tue, December 8, 2009 2:59:06 PM
>> > Subject: Fwd: A MapReduce Algorithm for Matrix Multiplication
>> >
>> > John,
>> >
>> > Your comment about matrix multiplication was forwarded to the mahout-user
>> > emailing list.
>> >
>> > I have a question for you about your approach.  Have you considered doing
>> > the multiplication in a single step by storing the the first matrix in
>> > column major order and the second in row major order?  The idea would be
>> to
>> > do a block-wise outer product and use the combiner to sum elements ahead
>> of
>> > the reducer.  This could be done by reading a (block) column from the
>> left
>> > matrix and a (block) row from the second and emitting all of the (block)
>> > elements of the outer product of this column and row.  The key emitted
>> with
>> > these blocks would be the i,j index of the final location of each block
>> in
>> > the final product.  The combiner and reducer could add all of the blocks
>> > together and the use of the combiner would serve to substantially
>> minimize
>> > the amount of network traffic.
>> >
>> > This alternative is equivalent to inverting the loop nesting in a
>> > conventional matrix multiplication algorithm.  It is also very closely
>> > related to the way that cooccurrence counting is typically done in
>> Hadoop.
>> >
>> > Is your project an ongoing effort?
>> >
>> > ---------- Forwarded message ----------
>> > From: Robin Anil <ro...@gmail.com>
>> > Date: Tue, Dec 8, 2009 at 11:09 AM
>> > Subject: Fwd: A MapReduce Algorithm for Matrix Multiplication
>> > To: mahout-user@lucene.apache.org
>> > Cc: John Norstad <j-...@northwestern.edu>
>> >
>> >
>> > ---------- Forwarded message ----------
>> > From: John Norstad <j-...@northwestern.edu>
>> > Date: Tue, Dec 8, 2009 at 10:14 PM
>> > Subject: A MapReduce Algorithm for Matrix Multiplication
>> > To: MapReduce <ma...@hadoop.apache.org>
>> >
>> >
>> > As an exercise while learning MapReduce, I developed an algorithm for
>> > matrix multiplication and wrote it up on my web site. If you're
>> > interested, it's at:
>> >
>> >  http://homepage.mac.com/j.norstad/matrix-multiply
>> >
>> > I present the algorithm in English, as pseudo-code, and as Java source
>> > code for Hadoop.
>> >
>> > --
>> > John Norstad
>> > Academic and Research Technologies
>> > Northwestern University
>> >
>> >
>> >
>> > --
>> > Ted Dunning, CTO
>> > DeepDyve
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>>
>
>
>
>
>-- 
>Ted Dunning, CTO
>DeepDyve
>
>