Posted to common-user@hadoop.apache.org by Gimick <gi...@gmail.com> on 2009/11/13 08:21:50 UTC

Using hadoop for Matrix Multiplication in NFS?

Hi, I am new to Hadoop.  I am planning to do matrix multiplication (of order
millions) using Hadoop.

I have a few questions about this.

i) Would Hadoop be a good fit for this, or should I try some other
approach?
ii) I will be running it on NFS.  Would Hadoop still be a good option?

If Hadoop is suitable for this problem, could you please send links on how to
configure the hadoop-site.xml file for an NFS system?

P.S. I tried a few setup instructions found via search, but every one of them
gives an "Unable to connect to ...." error.

-- 
View this message in context: http://old.nabble.com/Using-hadoop-for-Matrix-Multiplication-in-NFS--tp26332382p26332382.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.


Re: Using hadoop for Matrix Multiplication in NFS?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
I think another thing to look at is Mahout - http://lucene.apache.org/mahout 

See http://mahout.markmail.org/search/matrix+multiplication

Otis



----- Original Message ----
> From: Brian Bockelman <bb...@cse.unl.edu>
> To: common-user@hadoop.apache.org
> Cc: core-user@hadoop.apache.org
> Sent: Fri, November 13, 2009 12:06:37 PM
> Subject: Re: Using hadoop for Matrix Multiplication in NFS?
> 
> Hi,
> 
> Assuming you're doing math...
> What you want is PETSc for sparse matrices: 
> http://www.mcs.anl.gov/petsc/petsc-as/
> If you're doing dense matrices, probably ScaLAPACK: 
> http://www.netlib.org/scalapack/
> 
> You would benefit from working with someone who has a background in numerical 
> analysis.
> 
> Brian


Re: Using hadoop for Matrix Multiplication in NFS?

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hi,

Assuming you're doing math...
What you want is PETSc for sparse matrices: http://www.mcs.anl.gov/petsc/petsc-as/
If you're doing dense matrices, probably ScaLAPACK: http://www.netlib.org/scalapack/

You would benefit from working with someone who has a background in
numerical analysis.

Brian

On Nov 14, 2009, at 12:42 AM, zjffdu wrote:

> See my comments
>
>
> -----Original Message-----
> From: Gimick [mailto:gimmickivek@gmail.com]
> Sent: November 12, 2009 23:22
> To: core-user@hadoop.apache.org
> Subject: Using hadoop for Matrix Multiplication in NFS?
>
>
> Hi, I am new to Hadoop.  I am planning to do matrix multiplication (of order
> millions) using Hadoop.
>
> I have a few questions about this.
>
> i) Would Hadoop be a good fit for this, or should I try some other
> approach?
>
> --- Hama may be a tool that fits your requirements:
> http://incubator.apache.org/hama/
>
> ii) I will be running it on NFS.  Would Hadoop still be a good option?
> --- If you want to use NFS, I guess you will have to provide your own
> InputFormat. You would be better off putting your data into HDFS; it will
> make your work easier and improve your program's performance.
>
> If Hadoop is suitable for this problem, could you please send links on how to
> configure the hadoop-site.xml file for an NFS system?
>
> P.S. I tried a few setup instructions found via search, but every one of them
> gives an "Unable to connect to ...." error.


RE: Using hadoop for Matrix Multiplication in NFS?

Posted by zjffdu <zj...@gmail.com>.
See my comments


-----Original Message-----
From: Gimick [mailto:gimmickivek@gmail.com] 
Sent: November 12, 2009 23:22
To: core-user@hadoop.apache.org
Subject: Using hadoop for Matrix Multiplication in NFS?


Hi, I am new to Hadoop.  I am planning to do matrix multiplication (of order
millions) using Hadoop.

I have a few questions about this.

i) Would Hadoop be a good fit for this, or should I try some other
approach?

--- Hama may be a tool that fits your requirements:
http://incubator.apache.org/hama/

ii) I will be running it on NFS.  Would Hadoop still be a good option?
--- If you want to use NFS, I guess you will have to provide your own
InputFormat. You would be better off putting your data into HDFS; it will
make your work easier and improve your program's performance.



If Hadoop is suitable for this problem, could you please send links on how to
configure the hadoop-site.xml file for an NFS system?

P.S. I tried a few setup instructions found via search, but every one of them
gives an "Unable to connect to ...." error.



Re: Using hadoop for Matrix Multiplication in NFS?

Posted by Martin Mituzas <xi...@hotmail.com>.
ii) I once ran the MapReduce program DistCp to copy data from NFS into HDFS. I
mounted the NFS directory on every node, so each node could reach the data.
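
A copy like the one described above could be scripted roughly as follows. This is only a sketch: the mount point /mnt/nfs/matrices, the NameNode address namenode:9000, and the helper name build_distcp_command are assumptions made for illustration, not details from this thread. It relies on the `hadoop distcp <source> <destination>` command form and on the NFS export being mounted at the same path on every node.

```python
import subprocess


def build_distcp_command(nfs_dir, hdfs_uri):
    """Build the argument list for `hadoop distcp <source> <destination>`.

    nfs_dir must be an absolute path that is mounted identically on every
    node, because the copy runs as map tasks spread across the cluster.
    """
    if not nfs_dir.startswith("/"):
        raise ValueError("nfs_dir must be an absolute path")
    # A file:// URI makes Hadoop read the source through the local file system.
    return ["hadoop", "distcp", "file://" + nfs_dir, hdfs_uri]


if __name__ == "__main__":
    # Hypothetical paths and host name, chosen only for illustration.
    command = build_distcp_command("/mnt/nfs/matrices",
                                   "hdfs://namenode:9000/matrices")
    print(" ".join(command))
    # Uncomment on a machine where the `hadoop` launcher is on the PATH:
    # subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if the paths contain spaces.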
 


-- 
View this message in context: http://old.nabble.com/Using-hadoop-for-Matrix-Multiplication-in-NFS--tp26332382p26491406.html


Re: Using hadoop for Matrix Multiplication in NFS?

Posted by "Tsz Wo (Nicholas), Sze" <s2...@yahoo.com>.
Hi Gimick,

Could you describe your matrix multiplication problem in more detail?  Are the matrices sparse or dense?  How large is each matrix on disk?

Thanks.
Nicholas Sze
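
To show why the sparse/dense and on-disk-size questions matter, here is a rough back-of-the-envelope estimate. It assumes "order millions" means a square matrix with n = 1,000,000, 8-byte double-precision values, and, for the sparse case, a simple (row, column, value) triple per nonzero with 8-byte indices and 10 nonzeros per row. All of these figures are assumptions for illustration; the original question does not state them.

```python
def dense_bytes(n, bytes_per_value=8):
    """Storage for a dense n x n matrix: every entry is stored."""
    return n * n * bytes_per_value


def sparse_bytes(n, nonzeros_per_row, bytes_per_value=8, bytes_per_index=8):
    """Storage for a sparse n x n matrix kept as (row, col, value) triples."""
    entries = n * nonzeros_per_row
    return entries * (bytes_per_value + 2 * bytes_per_index)


if __name__ == "__main__":
    n = 10**6  # assumed matrix order
    print(dense_bytes(n) / 1e12, "TB dense")        # prints "8.0 TB dense"
    print(sparse_bytes(n, 10) / 1e6, "MB sparse")   # prints "240.0 MB sparse"
```

Under these assumptions a dense matrix needs about 8 TB while a sparse one fits in a few hundred megabytes, so the answer changes which tools are reasonable.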




----- Original Message ----
> From: Edward J. Yoon <ed...@apache.org>
> To: common-user@hadoop.apache.org
> Sent: Tue, November 24, 2009 2:07:57 AM
> Subject: Re: Using hadoop for Matrix Multiplication in NFS?
> 
> Just FYI, Hadoop with M/R is a distributed computing system, so the
> locality and placement of sub-matrix blocks becomes a problem. Moreover,
> iteration with M/R is really slow.
> 
> To perform matrix multiplication (and also graph algorithms) on
> Hadoop, the Apache Hama team is considering a BSP (bulk synchronous
> parallel) model that uses Hadoop RPC instead of M/R.


Re: Using hadoop for Matrix Multiplication in NFS?

Posted by "Edward J. Yoon" <ed...@apache.org>.
Just FYI, Hadoop with M/R is a distributed computing system, so the
locality and placement of sub-matrix blocks becomes a problem. Moreover,
iteration with M/R is really slow.

To perform matrix multiplication (and also graph algorithms) on
Hadoop, the Apache Hama team is considering a BSP (bulk synchronous
parallel) model that uses Hadoop RPC instead of M/R.
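
To illustrate the locality point, here is a minimal in-memory sketch of the common one-pass map/reduce formulation of matrix multiplication. It is not Hama's or Hadoop's code; the function names and the (row, column, value) input format are assumptions made for this example, and the shuffle is simulated with a dictionary. Each input entry is copied to every output cell it contributes to, which is the data movement that becomes expensive on a real cluster.

```python
from collections import defaultdict


def map_phase(a_entries, b_entries, n_rows, n_cols):
    """Emit ((i, k), (tag, j, value)) for every output cell an entry feeds."""
    for i, j, value in a_entries:      # A[i][j] contributes to row i of C
        for k in range(n_cols):
            yield (i, k), ("A", j, value)
    for j, k, value in b_entries:      # B[j][k] contributes to column k of C
        for i in range(n_rows):
            yield (i, k), ("B", j, value)


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """For each output cell (i, k), sum A[i][j] * B[j][k] over shared j."""
    result = {}
    for cell, values in groups.items():
        # Assumes each (row, column) position appears at most once per matrix.
        a_row = {j: v for tag, j, v in values if tag == "A"}
        b_col = {j: v for tag, j, v in values if tag == "B"}
        total = sum(a_row[j] * b_col[j] for j in a_row.keys() & b_col.keys())
        if total != 0:
            result[cell] = total
    return result


def multiply(a_entries, b_entries, n_rows, n_cols):
    """Return the product as {(i, k): value} plus the number of shuffled pairs."""
    pairs = list(map_phase(a_entries, b_entries, n_rows, n_cols))
    return reduce_phase(shuffle(pairs)), len(pairs)


if __name__ == "__main__":
    a = [(0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4)]   # [[1, 2], [3, 4]]
    b = [(0, 0, 5), (0, 1, 6), (1, 0, 7), (1, 1, 8)]   # [[5, 6], [7, 8]]
    product, shuffled = multiply(a, b, n_rows=2, n_cols=2)
    print(product)
    print("pairs shuffled:", shuffled)
```

Even for this 2 x 2 case, 8 input entries turn into 16 shuffled pairs; the replication factor grows with the matrix dimensions, which is one reason block-based or BSP-style schemes are considered instead.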




-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org

Re: Using hadoop for Matrix Multiplication in NFS?

Posted by Allen Wittenauer <aw...@linkedin.com>.
On 11/12/09 11:21 PM, "Gimick" <gi...@gmail.com> wrote:
> ii) I will be using it in NFS.  Will using hadoop still be a good option?

If you are using NFS, then no.

You should be looking at something in the more traditional HPC space: Sun
Grid Engine, Torque/Maui, etc.