Posted to common-user@hadoop.apache.org by Sameer Tilak <sa...@gmail.com> on 2009/04/21 19:55:42 UTC

Hadoop and Matlab

Hi there,

We're working on an image analysis project. The image processing code is
written in Matlab. If I invoke that code from a shell script and then use
that shell script within Hadoop streaming, will that work? Has anyone done
something along these lines?

Many thanks,
--ST.

Re: Hadoop and Matlab

Posted by Peter Skomoroch <pe...@gmail.com>.
If you can compile the MATLAB code to an executable with the MATLAB
compiler and send it to the nodes with the distributed cache, that
should work. You probably want to avoid licensing fees for running
copies of MATLAB itself on the cluster.
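
A minimal sketch of how such a job might be launched, assuming the MATLAB code has already been compiled to a standalone executable and uploaded to HDFS. The jar location, HDFS URI, directory names, and script names are hypothetical placeholders. -input, -output, -mapper, -reducer, -file, and -cacheFile are Hadoop streaming options, but check them against the Hadoop version you run. The script only builds and prints the command line; it does not submit a job.

```python
import shlex


def build_streaming_command(streaming_jar, input_dir, output_dir,
                            wrapper_script, reducer, cached_binary_uri):
    """Return the argument list for a Hadoop streaming job.

    cached_binary_uri is an HDFS URI of the form <uri>#<linkname>; the part
    after '#' is the symlink name the task sees in its working directory.
    """
    return [
        "hadoop", "jar", streaming_jar,
        "-input", input_dir,
        "-output", output_dir,
        "-mapper", wrapper_script,         # shell wrapper that calls the binary
        "-reducer", reducer,
        "-file", wrapper_script,           # ship the local wrapper with the job
        "-cacheFile", cached_binary_uri,   # distribute the compiled executable
    ]


# All values below are hypothetical and only illustrate the shape of the call.
command = build_streaming_command(
    streaming_jar="/path/to/hadoop-streaming.jar",
    input_dir="image_lists",
    output_dir="image_results",
    wrapper_script="run_matlab.sh",
    reducer="combine_results.sh",
    cached_binary_uri="hdfs://namenode:9000/apps/process_image#process_image",
)

# Print a copy-pasteable command line with shell quoting applied.
print(" ".join(shlex.quote(arg) for arg in command))
```

The wrapper script would then invoke ./process_image, the symlink created from the cached file, so no MATLAB installation is started by the task itself.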

RE: Hadoop and Matlab

Posted by "Patterson, Josh" <jp...@tva.gov>.
Sameer,
I'd be interested in that as well. We are constructing a Hadoop
cluster for energy data (PMU) for NERC, and we will potentially be
running jobs for a number of groups and researchers. I know some
researchers will know nothing of MapReduce, yet are very keen on
MATLAB, so we're looking at ways to make that transition as smooth as
possible.

Josh Patterson
TVA

Re: Hadoop and Matlab

Posted by nitesh bhatia <ni...@gmail.com>.
Hi
The simplest way for you to run MATLAB would be to use the distributed
toolkit provided with MATLAB. You just need to configure MATLAB to discover
the other MATLAB machines. That way you will not need to set up a Hadoop
cluster. However, if you want to use Hadoop as a backend framework for
distributed processing, I would suggest going for Octave, which is an
open-source toolkit similar to MATLAB. It provides interfaces for C/C++. I
think it would be easier to configure with Hadoop than MATLAB, which is not
open source and requires a licence.

--nitesh


-- 
Nitesh Bhatia
Dhirubhai Ambani Institute of Information & Communication Technology
Gandhinagar
Gujarat

"Life is never perfect. It just depends where you draw the line."

Re: Hadoop and Matlab

Posted by "Edward J. Yoon" <ed...@apache.org>.
Hi,
Where will the images be stored? How will they be retrieved?

If you have metadata for the images, the map task can receive the
'filename' of an image as a key, and its file properties (host, file path,
etc.) as its value. Then, I guess you can handle the MATLAB process
using a runtime object on the Hadoop cluster.
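
The idea above can be sketched as a streaming-style mapper, assuming each input line is "filename<TAB>properties" and that the image-processing program is an external executable started once per record. The record layout, the sample values, and the stand-in tool are assumptions made for illustration. In a real job the lines would come from standard input and the command would point at the compiled MATLAB program.

```python
import subprocess
import sys


def parse_record(line):
    """Split one input line into (filename, properties) at the first tab."""
    key, _, value = line.rstrip("\n").partition("\t")
    return key, value


def run_external(argv):
    """Run an external program and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def map_lines(lines, tool_argv):
    """Yield one 'filename<TAB>output' record per non-empty input line."""
    for line in lines:
        if not line.strip():
            continue
        filename, properties = parse_record(line)
        # The filename and its properties are passed as command-line arguments.
        output = run_external(tool_argv + [filename, properties])
        yield f"{filename}\t{output}"


# Stand-in for the real tool (hypothetical): prints the length of the filename.
stand_in_tool = [sys.executable, "-c", "import sys; print(len(sys.argv[1]))"]

# Hypothetical sample record; a streaming job would iterate over sys.stdin.
sample_lines = ["img_0001.png\thost=node1,path=/data/img_0001.png\n"]

results = list(map_lines(sample_lines, stand_in_tool))
for record in results:
    print(record)
```

Starting one process per image is the simplest arrangement. If process start-up is expensive, the same mapper could hand a whole batch of filenames to a single invocation instead.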



-- 
Best Regards, Edward J. Yoon
edwardyoon@apache.org
http://blog.udanax.org

Re: Hadoop and Matlab

Posted by Sameer Tilak <sa...@gmail.com>.
Hi Edward,
Yes, we're building this to handle hundreds of thousands of images (at
least). We're thinking that the processing of individual images (or a set
of images together) will be done in MATLAB itself. However, we can use the
Hadoop framework to process the data in parallel: one MATLAB instance
handling a few hundred images (as a mapper), hundreds of such instances,
and then a reducer to combine the output of each instance.
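
A rough local simulation of that layout, assuming batches of 300 images per mapper and a reducer that sums per-key values. The batch size, the file names, and the per-image "result" (a count by file extension) are placeholders for whatever the MATLAB code would actually produce. On a cluster each batch would be handled by a separate Hadoop map task rather than by a loop.

```python
from collections import defaultdict


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def mapper(image_batch):
    """Stand-in for one MATLAB instance: emit a (key, value) pair per image."""
    for name in image_batch:
        # Placeholder analysis: key each image by its file extension.
        extension = name.rsplit(".", 1)[-1]
        yield extension, 1


def reducer(pairs):
    """Combine the mapper outputs by summing the values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Hypothetical input: 1000 image names split into batches of 300.
images = [f"img_{i:06d}.png" for i in range(1000)]
pairs = [pair for batch in batches(images, 300) for pair in mapper(batch)]
summary = reducer(pairs)
print(summary)
```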

Re: Hadoop and Matlab

Posted by "Edward J. Yoon" <ed...@apache.org>.
Hi, what is the input data?

As I understand it, you have a lot of images and want to process all of
them using your MATLAB script. In that case, you will need to write
some code yourself. I did a similar thing for plotting graphs with
gnuplot. However, if you want to do large-scale linear algebra
operations for large image processing, I would recommend investigating
other solutions. Hadoop is not general-purpose clustering software,
and it cannot run MATLAB.

-- 
Best Regards, Edward J. Yoon
edwardyoon@apache.org
http://blog.udanax.org