Posted to common-user@hadoop.apache.org by Oleg Ruchovets <or...@gmail.com> on 2012/09/24 17:30:26 UTC

Hadoop and Cuda , JCuda (CPU+GPU architecture)

Hi

I am planning to do video analytics processing with Hadoop.
I am very interested in CPU+GPU architectures, especially using CUDA (
http://www.nvidia.com/object/cuda_home_new.html) and JCuda (
http://jcuda.org/).
Does combining Hadoop with a CPU+GPU architecture bring a significant
performance improvement, and has anyone succeeded in implementing it at
production quality?

I didn't find any projects or examples that use this technology.
If someone could give me a link to best practices and an example using
CUDA/JCuda + Hadoop, that would be great.
Thanks in advance,
Oleg.

Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

Posted by Andrew Purtell <ap...@apache.org>.
On Mon, Sep 24, 2012 at 10:38 AM, Harsh J <ha...@cloudera.com> wrote:
> Make sure to check out the Rootbeer compiler, which makes life easier:
> https://github.com/pcpratts/rootbeer1

Indeed. Interesting to think about how one might plumb Mapper and
Reducer to Rootbeer's ParallelRuntime.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)

Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

Posted by Harsh J <ha...@cloudera.com>.
Make sure to check out the Rootbeer compiler, which makes life easier:
https://github.com/pcpratts/rootbeer1

On Mon, Sep 24, 2012 at 10:26 PM, Chen He <ai...@gmail.com> wrote:
> Hi Oleg,
>
> I will answer your questions one by one.
>
> 1) File size
>
> There is no exact file size that will definitely work well for
> GPGPU+Hadoop. You need to run a POC for your project to find the number.
>
> I think GPU+Hadoop is very suitable for applications that are both
> computation-intensive and data-intensive. However, be aware of the
> bottleneck between GPU memory and CPU memory: the benefit you gain from
> using GPGPU should be larger than the performance you sacrifice by shipping
> data between GPU memory and CPU memory.
>
> If you only have computation-intensive applications that can be parallelized
> with GPGPU, CUDA+Hadoop can also provide a parallel framework for
> distributing your work among the cluster nodes with fault tolerance.
>
>
> 2) Is it a good idea to process data as locally as possible (I mean
> processing data as one file per map)?
>
> Local map tasks finish faster than non-local tasks in the Hadoop MapReduce
> framework.
>
> 3) Did you face any limitations or problems during your project?
>
> During my project, the video card was not a fancy one; it only allowed one
> CUDA program to use the card at any time. So we configured only one map slot
> and one reduce slot per cluster node. NVIDIA now has some more powerful
> products that support multiple programs running on the same card
> simultaneously.
>
> 4) By the way, I didn't find a JCuda code example with Hadoop. :-)
>
> Your MapReduce code is written in Java, right? Integrate your JCuda code into
> either the map() or reduce() method of your MapReduce code (you can also do
> this in the combiner, partitioner, or wherever you need it). The JCuda
> examples only help you learn how JCuda works.
>
> Chen



-- 
Harsh J

Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

Posted by Chen He <ai...@gmail.com>.
Hi Oleg

I will answer your questions one by one.

1) File size

There is no exact file size that will definitely work well for
GPGPU+Hadoop. You need to run a POC for your project to find the number.

I think GPU+Hadoop is very suitable for applications that are both
computation-intensive and data-intensive. However, be aware of the
bottleneck between GPU memory and CPU memory: the benefit you gain from
using GPGPU should be larger than the performance you sacrifice by shipping
data between GPU memory and CPU memory.

If you only have computation-intensive applications that can be parallelized
with GPGPU, CUDA+Hadoop can also provide a parallel framework for
distributing your work among the cluster nodes with fault tolerance.
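
One rough way to check the GPU/CPU memory bottleneck described above, before
running a full POC, is a back-of-the-envelope estimate. The following is only
a minimal sketch, not a measurement: the class and method names are made up
for illustration, and the bandwidth and timing figures in main() are
placeholder assumptions that you would replace with numbers measured on your
own hardware.

```java
/** Back-of-the-envelope check: does offloading one batch to the GPU pay off? */
final class GpuBreakEven {

    /**
     * Estimated GPU wall time for one batch:
     * host-to-device copy + kernel run + device-to-host copy.
     */
    static double gpuSeconds(double bytesIn, double bytesOut,
                             double busBytesPerSec, double kernelSec) {
        if (busBytesPerSec <= 0) {
            throw new IllegalArgumentException("bus bandwidth must be positive");
        }
        return (bytesIn + bytesOut) / busBytesPerSec + kernelSec;
    }

    /** Offloading only helps if the GPU total (transfers included) beats the CPU. */
    static boolean worthOffloading(double cpuSec, double gpuTotalSec) {
        return gpuTotalSec < cpuSec;
    }

    public static void main(String[] args) {
        // All figures below are assumptions chosen for illustration only.
        double bytesIn = 64.0 * 1024 * 1024;  // one 64 MB input split
        double bytesOut = 1.0 * 1024 * 1024;  // 1 MB of results copied back
        double busBytesPerSec = 4.0e9;        // assumed effective bus bandwidth
        double kernelSec = 0.2;               // assumed GPU compute time
        double cpuSec = 2.0;                  // assumed CPU-only compute time

        double gpuTotal = gpuSeconds(bytesIn, bytesOut, busBytesPerSec, kernelSec);
        System.out.printf("GPU total: %.3f s, CPU: %.3f s, offload: %b%n",
                gpuTotal, cpuSec, worthOffloading(cpuSec, gpuTotal));
    }
}
```

If the estimate comes out close to break-even, the transfer term is usually
the first thing to measure for real in the POC.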


2) Is it a good idea to process data as locally as possible (I mean
processing data as one file per map)?

Local map tasks finish faster than non-local tasks in the Hadoop MapReduce
framework.

3) Did you face any limitations or problems during your project?

During my project, the video card was not a fancy one; it only allowed one
CUDA program to use the card at any time. So we configured only one map slot
and one reduce slot per cluster node. NVIDIA now has some more powerful
products that support multiple programs running on the same card
simultaneously.

4) By the way, I didn't find a JCuda code example with Hadoop. :-)

Your MapReduce code is written in Java, right? Integrate your JCuda code into
either the map() or reduce() method of your MapReduce code (you can also do
this in the combiner, partitioner, or wherever you need it). The JCuda
examples only help you learn how JCuda works.
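
To make the shape of that integration concrete, here is a minimal sketch of
one possible structure. It deliberately uses no Hadoop or JCuda classes so it
runs anywhere: FrameKernel, CpuStandInKernel and BatchingMapper are
hypothetical names invented for this example, and the CPU stand-in marks the
place where real JCuda calls would go. The class only mirrors the
setup()/map()/cleanup() lifecycle of a Hadoop Mapper. Batching records before
each kernel call is an assumption of the sketch (it amortizes the host/device
copy cost), not something prescribed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

final class GpuMapperSketch {

    /** Hypothetical wrapper around a device kernel; not a JCuda or Hadoop type. */
    interface FrameKernel {
        float[] process(float[] batch);
        void close();
    }

    /** CPU stand-in so the sketch runs without a GPU; a real one would call JCuda. */
    static final class CpuStandInKernel implements FrameKernel {
        @Override
        public float[] process(float[] batch) {
            float[] out = new float[batch.length];
            for (int i = 0; i < batch.length; i++) {
                out[i] = batch[i] * 2.0f; // placeholder for the real computation
            }
            return out;
        }

        @Override
        public void close() {
            // A real implementation would release device memory and context here.
        }
    }

    /** Mirrors a Mapper: kernel created once per task, records sent in batches. */
    static final class BatchingMapper {
        private final FrameKernel kernel;
        private final int batchSize;
        private final List<Float> pending = new ArrayList<>();
        private final List<Float> output = new ArrayList<>();

        BatchingMapper(FrameKernel kernel, int batchSize) { // plays the role of setup()
            if (batchSize < 1) {
                throw new IllegalArgumentException("batchSize must be >= 1");
            }
            this.kernel = kernel;
            this.batchSize = batchSize;
        }

        void map(float value) {                             // called once per record
            pending.add(value);
            if (pending.size() >= batchSize) {
                flush();
            }
        }

        void cleanup() {                                    // called once at task end
            flush();
            kernel.close();
        }

        private void flush() {
            if (pending.isEmpty()) {
                return;
            }
            float[] batch = new float[pending.size()];
            for (int i = 0; i < batch.length; i++) {
                batch[i] = pending.get(i);
            }
            for (float v : kernel.process(batch)) {
                output.add(v);
            }
            pending.clear();
        }

        List<Float> output() {
            return output;
        }
    }

    public static void main(String[] args) {
        BatchingMapper mapper = new BatchingMapper(new CpuStandInKernel(), 2);
        mapper.map(1f);
        mapper.map(2f);
        mapper.map(3f);
        mapper.cleanup();
        System.out.println(mapper.output());
    }
}
```

In a real job the constructor logic would move into the Mapper's setup()
method and the results would be written to the task context rather than
collected in a list.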

Chen

On Mon, Sep 24, 2012 at 11:22 AM, Oleg Ruchovets <or...@gmail.com> wrote:

> Great,
>    Can you give some tips or best practices, such as:
> 1) File size
> 2) Is it a good idea to process data as locally as possible (I mean
> processing data as one file per map)?
> 3) Did you face any limitations or problems during your project?
>
>
>    Can you point me to which hardware is better to use? (I understand that
> in order to use a GPU I need NVIDIA.)
> I mean, with a CPU-only architecture I have 8-12 cores per computer (for
> example).
>  What should I do in order to use a CPU+GPU architecture? What kind of NVIDIA
> card do I need for this?
>
> By the way, I didn't find a JCuda code example with Hadoop. :-)
>
> Thanks in advance,
> Oleg.

Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

Posted by Oleg Ruchovets <or...@gmail.com>.
Great,
   Can you give some tips or best practices, such as:
1) File size
2) Is it a good idea to process data as locally as possible (I mean
processing data as one file per map)?
3) Did you face any limitations or problems during your project?


   Can you point me to which hardware is better to use? (I understand that
in order to use a GPU I need NVIDIA.)
I mean, with a CPU-only architecture I have 8-12 cores per computer (for
example).
 What should I do in order to use a CPU+GPU architecture? What kind of NVIDIA
card do I need for this?

By the way, I didn't find a JCuda code example with Hadoop. :-)

Thanks in advance,
Oleg.

On Mon, Sep 24, 2012 at 6:07 PM, Chen He <ai...@gmail.com> wrote:

> Please see the JCuda examples; I worked from those. BTW, you can also
> compile your CUDA code in advance and have your Hadoop code call the
> compiled code through JCuda. That is what I did in my program.

Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

Posted by Chen He <ai...@gmail.com>.
Please see the JCuda examples; I worked from those. BTW, you can also
compile your CUDA code in advance and have your Hadoop code call the
compiled code through JCuda. That is what I did in my program.

On Mon, Sep 24, 2012 at 10:45 AM, Oleg Ruchovets <or...@gmail.com> wrote:

> Thank you very much. I saw this link! Do you have any code or examples
> shared online (on GitHub, for example)?

Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

Posted by Oleg Ruchovets <or...@gmail.com>.
Thank you very much. I saw this link! Do you have any code or examples
shared online (on GitHub, for example)?

On Mon, Sep 24, 2012 at 5:33 PM, Chen He <ai...@gmail.com> wrote:

> http://wiki.apache.org/hadoop/CUDA%20On%20Hadoop

Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

Posted by Chen He <ai...@gmail.com>.
http://wiki.apache.org/hadoop/CUDA%20On%20Hadoop

On Mon, Sep 24, 2012 at 10:30 AM, Oleg Ruchovets <or...@gmail.com> wrote:

> Hi
>
> I am planning to do video analytics processing with Hadoop.
> I am very interested in CPU+GPU architectures, especially using CUDA (
> http://www.nvidia.com/object/cuda_home_new.html) and JCuda (
> http://jcuda.org/).
> Does combining Hadoop with a CPU+GPU architecture bring a significant
> performance improvement, and has anyone succeeded in implementing it at
> production quality?
>
> I didn't find any projects or examples that use this technology.
> If someone could give me a link to best practices and an example using
> CUDA/JCuda + Hadoop, that would be great.
> Thanks in advance,
> Oleg.
>

Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

Posted by Mark Kerzner <ma...@shmsoft.com>.
Oleg,

I, on the other hand, have a project that might benefit, but no
implementation as yet. http://freeeed.org/ is very CPU-intensive, so please
share your notes.

Mark

On Mon, Sep 24, 2012 at 10:30 AM, Oleg Ruchovets <or...@gmail.com> wrote:

> Hi
>
> I am planning to do video analytics processing with Hadoop.
> I am very interested in CPU+GPU architectures, especially using CUDA (
> http://www.nvidia.com/object/cuda_home_new.html) and JCuda (
> http://jcuda.org/).
> Does combining Hadoop with a CPU+GPU architecture bring a significant
> performance improvement, and has anyone succeeded in implementing it at
> production quality?
>
> I didn't find any projects or examples that use this technology.
> If someone could give me a link to best practices and an example using
> CUDA/JCuda + Hadoop, that would be great.
> Thanks in advance,
> Oleg.
>