Posted to common-user@hadoop.apache.org by Adarsh Sharma <ad...@orkash.com> on 2011/02/09 14:08:41 UTC

CUDA on Hadoop

Dear all,

I am going to work on a project that involves working with CUDA in a 
Hadoop environment.

I have worked on Hadoop platforms (Hadoop, Hive, HBase, MapReduce) for the 
past 8 months.

If anyone has working experience, or pointers to the basic steps 
(a basic introduction, configuring and running CUDA programs on a 
Hadoop cluster), any white papers, or any other helpful information, 
please let me know through links or materials.

I shall be grateful for any kindness.



Thanks & Best Regards

Adarsh Sharma

Re: CUDA on Hadoop

Posted by Adarsh Sharma <ad...@orkash.com>.
Thanks Harsh, I found the link below to start with some practical knowledge.

http://cs.smith.edu/dftwiki/index.php/Hadoop_Tutorial_2.2_--_Running_C%2B%2B_Programs_on_Hadoop

But would the HAMA project be useful for building a sort of analysis 
engine that analyzes terabytes of data in Hadoop HDFS?



Best Regards

Adarsh Sharma


Harsh J wrote:
> You can check-out this project which did some work for Hama+CUDA:
> http://code.google.com/p/mrcl/
>
> On Wed, Feb 9, 2011 at 6:38 PM, Adarsh Sharma <ad...@orkash.com> wrote:
>   
>> Dear all,
>>
>> I am going to work on a Project that includes " Working on CUDA in Hadoop
>> Environment ".
>>
>> I work on Hadoop Platforms ( Hadoop, Hive, Hbase, Map-Reduce ) from the past
>> 8 months.
>>
>> If anyone has some working experience or some pointers to basic steps
>> includes Basic Introduction, Configuring & Running CUDA programs in Hadoop
>> Cluster , any White Papers or any sort of helpful information, Please let me
>> know through links or materials.
>>
>> I shall be grateful for any kindness.
>>
>>
>>
>> Thanks & Best Regards
>>
>> Adarsh Sharma
>>
>>     
>
>
>
>   


Re: CUDA on Hadoop

Posted by Milind Bhandarkar <mb...@linkedin.com>.
My ex-colleague, Sanjiv Satoor (currently at NVidia in Pune, India), and I have had some discussions about it. He is (obviously) very interested. Please contact him for more info.

- milind

On Feb 9, 2011, at 6:45 AM, Steve Loughran wrote:

> On 09/02/11 13:58, Harsh J wrote:
>> You can check-out this project which did some work for Hama+CUDA:
>> http://code.google.com/p/mrcl/
> 
> Amazon let you bring up a Hadoop cluster on machines with GPUs you can code against, but I haven't heard of anyone using it. The big issue is bandwidth; it just doesn't make sense for a classic "scan through the logs" kind of problem as the disk:GPU bandwidth ratio is even worse than disk:CPU.
> 
> That said, if you were doing something that involved a lot of compute on a block of data (e.g. rendering tiles in a map), this could work.

---
Milind Bhandarkar
mbhandarkar@linkedin.com




Re: CUDA on Hadoop

Posted by Lance Norskog <go...@gmail.com>.
If you want to use Python, one of the Python+CUDA projects generates CUDA
C from Python bytecode, so you don't have to write any C. I don't
remember which project it is.

This lets you debug the CUDA code in isolation, then run it through
Hadoop Streaming.
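
As a rough illustration of the "debug in isolation, then run under
streaming" idea, here is a minimal sketch of a streaming-style mapper.
It assumes tab-separated "key<TAB>number" input lines. The function
square_all is a hypothetical stand-in for the GPU kernel and is written
in plain Python, so the batching logic can be exercised without a GPU or
a cluster; none of these names come from the thread.

```python
import io


def square_all(values):
    """Hypothetical stand-in for the GPU kernel. A real job would call
    into a Python+CUDA binding here; plain Python keeps this testable."""
    return [v * v for v in values]


def run_mapper(lines, out, batch_size=4):
    """Read 'key<TAB>number' lines, process the numbers in batches
    (GPUs prefer batches over single records), and write
    'key<TAB>result' lines to out."""
    keys, values = [], []

    def flush():
        # Hand the whole batch to the kernel, then emit one line per key.
        for key, result in zip(keys, square_all(values)):
            out.write(f"{key}\t{result}\n")
        keys.clear()
        values.clear()

    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition("\t")
        keys.append(key)
        values.append(float(value))
        if len(keys) >= batch_size:
            flush()
    if keys:
        flush()  # emit the final partial batch


# Local check with an in-memory buffer, no Hadoop involved.
demo = io.StringIO()
run_mapper(["a\t2\n", "b\t3\n"], demo)
print(demo.getvalue(), end="")
```

To run the same logic as a streaming mapper, the script would call
run_mapper(sys.stdin, sys.stdout) instead of the local check, since
streaming mappers read records on standard input and write key/value
lines on standard output.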


On 2/9/11, Adarsh Sharma <ad...@orkash.com> wrote:
> He Chen wrote:
>> Hi sharma
>>
>> I shared our slides about CUDA performance on Hadoop clusters. Feel
>> free to modified it, please mention the copyright!
>>
>> Chen
>>
>> On Wed, Feb 9, 2011 at 11:13 AM, He Chen <airbots@gmail.com> wrote:
>>
>>     Hi  Sharma
>>
>>     I have some experiences on working Hybrid Hadoop with GPU. Our
>>     group has tested CUDA performance on Hadoop clusters. We obtain 20
>>     times speedup and save up to 95% power consumption in some
>>     computation-intensive test case.
>>
>>     You can parallel your Java code by using JCUDA which is a kind of
>>     API to help you call CUDA in your Java code.
>>
>>     Chen
>>
>>
>>     On Wed, Feb 9, 2011 at 8:45 AM, Steve Loughran <stevel@apache.org> wrote:
>>
>>         On 09/02/11 13:58, Harsh J wrote:
>>
>>             You can check-out this project which did some work for
>>             Hama+CUDA:
>>             http://code.google.com/p/mrcl/
>>
>>
>>         Amazon let you bring up a Hadoop cluster on machines with GPUs
>>         you can code against, but I haven't heard of anyone using it.
>>         The big issue is bandwidth; it just doesn't make sense for a
>>         classic "scan through the logs" kind of problem as the
>>         disk:GPU bandwidth ratio is even worse than disk:CPU.
>>
>>         That said, if you were doing something that involved a lot of
>>         compute on a block of data (e.g. rendering tiles in a map),
>>         this could work.
>>
>>
>>
> Thanks Chen , I am looking for some White-Papers on the mentioned topic
> or concerning.
> I think no one has write any white paper on this topic Or I'm wrong.
>
> However U'r Ppt is very nice.
> Thanx Once again .
>
> Adarsh
>


-- 
Lance Norskog
goksron@gmail.com

Re: CUDA on Hadoop

Posted by Adarsh Sharma <ad...@orkash.com>.
He Chen wrote:
> Hi sharma
>
> I shared our slides about CUDA performance on Hadoop clusters. Feel 
> free to modified it, please mention the copyright!
>
> Chen
>
> On Wed, Feb 9, 2011 at 11:13 AM, He Chen <airbots@gmail.com> wrote:
>
>     Hi  Sharma
>
>     I have some experiences on working Hybrid Hadoop with GPU. Our
>     group has tested CUDA performance on Hadoop clusters. We obtain 20
>     times speedup and save up to 95% power consumption in some
>     computation-intensive test case. 
>
>     You can parallel your Java code by using JCUDA which is a kind of
>     API to help you call CUDA in your Java code.
>
>     Chen 
>
>
>     On Wed, Feb 9, 2011 at 8:45 AM, Steve Loughran <stevel@apache.org> wrote:
>
>         On 09/02/11 13:58, Harsh J wrote:
>
>             You can check-out this project which did some work for
>             Hama+CUDA:
>             http://code.google.com/p/mrcl/
>
>
>         Amazon let you bring up a Hadoop cluster on machines with GPUs
>         you can code against, but I haven't heard of anyone using it.
>         The big issue is bandwidth; it just doesn't make sense for a
>         classic "scan through the logs" kind of problem as the
>         disk:GPU bandwidth ratio is even worse than disk:CPU.
>
>         That said, if you were doing something that involved a lot of
>         compute on a block of data (e.g. rendering tiles in a map),
>         this could work.
>
>
>
Thanks Chen, I am looking for some white papers on this topic or a 
related one.
I think no one has written a white paper on this topic, or am I wrong?

However, your PPT is very nice.
Thanks once again.

Adarsh

Re: CUDA on Hadoop

Posted by He Chen <ai...@gmail.com>.
Thank you, Steve Loughran. I just created a new page on the Hadoop wiki.
However, how can I create a new document page on the Hadoop wiki?

Best wishes

Chen

On Thu, Feb 10, 2011 at 5:38 AM, Steve Loughran <st...@apache.org> wrote:

> On 09/02/11 17:31, He Chen wrote:
>
>> Hi sharma
>>
>> I shared our slides about CUDA performance on Hadoop clusters. Feel free
>> to
>> modified it, please mention the copyright!
>>
>
> This is nice. If you stick it up online you should link to it from the
> Hadoop wiki pages -maybe start a hadoop+cuda page and refer to it
>
>

Re: CUDA on Hadoop

Posted by Adarsh Sharma <ad...@orkash.com>.
Steve Loughran wrote:
> On 09/02/11 17:31, He Chen wrote:
>> Hi sharma
>>
>> I shared our slides about CUDA performance on Hadoop clusters. Feel 
>> free to
>> modified it, please mention the copyright!
>
> This is nice. If you stick it up online you should link to it from the 
> Hadoop wiki pages -maybe start a hadoop+cuda page and refer to it
>
Yes, this will be very helpful for others too. But this much 
information is not sufficient; we need more.



Best Regards

Adarsh Sharma




Re: CUDA on Hadoop

Posted by Steve Loughran <st...@apache.org>.
On 09/02/11 17:31, He Chen wrote:
> Hi sharma
>
> I shared our slides about CUDA performance on Hadoop clusters. Feel free to
> modified it, please mention the copyright!

This is nice. If you stick it up online, you should link to it from the 
Hadoop wiki pages; maybe start a hadoop+cuda page and refer to it.


Re: CUDA on Hadoop

Posted by He Chen <ai...@gmail.com>.
Hi Sharma

I shared our slides about CUDA performance on Hadoop clusters. Feel free to
modify them; please mention the copyright!

Chen

On Wed, Feb 9, 2011 at 11:13 AM, He Chen <ai...@gmail.com> wrote:

> Hi  Sharma
>
> I have some experiences on working Hybrid Hadoop with GPU. Our group has
> tested CUDA performance on Hadoop clusters. We obtain 20 times speedup and
> save up to 95% power consumption in some computation-intensive test case.
>
> You can parallel your Java code by using JCUDA which is a kind of API to
> help you call CUDA in your Java code.
>
> Chen
>
>
> On Wed, Feb 9, 2011 at 8:45 AM, Steve Loughran <st...@apache.org> wrote:
>
>> On 09/02/11 13:58, Harsh J wrote:
>>
>>> You can check-out this project which did some work for Hama+CUDA:
>>> http://code.google.com/p/mrcl/
>>>
>>
>> Amazon let you bring up a Hadoop cluster on machines with GPUs you can
>> code against, but I haven't heard of anyone using it. The big issue is
>> bandwidth; it just doesn't make sense for a classic "scan through the logs"
>> kind of problem as the disk:GPU bandwidth ratio is even worse than disk:CPU.
>>
>> That said, if you were doing something that involved a lot of compute on a
>> block of data (e.g. rendering tiles in a map), this could work.
>>
>
>

Re: CUDA on Hadoop

Posted by He Chen <ai...@gmail.com>.
Hi Sharma

I have some experience working on hybrid Hadoop with GPUs. Our group has
tested CUDA performance on Hadoop clusters. We obtained a 20x speedup and
saved up to 95% of power consumption in some computation-intensive test cases.

You can parallelize your Java code by using JCUDA, which is an API that
helps you call CUDA from your Java code.

Chen

On Wed, Feb 9, 2011 at 8:45 AM, Steve Loughran <st...@apache.org> wrote:

> On 09/02/11 13:58, Harsh J wrote:
>
>> You can check-out this project which did some work for Hama+CUDA:
>> http://code.google.com/p/mrcl/
>>
>
> Amazon let you bring up a Hadoop cluster on machines with GPUs you can code
> against, but I haven't heard of anyone using it. The big issue is bandwidth;
> it just doesn't make sense for a classic "scan through the logs" kind of
> problem as the disk:GPU bandwidth ratio is even worse than disk:CPU.
>
> That said, if you were doing something that involved a lot of compute on a
> block of data (e.g. rendering tiles in a map), this could work.
>

Re: CUDA on Hadoop

Posted by Steve Loughran <st...@apache.org>.
On 09/02/11 13:58, Harsh J wrote:
> You can check-out this project which did some work for Hama+CUDA:
> http://code.google.com/p/mrcl/

Amazon lets you bring up a Hadoop cluster on machines with GPUs you can 
code against, but I haven't heard of anyone using it. The big issue is 
bandwidth; it just doesn't make sense for a classic "scan through the 
logs" kind of problem as the disk:GPU bandwidth ratio is even worse than 
disk:CPU.

That said, if you were doing something that involved a lot of compute on 
a block of data (e.g. rendering tiles in a map), this could work.
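
To make the bandwidth argument concrete, here is a back-of-envelope
sketch. It assumes a simplified serial model (read from disk, copy to
the GPU, compute, with no overlap), and every number in it is an
illustrative assumption rather than a measurement. The function names
and parameters are hypothetical.

```python
def job_seconds(data_gb, disk_gb_per_s, cpu_s_per_gb,
                gpu_speedup=1.0, pcie_gb_per_s=None):
    """Estimate seconds for one pass over the data.

    Serial model (an assumption): disk read, then an optional copy to
    the GPU over PCIe, then compute. Stages do not overlap.
    """
    read = data_gb / disk_gb_per_s
    transfer = data_gb / pcie_gb_per_s if pcie_gb_per_s else 0.0
    compute = data_gb * cpu_s_per_gb / gpu_speedup
    return read + transfer + compute


def overall_speedup(data_gb, disk_gb_per_s, cpu_s_per_gb,
                    gpu_speedup, pcie_gb_per_s):
    """End-to-end speedup of the GPU path over the CPU-only path."""
    cpu = job_seconds(data_gb, disk_gb_per_s, cpu_s_per_gb)
    gpu = job_seconds(data_gb, disk_gb_per_s, cpu_s_per_gb,
                      gpu_speedup, pcie_gb_per_s)
    return cpu / gpu


# Illustrative inputs only, not measurements.
DATA_GB = 100.0
DISK_GB_PER_S = 0.1   # roughly one spinning disk
PCIE_GB_PER_S = 5.0   # host-to-GPU copy rate
GPU_SPEEDUP = 20.0    # speedup of the compute stage alone

# Log scan: very little compute per GB, so disk time dominates.
log_scan = overall_speedup(DATA_GB, DISK_GB_PER_S, 1.0,
                           GPU_SPEEDUP, PCIE_GB_PER_S)
# Tile rendering: heavy compute per GB, so compute time dominates.
tile_render = overall_speedup(DATA_GB, DISK_GB_PER_S, 200.0,
                              GPU_SPEEDUP, PCIE_GB_PER_S)

print(f"log scan: {log_scan:.2f}x")
print(f"tile rendering: {tile_render:.2f}x")
```

With these made-up inputs the log scan gains only about 7% end to end,
because the disk read takes the same time either way, while the
compute-heavy case gains roughly 10x.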

Re: CUDA on Hadoop

Posted by Harsh J <qw...@gmail.com>.
You can check out this project which did some work for Hama+CUDA:
http://code.google.com/p/mrcl/

On Wed, Feb 9, 2011 at 6:38 PM, Adarsh Sharma <ad...@orkash.com> wrote:
> Dear all,
>
> I am going to work on a Project that includes " Working on CUDA in Hadoop
> Environment ".
>
> I work on Hadoop Platforms ( Hadoop, Hive, Hbase, Map-Reduce ) from the past
> 8 months.
>
> If anyone has some working experience or some pointers to basic steps
> includes Basic Introduction, Configuring & Running CUDA programs in Hadoop
> Cluster , any White Papers or any sort of helpful information, Please let me
> know through links or materials.
>
> I shall be grateful for any kindness.
>
>
>
> Thanks & Best Regards
>
> Adarsh Sharma
>



-- 
Harsh J
www.harshj.com

RE: CUDA on Hadoop

Posted by Michael Segel <mi...@hotmail.com>.

First, CUDA means C/C++ on the GPU, so if you're going to do M/R you will need to bone up on your JNI.
Second, I'd make sure you have written some CUDA-modified code first and tested it outside of the M/R framework.


Beyond that... I'd say it was still leading edge.
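
One way to act on the second point above is to compare the GPU routine
against a slow but obviously correct CPU reference before any M/R code
is involved. The sketch below assumes a SAXPY-style operation
(a*x + y) purely as an example; reference_saxpy and check_kernel are
hypothetical names, and the kernel argument is wherever your real CUDA
wrapper would be plugged in.

```python
import math
import random


def reference_saxpy(a, xs, ys):
    """Plain CPU reference: element-wise a*x + y."""
    return [a * x + y for x, y in zip(xs, ys)]


def check_kernel(kernel, trials=20, size=64, tol=1e-6, seed=0):
    """Return True if kernel(a, xs, ys) matches the CPU reference on
    random inputs, within a relative/absolute tolerance."""
    rng = random.Random(seed)  # fixed seed so failures are reproducible
    for _ in range(trials):
        a = rng.uniform(-10.0, 10.0)
        xs = [rng.uniform(-1000.0, 1000.0) for _ in range(size)]
        ys = [rng.uniform(-1000.0, 1000.0) for _ in range(size)]
        got = kernel(a, xs, ys)
        want = reference_saxpy(a, xs, ys)
        if len(got) != len(want):
            return False
        for g, w in zip(got, want):
            if not math.isclose(g, w, rel_tol=tol, abs_tol=tol):
                return False
    return True


def buggy_kernel(a, xs, ys):
    """Deliberately wrong kernel, to show the checker catching a bug."""
    return [a * x + y + 1.0 for x, y in zip(xs, ys)]


print("reference passes:", check_kernel(reference_saxpy))
print("buggy kernel passes:", check_kernel(buggy_kernel))
```

Only once the real kernel passes a check like this would it be wired
into a mapper, so that any later failure can be attributed to the M/R
integration rather than to the GPU code.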

> Date: Wed, 9 Feb 2011 18:38:41 +0530
> From: adarsh.sharma@orkash.com
> To: common-user@hadoop.apache.org
> Subject: CUDA on Hadoop
> 
> Dear all,
> 
> I am going to work on a Project that includes " Working on CUDA in 
> Hadoop Environment ".
> 
> I work on Hadoop Platforms ( Hadoop, Hive, Hbase, Map-Reduce ) from the 
> past 8 months.
> 
> If anyone has some working experience or some pointers to basic steps 
> includes Basic Introduction, Configuring & Running CUDA programs in 
> Hadoop Cluster , any White Papers or any sort of helpful information, 
> Please let me know through links or materials.
> 
> I shall be grateful for any kindness.
> 
> 
> 
> Thanks & Best Regards
> 
> Adarsh Sharma