Posted to mapreduce-user@hadoop.apache.org by "Basu,Indrashish" <in...@ufl.edu> on 2014/03/01 19:52:11 UTC

Drawbacks of Hadoop Pipes

Hello,

I am trying to run a CUDA benchmark in the Hadoop framework, using
Hadoop Pipes to invoke the CUDA code through a C++ interface. I would
like to know what the drawbacks of using Hadoop Pipes for this are, and
whether Hadoop Streaming or a JNI interface would be a better choice.
I am a bit unclear on this, so I would appreciate it if anyone could
shed some light on it.

Regards,
Indrashish

-- 
Indrashish Basu
Graduate Student
Department of Electrical and Computer Engineering
University of Florida

Re: Drawbacks of Hadoop Pipes

Posted by Silvina Caíno Lores <si...@gmail.com>.
Hi there,

I've been working with Pipes for some months and have finally managed to
get it working the way I wanted with some legacy code I had. However, I ran
into many issues, not only with my implementation (it had to be adapted in
several ways to fit Pipes, which is very restrictive) but also with Pipes
itself (bugs, obscure errors, and a lack of proper logging, which led to
some maddening debugging).

I also tried Streaming, but I found it even more complex to debug, and I hit
some deal-breaker errors related to buffering and the like that I couldn't
overcome. I also tried a SWIG interface to wrap my code into a Java
library. I'd never recommend that: you might end up introducing a lot of
memory issues and potential bugs into your already working code, and you
basically don't get anything useful from it.
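
For reference, a Hadoop Streaming mapper is just an executable that reads
input records line by line from stdin and writes key<TAB>value lines to
stdout. The following is only a minimal sketch of that contract, not the
setup described above; the run_mapper name and the word-count logic are
hypothetical, chosen for illustration. It flushes explicitly at the end, on
the assumption that unflushed output is one way buffering trouble can arise.

```python
import io


def run_mapper(stdin, stdout):
    """Read text lines from stdin and emit one "word<TAB>1" record per word."""
    for line in stdin:
        for word in line.split():
            # Streaming treats the text before the first tab as the key.
            stdout.write(f"{word}\t1\n")
    # Flush explicitly so no buffered records are lost when the task exits.
    stdout.flush()


# Local demo on in-memory streams. Under Hadoop Streaming the call would
# instead be run_mapper(sys.stdin, sys.stdout), after "import sys".
demo_out = io.StringIO()
run_mapper(io.StringIO("gpu cpu gpu\n"), demo_out)
print(demo_out.getvalue(), end="")
```

The same stdin/stdout contract applies to a compiled C++ executable, which is
why Streaming is sometimes considered for wrapping native code.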

I've never worked with CUDA, but it shouldn't be any different from
my Hadoop Pipes deployment apart from the specific libraries you need.
Nevertheless, be prepared to deal with configuration issues and many
esoteric logs.

My advice, based on my experience, is that you should be 99% sure that your
original code is solid before migrating to Hadoop Pipes; you will have
enough problems there anyway.

Good luck with your work :)
Regards,
Silvina


On 3 March 2014 16:11, Basu,Indrashish <in...@ufl.edu> wrote:

>
> Hello,
>
> Anyone can help regarding the below query.
>
> Regards,
> Indrashish
>
>
> On Sat, 01 Mar 2014 13:52:11 -0500, Basu,Indrashish wrote:
>
>> Hello,
>>
>> I am trying to execute a CUDA benchmark in a Hadoop Framework and
>> using Hadoop Pipes for invoking the CUDA code which is written in a
>> C++ interface from the Hadoop Framework. I am just a bit interested in
>> knowing what can be the drawbacks of using Hadoop Pipes for this and
>> whether the implementation of Hadoop Streaming and JNI interface will
>> be a better choice. I am a bit unclear on this, so if anyone can throw
>> some light on this and clarify.
>>
>> Regards,
>> Indrashish
>>
>
> --
> Indrashish Basu
> Graduate Student
> Department of Electrical and Computer Engineering
> University of Florida
>

Re: Drawbacks of Hadoop Pipes

Posted by "Basu,Indrashish" <in...@ufl.edu>.
Hello,

Can anyone help with the query below?

Regards,
Indrashish

On Sat, 01 Mar 2014 13:52:11 -0500, Basu,Indrashish wrote:
> Hello,
>
> I am trying to execute a CUDA benchmark in a Hadoop Framework and
> using Hadoop Pipes for invoking the CUDA code which is written in a
> C++ interface from the Hadoop Framework. I am just a bit interested in
> knowing what can be the drawbacks of using Hadoop Pipes for this and
> whether the implementation of Hadoop Streaming and JNI interface will
> be a better choice. I am a bit unclear on this, so if anyone can throw
> some light on this and clarify.
>
> Regards,
> Indrashish

-- 
Indrashish Basu
Graduate Student
Department of Electrical and Computer Engineering
University of Florida
