Posted to common-user@hadoop.apache.org by praveenesh kumar <pr...@gmail.com> on 2012/01/29 08:24:07 UTC

Any info on R+Hadoop

Has anyone done any work with "R" + Hadoop?

I know there are some flavors of R+Hadoop available, such as "rmr", "rhdfs",
"RHIPE", and "R-hive".

But as far as I know, submitting jobs using Hadoop Streaming is the best way
currently available. Am I right?


Any info on R on Hadoop?

Thanks,
Praveenesh

Re: Any info on R+Hadoop

Posted by praveenesh kumar <pr...@gmail.com>.
Yeah, but I am facing a weird situation in which my RHadoop job (using
rmr) takes much more time than my Hadoop Streaming job in R. So I wanted
to see whether others have faced the same problem, or whether anyone has
done a performance evaluation of Revolution's rmr?

Thanks,
Praveenesh

On Mon, Jan 30, 2012 at 11:01 AM, Prashant Sharma <pr...@imaginea.com> wrote:

> Praveenesh,
> Well, it gives you more convenience :). If you have worked with R, you
> might notice that with rmr you can write a mapper much like an lapply call.
> They have already abstracted a lot of things for you, so you have less
> control over the details. But as far as convenience is concerned, it's damn
> cool. For example, you can process data inside R using Hadoop (no doubt it
> uses Hadoop Streaming behind the scenes) and have the processed data easily
> loaded back into the R command line from HDFS (using rhdfs). Generally, R
> developers do not like getting bogged down in the hassles that Hadoop
> Streaming can bring.
>
> -P
>
> P.S. I am not endorsing anyone. It's just my view.

Re: Any info on R+Hadoop

Posted by Prashant Sharma <pr...@imaginea.com>.
Praveenesh,
Well, it gives you more convenience :). If you have worked with R, you
might notice that with rmr you can write a mapper much like an lapply call.
They have already abstracted a lot of things for you, so you have less
control over the details. But as far as convenience is concerned, it's damn
cool. For example, you can process data inside R using Hadoop (no doubt it
uses Hadoop Streaming behind the scenes) and have the processed data easily
loaded back into the R command line from HDFS (using rhdfs). Generally, R
developers do not like getting bogged down in the hassles that Hadoop
Streaming can bring.

-P

P.S. I am not endorsing anyone. It's just my view.
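
For readers who have not used Hadoop Streaming directly, here is a rough
sketch of the plumbing that a wrapper such as rmr is described above as
hiding. With Streaming, the mapper and reducer exchange plain text records
(by default key, tab, value), and the reducer receives its input sorted by
key. The sketch is in Python rather than R so that it runs without a
cluster. The names mapper, reducer and run_local are hypothetical, and
run_local only imitates the framework's sort step locally. This is not
rmr's API.

```python
from itertools import groupby


def record_key(record):
    """Return the key part of a 'key<TAB>value' record."""
    return record.split("\t", 1)[0]


def mapper(line):
    """Emit one 'word<TAB>1' record per whitespace-separated word."""
    return [f"{word}\t1" for word in line.split()]


def reducer(sorted_records):
    """Sum the counts of each run of identical keys in key-sorted input."""
    for key, group in groupby(sorted_records, key=record_key):
        total = sum(int(rec.split("\t", 1)[1]) for rec in group)
        yield f"{key}\t{total}"


def run_local(lines):
    """Hypothetical stand-in for the framework: map, sort by key, reduce."""
    mapped = [rec for line in lines for rec in mapper(line)]
    return list(reducer(sorted(mapped, key=record_key)))


if __name__ == "__main__":
    for rec in run_local(["r hadoop r", "hadoop streaming"]):
        print(rec)
```

In a real Streaming job the two functions would be separate scripts reading
stdin and writing stdout. Parsing and formatting those records by hand is
the kind of hassle mentioned above, as opposed to passing a map function to
a wrapper and getting R objects back.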
