Posted to general@hadoop.apache.org by Evert Lammerts <Ev...@sara.nl> on 2010/07/05 10:12:43 UTC

Hadoop versions & distributions

There are a number of different versions and distributions of Hadoop which,
as far as I understand, all differ from each other. I know that in the
0.20-append branch, files in HDFS can be appended, and that the Y!
distribution (0.20.S) implements security features through Kerberos. And
then there are the 0.20.3 and 0.22.0 branches. And trunk of course, which I
guess is 0.20.2 nowadays? In addition to that there are distributions by
Cloudera (CDH2 / 3beta) and IBM (IDAH).

From my perspective, setting up a pilot cluster for a small number of users
from different institutes, security (0.20.S) is very attractive - scientists
like the idea of shielding their data and logic from other users. But what
will I miss if I choose Y!'s distribution over all of these other options?

Regards,

Evert Lammerts


Re: Hadoop versions & distributions

Posted by Todd Lipcon <to...@cloudera.com>.
On Mon, Jul 5, 2010 at 1:12 AM, Evert Lammerts <Ev...@sara.nl> wrote:

>  There are a number of different versions and distributions of Hadoop
> which, as far as I understand, all differ from each other. I know that in
> the 0.20-append branch, files in HDFS can be appended, and that the Y!
> distribution (0.20.S) implements security features through Kerberos. And
> then there are the 0.20.3 and 0.22.0 branches. And trunk of course, which I
> guess is 0.20.2 nowadays? In addition to that there are distributions by
> Cloudera(CDH2 / 3beta) and IBM (IDAH).
>
>
>
> From my perspective, setting up a pilot cluster for a small number of users
> from different institutes, security (0.20.S) is very attractive – scientists
> like the idea of shielding their data and logic from other users. But what
> will I miss if I choose Y!’s distribution over all of these other options?
>
>
Hi Evert,

Y!'s distribution does contain a good set of patches, and we at Cloudera are
always keeping track of the ydist git repository to incorporate those
changes into CDH. Currently, ydist contains the security patch series, but
doesn't include the recent append work. CDH3b2 includes the append work but
not security yet; we are currently integrating the security patches, and
they should be available in the next beta.

Aside from the specific patches included, it's worth noting that the Y! dist
is a Git repository rather than a full binary-and-source distribution of
Hadoop and related tools. CDH not only includes the core Hadoop components
but also integrates many other important ecosystem components, including Pig,
Hive, Oozie, HBase, ZooKeeper, and Flume.

Thanks
-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera