Posted to mapreduce-user@hadoop.apache.org by Kasi Subrahmanyam <ka...@gmail.com> on 2013/01/28 04:35:03 UTC

Choose a distribution

How do I decide which distribution (Cloudera, MapR, Hortonworks) to use when
working with Hadoop?

Re: Choose a distribution

Posted by Ankam Venkat <an...@gmail.com>.
I would list my organization's technical needs (HBase, Cassandra,
Hive, Flume, Sqoop, Mahout, etc., including future requirements) and then
see which distribution fits my use cases. In general, consider
the points below to make the right choice of distribution.

First choose one of two approaches: 1) build a custom stack, or 2) use a distribution.

-  Choose your approach (custom or distribution) based on your team’s
skills, whether this is a tactical experiment or a strategic initiative, and
how well the available distributions fit your use case.
-  Balance the enterprise’s longer-term needs against immediate pressures to
deliver, and consider projects that may be needed for future initiatives.
-  Evaluate the distribution vendor as a whole if you plan to run Hadoop for
the long term. Look at financial viability, support capabilities,
partnerships, and future technology plans.
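
As a rough illustration of the "list your needs, then see which distribution
fits" step, the comparison can be sketched as a simple coverage check. This is
only a sketch: the function name rank_distributions, the candidate names, and
the component lists below are hypothetical placeholders, not the actual
contents of any vendor's distribution.

```python
def rank_distributions(required, candidates):
    """Rank candidates by the share of required components each one covers."""
    required = set(required)
    results = []
    for name, components in candidates.items():
        missing = required - set(components)
        if required:
            coverage = (len(required) - len(missing)) / len(required)
        else:
            coverage = 1.0  # nothing required, so everything qualifies
        results.append((name, coverage, sorted(missing)))
    # Highest coverage first; ties are broken alphabetically by name.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


# Hypothetical placeholder data: fill in from your own evaluation.
required = ["HBase", "Hive", "Flume", "Sqoop", "Mahout"]
candidates = {
    "distribution_a": ["HBase", "Hive", "Flume", "Sqoop"],
    "distribution_b": ["HBase", "Hive", "Flume", "Sqoop", "Mahout"],
    "custom_stack": ["HBase", "Hive"],
}

for name, coverage, missing in rank_distributions(required, candidates):
    print(f"{name}: {coverage:.0%} of requirements covered, missing {missing}")
```

Component coverage is only one input; the vendor-level factors in the list
above (support, viability, roadmap) still have to be weighed by hand.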

Regards,
Venkat Ankam




On Mon, Jan 28, 2013 at 9:05 AM, Kasi Subrahmanyam <ka...@gmail.com> wrote:

> How do I decide which distribution (Cloudera, MapR, Hortonworks) to use when
> working with Hadoop?
>

Re: Choose a distribution

Posted by Al Thompson <at...@ngmoco.com>.
Hi Kasi:

I think all three companies have great engineering teams and produce excellent
distributions of Hadoop.

Below is my $0.02, a quick-and-dirty analysis of the question.
I don't have hands-on experience with all of these in production, so take
this with abundant grains of salt.

* If you need to support production environments and want corporate
support, use Cloudera.
* If you want to stay close to Apache Hadoop and stay current with open
source developments, use Hortonworks.
* If you need absolute native performance or need to run on Microsoft
Windows, use MapR.

Other considerations may apply, such as your choice of real-time
big data platform (e.g., Storm or Impala) and whether you want to
integrate patches from other sources, such as Facebook's AvatarNode
(HDFS-976) or Prism (patch presumed forthcoming). You and your team are
the best judges of these concerns.

Regards,
Al

On Sun, Jan 27, 2013 at 7:35 PM, Kasi Subrahmanyam <ka...@gmail.com> wrote:

> How do I decide which distribution (Cloudera, MapR, Hortonworks) to use when
> working with Hadoop?
>