Posted to user@hadoop.apache.org by Hadoop Raj <ha...@yahoo.com> on 2013/09/12 19:19:36 UTC

Cloudera Vs Hortonworks Vs MapR

Hi,

We are trying to evaluate different implementations of Hadoop for our big data enterprise project.

Can the forum members advise on what are the advantages and disadvantages of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.

Thanks in advance.

Regards,
Raj

RE: Cloudera Vs Hortonworks Vs MapR

Posted by "Smith, Joshua D." <Jo...@gd-ais.com>.
Cloudera has the widest distribution and distinguishes itself with Cloudera Impala, Cloudera Search and Sentry (all open source). It also comes with Cloudera Manager which is proprietary, but free for selected functionality.

Hortonworks distinguishes itself as being pure open source (no proprietary extensions) and being able to run on Microsoft Windows as well as Linux. Hortonworks comes with Ambari to perform functions similar to Cloudera Manager, but Ambari is open source.

MapR has a number of proprietary pieces. They distinguish themselves based on performance.

Of course every vendor may disagree with one or more of the characterizations that I've given above, but that's how I've come to view them. Of course, the landscape is always changing, so you'll have to evaluate the current offerings.

Josh

-----Original Message-----
From: Hadoop Raj [mailto:hadoopraj@yahoo.com] 
Sent: Thursday, September 12, 2013 1:20 PM
To: User
Subject: Cloudera Vs Hortonworks Vs MapR

Hi,

We are trying to evaluate different implementations of Hadoop for our big data enterprise project.

Can the forum members advise on what are the advantages and disadvantages of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.

Thanks in advance.

Regards,
Raj

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Suresh Srinivas <su...@hortonworks.com>.
Raj,

You can also use Apache Hadoop releases. Bigtop also does a fine job of
putting together a consumable Hadoop stack.

As regards vendor solutions, this is not the right forum. There are
other forums for this. Please refrain from this type of discussion on
Apache forums.

Regards,
Suresh


On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:

> Hi,
>
> We are trying to evaluate different implementations of Hadoop for our big
> data enterprise project.
>
> Can the forum members advise on what are the advantages and disadvantages
> of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>
> Thanks in advance.
>
> Regards,
> Raj




-- 
http://hortonworks.com/download/


Re: Cloudera Vs Hortonworks Vs MapR

Posted by Chris Embree <ce...@gmail.com>.
The only problem is the way such discussions degenerate.  See the
years-long threads around vi vs. emacs, Windows vs. Linux, Java vs.
C/Python/Perl/Ruby.


On 9/13/13, Chris Mattmann <ma...@apache.org> wrote:
> Errr, what's wrong with discussing these types of issues on list?
>
> Nothing public here, and as long as it's kept to facts, this should
> not be a problem and Apache is a fine place to have such discussions.
>
> My 2c.
>
>
>
>
>
> -----Original Message-----
> From: Xuri Nagarin <se...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Thursday, September 12, 2013 4:39 PM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>
>>I understand it can be contentious issue especially given that a lot of
>>contributors to this list work for one or the other vendor or have some
>>stake in any kind of evaluation. But, I see no reason why users should
>>not be able to compare notes
>> and share experiences. Over time, genuine pain points or issues or
>>claims will bubble up and should only help the community. Sure, there
>>will be a few flame wars but this already isn't a very tightly moderated
>>list.
>>
>>
>>
>>
>>
>>
>>
>>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>><ae...@maprtech.com> wrote:
>>
>>Raj,
>>
>>
>>As others noted, this is not a great place for this discussion.  I'd
>>suggest contacting the vendors you are interested in as I'm sure we'd all
>>be happy to provide you more details.
>>
>>
>>I don't know about the others, but for MapR, just send an email to
>>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get back
>>to you with more information.
>>
>>
>>Best Regards,
>>Aaron Eng
>>
>>
>>
>>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:
>>
>>
>>Hi,
>>
>>We are trying to evaluate different implementations of Hadoop for our big
>>data enterprise project.
>>
>>Can the forum members advise on what are the advantages and disadvantages
>>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>>
>>Thanks in advance.
>>
>>Regards,
>>Raj
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
>

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Chris Embree <ce...@gmail.com>.
Our evaluation was similar except we did not consider the "management"
tools any vendor provided as that's just as much lock in as any proprietary
tool.  What if I want to trade vendors?  Do I have to re-tool to use their mgmt?
 Nope, wrote our own.

Being in a large enterprise, we went with the "perceived" more stable
platform.  Draw your own conclusions.


On Mon, Sep 16, 2013 at 6:10 PM, Xuri Nagarin <se...@gmail.com> wrote:

> So I will try to answer the OP's question best I can without deviating too
> much into opinions and stick to facts. Disclaimer: I am not an employee of
> either vendor or any partner of theirs.
>
> Context is important: My team's use case was general data exploration of
> semi-structured log data and we had no typical data-warehouse type of
> existing use cases. Also, ours is a small cluster (fewer than 30 nodes). In
> terms of ops/maintenance, we only have one person. I point this out because
> lots of Hadoop shops have a dedicated team for each - OS administration,
> Hadoop admin, Hadoop developers. And, they are very mature in terms of
> their compute use cases. To my mind, these aspects can significantly impact
> your vendor choices.
>
> MapR: My team simply did not consider them because of all the proprietary
> code in there. We are trying to move from a monolithic proprietary product
> and one of the criteria we set was - if we decided to move away from the
> chosen hadoop vendor, can we easily unlock our data?
> HortonWorks: Distro uses HDFS 1.x with MRv2. All open source. Cluster
> management is via Ambari. Compared to Cloudera's CM, Ambari has very
> rudimentary features. But you have to keep in mind that Ambari is only a
> year old whereas CM has already been under development for several years.
> This was a major selection factor for us because Ambari did not have all
> the automation/feature-set that CM offers for a single
> administrator/developer to easily maintain the cluster. Also, during the
> trial period, Hortonworks' packaging format/structure apparently kept
> changing, which made things a bit difficult to centrally deploy/administer.
>
> Cloudera: Distro uses HDFS 2.x with MRv1. All open source except cluster
> management, which is via their proprietary Cloudera Manager tool. It is free
> to use, minus certain features such as auditing and cluster replication.
> Maybe a few more features are restricted to the
> Enterprise/Licensed-only version. Offers many more features than Ambari. In
> terms of cluster administration, I found CM much easier to work with than
> Ambari. Pretty much all aspects, from deploying new nodes to configuration
> and troubleshooting, are much more refined than in Ambari.
>
> During the selection process, what I found was that both vendors are very
> aggressive in their pitch. So much so that each pushes some FUD regarding
> the competition.
>
> HW uses HDFS 1.x + MRv2 while CDH uses HDFS 2.x + MRv1. HW claimed that
> Cloudera's distro is heavily patched off-course from the core Apache trunk
> that can cause severe data corruption issues. Yes, Cloudera has some 1500+
> patches over apache's Hadoop distro but (1) they aren't private patches.
> You can pull the list and verify that yourself just as I did. (2) In our
> testing and talking to other Cloudera customers, I couldn't find any issues
> with data corruption. It is true though that HDFS 2.x is still in beta but
> so is MRv2 that HW uses. I think both are stable and work well - depending
> on what you need but each uses that point to create FUD.
>
> HW also claimed that a new SQL engine that Cloudera's including in their
> distro - Impala is proprietary. Not true. The software is open source. But
> if you want support for Impala then Cloudera will charge you separately per
> node for Impala over and above what they charge per node for Hadoop support.
>
> In my experience, both products have plenty of issues when it comes to
> compute engines - Hive, Pig etc. - and their cluster management software. HDFS
> seems to be solid in both distros. So I wouldn't call either of them
> trouble-free and neither is at the maturity level of other popular
> enterprise products like say, Oracle. That said, you have to keep in mind
> that both vendors/products are successfully used by several customers so
> again, it is more a question of what fits your needs.
>
> In the end, we chose to go with Cloudera mostly because of a more positive
> experience with CM in terms of administration/operations and with their
> pre-sales team when compared to HW. Again, that said, another team that we
> closely work with chose HW for their cluster. I use both vendors/clusters
> at work and neither has any significant issues.
>
>
>
>
> On Sat, Sep 14, 2013 at 12:37 PM, Chris Mattmann <ma...@apache.org>wrote:
>
>> Here's the deal, folks can post questions to the list that aren't
>> abusive and simply asking what the difference between different vendor
>> implementations (downstream) of Apache  Hadoop is not an inflammatory
>> or abusive question.
>>
>> Stick to the facts. Discuss it here. Why should the Apache Hadoop
>> PMC push off potentially useful questions that may have upstream
>> implications to the Apache  Hadoop core and let all the innovation
>> occur downstream?
>>
>> Have the conversations here if you'd like. I wouldn't turn anyone
>> away..
>>
>> My 2c.
>>
>> Cheers,
>> Chris
>>
>> ----Original Message-----
>>
>> From: Shahab Yunus <sh...@gmail.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Date: Friday, September 13, 2013 10:48 AM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>
>> >I think, in my opinion, it is a wrong idea because:
>> >
>> >
>> >1- Many of the participants here are employees of the very companies
>> >that are under discussion. This puts those employees in a very
>> >difficult position. It is very hard to come up with a correct response.
>> >Comments can be misconstrued easily.
>> >2- Also, when we talk about vendor distributions of the software, it is
>> >no longer purely about open source. Now companies with the related
>> >corporate legal baggage also get in the mix.
>> >3- The discussion would cover not only the positives of each vendor
>> >but also the negatives. The latter type of discussion can get
>> >unpleasant very easily.
>> >
>> >4- Somebody mentioned that, this is a very lightly moderated platform and
>> >thus this discussion should be allowed. I think this is one of the
>> >reasons that it should not be because, people can say things casually,
>> >without much thought, or without taking
>> > care of the context or the possible interpretations and get in trouble.
>> >5- The risk here is not only that serious repercussions can occur (which
>> >very well can) but the greater risk is that it can cause misunderstanding
>> >between individuals, industries and companies.
>> >6-People here lot of time reply quickly just to resolve or help the
>> >'technical' issue. Now they will have to take care how they frame the
>> >response. Re: 4
>> >
>> >
>> >I know some will feel that I have created a highly exaggerated scenario
>> >above, but what I am trying to say is that, it is a slippery slope. If we
>> >allow this then this can go anywhere.
>> >
>> >
>> >By the way, I do not work for any of these vendors.
>> >
>> >
>> >More importantly, I am not saying that this discussion should not be had,
>> >I am just saying that this is a wrong forum.
>> >
>> >
>> >Just my 2 cents (or,...this was rather a dollar.)
>> >
>> >
>> >Regards,
>> >Shahab
>> >
>> >
>> >
>> >
>> >On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
>> ><ma...@apache.org> wrote:
>> >
>> >Errr, what's wrong with discussing these types of issues on list?
>> >
>> >Nothing public here, and as long as it's kept to facts, this should
>> >not be a problem and Apache is a fine place to have such discussions.
>> >
>> >My 2c.
>> >
>> >
>> >
>> >
>> >
>> >-----Original Message-----
>> >From: Xuri Nagarin <se...@gmail.com>
>> >Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> >Date: Thursday, September 12, 2013 4:39 PM
>> >To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> >Subject: Re: Cloudera Vs Hortonworks Vs MapR
>> >
>> >>I understand it can be contentious issue especially given that a lot of
>> >>contributors to this list work for one or the other vendor or have some
>> >>stake in any kind of evaluation. But, I see no reason why users should
>> >>not be able to compare notes
>> >> and share experiences. Over time, genuine pain points or issues or
>> >>claims will bubble up and should only help the community. Sure, there
>> >>will be a few flame wars but this already isn't a very tightly moderated
>> >>list.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>> >><ae...@maprtech.com> wrote:
>> >>
>> >>Raj,
>> >>
>> >>
>> >>As others noted, this is not a great place for this discussion.  I'd
>> >>suggest contacting the vendors you are interested in as I'm sure we'd
>> all
>> >>be happy to provide you more details.
>> >>
>> >>
>> >>I don't know about the others, but for MapR, just send an email to
>> >>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
>> back
>> >>to you with more information.
>> >>
>> >>
>> >>Best Regards,
>> >>Aaron Eng
>> >>
>> >>
>> >>
>> >>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
>> wrote:
>> >>
>> >>
>> >>Hi,
>> >>
>> >>We are trying to evaluate different implementations of Hadoop for our
>> big
>> >>data enterprise project.
>> >>
>> >>Can the forum members advise on what are the advantages and
>> disadvantages
>> >>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>> >>
>> >>Thanks in advance.
>> >>
>> >>Regards,
>> >>Raj
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Xuri Nagarin <se...@gmail.com>.
I simply stated the process my team went through, but in hindsight, now
that I understand the Hadoop ecosystem better, I think yes, given that MapR
uses HDFS, we could simply use distcp to move data out. To be honest,
neither HW nor Cloudera made any claims to us about MapR. I think I found
some articles on the web that compared the three distros where it said MapR
is mostly proprietary.
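
For reference, here is a minimal sketch of what the distcp export mentioned above could look like when driven from Python. The hostnames, ports and paths are hypothetical placeholders, and the helper names are made up for illustration; only the `hadoop distcp` command and its `-update` option come from Hadoop itself. With dry_run left on, nothing is executed and the command line is only returned for inspection.

```python
import shlex
import subprocess


def build_distcp_command(src_uri, dst_uri, update=True):
    """Return the argv for a `hadoop distcp` copy between two cluster URIs."""
    cmd = ["hadoop", "distcp"]
    if update:
        # -update skips files that already exist unchanged at the destination
        cmd.append("-update")
    cmd += [src_uri, dst_uri]
    return cmd


def run_distcp(src_uri, dst_uri, dry_run=True):
    """Run the copy, or just return the command line when dry_run is True."""
    cmd = build_distcp_command(src_uri, dst_uri)
    if dry_run:
        return " ".join(shlex.quote(part) for part in cmd)
    # Assumes the `hadoop` client is installed and configured on this machine.
    subprocess.run(cmd, check=True)
    return None


if __name__ == "__main__":
    # Hypothetical source and destination NameNode addresses.
    print(run_distcp("hdfs://old-nn:8020/data/logs",
                     "hdfs://new-nn:8020/data/logs"))
```

Whether this works against a given vendor's cluster depends on which filesystem URI schemes that distribution exposes, so treat it as a starting point for an evaluation rather than a tested migration recipe.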

So if I were doing the evaluation again, I would probably include MapR
this time. Another factor is that most teams I know that use Hadoop are
using CDH, HW or the Apache distro, so there is a bit of inertia in
evaluating something that you aren't sure is being used much by your
peers.

On Mon, Sep 16, 2013 at 8:37 PM, M. C. Srivas <mc...@gmail.com> wrote:

>
> So here's an example of marketing FUD at work.
>
> On Mon, Sep 16, 2013 at 3:10 PM, Xuri Nagarin <se...@gmail.com> wrote:
>
>> So I will try to answer the OP's question best I can without deviating
>> too much into opinions and stick to facts. Disclaimer: I am not an employee
>> of either vendor or any partner of theirs.
>>
>> Context is important: My team's use case was general data exploration of
>> semi-structured log data and we had no typical data-warehouse type of
>> existing use cases. Also, ours is a small cluster (fewer than 30 nodes). In
>> terms of ops/maintenance, we only have one person. I point this out because
>> lots of Hadoop shops have a dedicated team for each - OS administration,
>> Hadoop admin, Hadoop developers. And, they are very mature in terms of
>> their compute use cases. To my mind, these aspects can significantly impact
>> your vendor choices.
>>
>> MapR: My team simply did not consider them because of all the proprietary
>> code in there. We are trying to move from a monolithic proprietary product
>> and one of the criteria we set was - if we decided to move away from the
>> chosen hadoop vendor, can we easily unlock our data?
>>
>
> Unlock your data? How about distcp? Or just "cp"?
>
> The fact is there are 10x more standard ways to access your data in a
> MapR cluster versus a Cloudera or Hortonworks cluster.
>


Yes, Cloudera has proprietary CM but I really don't think HW has any
proprietary code. Can you point at any? In fact, even for Cloudera, other
than CM, what's proprietary? I ask not rhetorically but to clear up the
facts. I did not find any, but if you know of any proprietary code please
let everyone know.


>
> MapR is entirely open source, with proprietary add-ons, just like Cloudera
> or Hortonworks.
>
> The difference is MapR has innovated both above and below the Hadoop
> stack, while Cloudera and Horton have only done so above the stack. MapR's
> innovations have set the bar so high that its competition likes to spread
> FUD.
>
>
As a user/customer, the best suggestion I can make, since you are a MapR
employee, is to not focus on other distros and instead point out facts about
your product. The biggest turn-off in the purchase cycle for me was the way
Cloudera and HW attack each other, a lot of the time with FUD. If they
simply presented facts, I think I am intelligent enough to tell the
differences. Trying to convince a potential customer by attacking the
competition sort of assumes that the customer isn't smart enough to figure
things out.




> [disclaimer: I work for MapR ]
>
>
>
>>  HortonWorks: Distro uses HDFS 1.x with MRv2. All open source. Cluster
>> management is via Ambari. Compared to Cloudera's CM, Ambari has very
>> rudimentary features. But you have to keep in mind that Ambari is only a
>> year old whereas CM has already been under development for several years.
>> This was a major selection factor for us because Ambari did not have all
>> the automation/feature-set that CM offers for a single
>> administrator/developer to easily maintain the cluster. Also, during the
>> trial period, Hortonworks' packaging format/structure apparently kept
>> changing, which made things a bit difficult to centrally deploy/administer.
>>
>> Cloudera: Distro uses HDFS 2.x with MRv1. All open source except cluster
>> management, which is via their proprietary Cloudera Manager tool. It is free
>> to use, minus certain features such as auditing and cluster replication.
>> Maybe a few more features are restricted to the
>> Enterprise/Licensed-only version. Offers many more features than Ambari. In
>> terms of cluster administration, I found CM much easier to work with than
>> Ambari. Pretty much all aspects, from deploying new nodes to configuration
>> and troubleshooting, are much more refined than in Ambari.
>>
>> During the selection process, what I found was that both vendors are very
>> aggressive in their pitch. So much so that each pushes some FUD regarding
>> the competition.
>>
>
> Obviously some of it worked, given some of the statements earlier.
>
>
>
>>
>> HW uses HDFS 1.x + MRv2 while CDH uses HDFS 2.x + MRv1. HW claimed that
>> Cloudera's distro is heavily patched off-course from the core Apache trunk
>> that can cause severe data corruption issues. Yes, Cloudera has some 1500+
>> patches over apache's Hadoop distro but (1) they aren't private patches.
>> You can pull the list and verify that yourself just as I did. (2) In our
>> testing and talking to other Cloudera customers, I couldn't find any issues
>> with data corruption. It is true though that HDFS 2.x is still in beta but
>> so is MRv2 that HW uses. I think both are stable and work well - depending
>> on what you need but each uses that point to create FUD.
>>
>> HW also claimed that a new SQL engine that Cloudera's including in their
>> distro - Impala is proprietary. Not true. The software is open source. But
>> if you want support for Impala then Cloudera will charge you separately per
>> node for Impala over and above what they charge per node for Hadoop support.
>>
>> In my experience, both products have plenty of issues when it comes to
>> compute engines - Hive, Pig etc. - and their cluster management software. HDFS
>> seems to be solid in both distros. So I wouldn't call either of them
>> trouble-free and neither is at the maturity level of other popular
>> enterprise products like say, Oracle. That said, you have to keep in mind
>> that both vendors/products are successfully used by several customers so
>> again, it is more a question of what fits your needs.
>>
>> In the end, we chose to go with Cloudera mostly because of a more positive
>> experience with CM in terms of administration/operations and with their
>> pre-sales team when compared to HW. Again, that said, another team that we
>> closely work with chose HW for their cluster. I use both vendors/clusters
>> at work and neither has any significant issues.
>>
>>
>>
>>
>> On Sat, Sep 14, 2013 at 12:37 PM, Chris Mattmann <ma...@apache.org>wrote:
>>
>>> Here's the deal, folks can post questions to the list that aren't
>>> abusive and simply asking what the difference between different vendor
>>> implementations (downstream) of Apache  Hadoop is not an inflammatory
>>> or abusive question.
>>>
>>> Stick to the facts. Discuss it here. Why should the Apache Hadoop
>>> PMC push off potentially useful questions that may have upstream
>>> implications to the Apache  Hadoop core and let all the innovation
>>> occur downstream?
>>>
>>> Have the conversations here if you'd like. I wouldn't turn anyone
>>> away..
>>>
>>> My 2c.
>>>
>>> Cheers,
>>> Chris
>>>
>>> ----Original Message-----
>>>
>>> From: Shahab Yunus <sh...@gmail.com>
>>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Date: Friday, September 13, 2013 10:48 AM
>>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>>
>>> >I think, in my opinion, it is a wrong idea because:
>>> >
>>> >
>>> >1- Many of the participants here are employees of the very companies
>>> >that are under discussion. This puts those employees in a very
>>> >difficult position. It is very hard to come up with a correct response.
>>> >Comments can be misconstrued easily.
>>> >2- Also, when we talk about vendor distributions of the software, it is
>>> >no longer purely about open source. Now companies with the related
>>> >corporate legal baggage also get in the mix.
>>> >3- The discussion would cover not only the positives of each vendor
>>> >but also the negatives. The latter type of discussion can get
>>> >unpleasant very easily.
>>> >
>>> >4- Somebody mentioned that, this is a very lightly moderated platform
>>> and
>>> >thus this discussion should be allowed. I think this is one of the
>>> >reasons that it should not be because, people can say things casually,
>>> >without much thought, or without taking
>>> > care of the context or the possible interpretations and get in trouble.
>>> >5- The risk here is not only that serious repercussions can occur (which
>>> >very well can) but the greater risk is that it can cause
>>> misunderstanding
>>> >between individuals, industries and companies.
>>> >6-People here lot of time reply quickly just to resolve or help the
>>> >'technical' issue. Now they will have to take care how they frame the
>>> >response. Re: 4
>>> >
>>> >
>>> >I know some will feel that I have created a highly exaggerated scenario
>>> >above, but what I am trying to say is that, it is a slippery slope. If
>>> we
>>> >allow this then this can go anywhere.
>>> >
>>> >
>>> >By the way, I do not work for any of these vendors.
>>> >
>>> >
>>> >More importantly, I am not saying that this discussion should not be
>>> had,
>>> >I am just saying that this is a wrong forum.
>>> >
>>> >
>>> >Just my 2 cents (or,...this was rather a dollar.)
>>> >
>>> >
>>> >Regards,
>>> >Shahab
>>> >
>>> >
>>> >
>>> >
>>> >On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
>>> ><ma...@apache.org> wrote:
>>> >
>>> >Errr, what's wrong with discussing these types of issues on list?
>>> >
>>> >Nothing public here, and as long as it's kept to facts, this should
>>> >not be a problem and Apache is a fine place to have such discussions.
>>> >
>>> >My 2c.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >-----Original Message-----
>>> >From: Xuri Nagarin <se...@gmail.com>
>>> >Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> >Date: Thursday, September 12, 2013 4:39 PM
>>> >To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> >Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>> >
>>> >>I understand it can be contentious issue especially given that a lot of
>>> >>contributors to this list work for one or the other vendor or have some
>>> >>stake in any kind of evaluation. But, I see no reason why users should
>>> >>not be able to compare notes
>>> >> and share experiences. Over time, genuine pain points or issues or
>>> >>claims will bubble up and should only help the community. Sure, there
>>> >>will be a few flame wars but this already isn't a very tightly
>>> moderated
>>> >>list.
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>>> >><ae...@maprtech.com> wrote:
>>> >>
>>> >>Raj,
>>> >>
>>> >>
>>> >>As others noted, this is not a great place for this discussion.  I'd
>>> >>suggest contacting the vendors you are interested in as I'm sure we'd
>>> all
>>> >>be happy to provide you more details.
>>> >>
>>> >>
>>> >>I don't know about the others, but for MapR, just send an email to
>>> >>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
>>> back
>>> >>to you with more information.
>>> >>
>>> >>
>>> >>Best Regards,
>>> >>Aaron Eng
>>> >>
>>> >>
>>> >>
>>> >>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
>>> wrote:
>>> >>
>>> >>
>>> >>Hi,
>>> >>
>>> >>We are trying to evaluate different implementations of Hadoop for our
>>> big
>>> >>data enterprise project.
>>> >>
>>> >>Can the forum members advise on what are the advantages and
>>> disadvantages
>>> >>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>>> >>
>>> >>Thanks in advance.
>>> >>
>>> >>Regards,
>>> >>Raj
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>
>

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Xuri Nagarin <se...@gmail.com>.
I simply stated the process my team went through but in hindsight, given
that I understand the Hadoop ecosystem better, I think yes, given that MapR
uses HDFS, we could simply use distcp to move data out. To be honest,
neither HW nor Cloudera made any claims to us about MapR. I think I found
some articles on the web that compared the three distros where it said MapR
is mostly proprietary.

So if I were doing the evaluation again, I would probably include MapR
again. Another factor is that most teams I know that use Hadoop are either
using CDH, HW or the Apache distro so there is a bit of inertia is
evaluating something that you aren't sure is being used around much by your
peers.

On Mon, Sep 16, 2013 at 8:37 PM, M. C. Srivas <mc...@gmail.com> wrote:

>
> So here's an example of marketing FUD at work.
>
> On Mon, Sep 16, 2013 at 3:10 PM, Xuri Nagarin <se...@gmail.com> wrote:
>
>> So I will try to answer the OP's question best I can without deviating
>> too much into opinions and stick to facts. Disclaimer: I am not an employee
>> of either vendor or any partner of theirs.
>>
>> Context is important: My team's use case was general data exploration of
>> semi-structured log data and we had no typical data-warehouse type of
>> existing use cases. Also, our's is a small (less than 30 nodes cluster). In
>> terms of ops/maintenance, we only have one person. I point this out because
>> lots of hadoop shops have dedicated team for each - OS administration,
>> Hadoop admin, Hadoop developers. And, they are very mature in terms of
>> their compute use cases. To my mind, these aspects can significantly impact
>> your vendor choices.
>>
>> MapR: My team simply did not consider them because of all the proprietary
>> code in there. We are trying to move from a monolithic proprietary product
>> and one of the criteria we set was - if we decided to move away from the
>> chosen hadoop vendor, can we easily unlock our data?
>>
>
> Unlock your data? How about disctp? Or just "cp"?
>
> The fact is there are 10x  more standard ways to access your data in a
> MapR cluster versus a Cloudera or Hortonworks data.
>


Yes, Cloudera has proprietary CM but I really don't think HW has any
proprietary code. Can you point at any? In fact, even for Cloudera, other
than CM, what's proprietary? I ask not in a rhetoric way but to clear up
facts. I did not find any but if you know of any proprietary code please
let everyone know.


>
> MapR is entirely open source, with proprietary add-ons, just like Cloudera
> or Hortonworks.
>
> The difference is MapR has innovated both above and below the Hadoop
> stack, while Cloudera and Horton have only done so above the stack. MapR's
> innovations have set the bar so high that its competition likes to spread
> FUD.
>
>
As a user/customer, the best suggestion I can make, since you are a MapR
employee is to not focus on other distros and try to point out facts about
your product. The biggest turn off in the purchase cycle for me was the way
both Cloudera and HW attack each other and lot of times with FUD. If they
simply presented facts, I think I am intelligent enough to tell the
differences. Trying to convince a potential customer by attacking the
competition sort of assumes that the customer isn't smart enough to figure
things out.




> [disclaimer: I work for MapR ]
>
>
>
>>  Hortonworks: Distro uses HDFS 1.x with MRv2. All open source. Cluster
>> management is via Ambari. Compared to Cloudera's CM, Ambari has very
>> rudimentary features. But you have to keep in mind that Ambari is only a
>> year old whereas CM has already been under development for several years.
>> This was a major selection factor for us because Ambari did not have all
>> the automation/feature-set that CM offers for a single
>> administrator/developer to easily maintain the cluster. Also, during the
>> trial period, Hortonworks' packaging format/structure apparently kept
>> changing, which made things a bit difficult to centrally deploy/administer.
>>
>> Cloudera: Distro uses HDFS 2.x with MRv1. All open source except cluster
>> management, which is via their proprietary Cloudera Manager tool. It is free
>> to use without certain features like auditing and cluster replication.
>> Maybe a few more features are restricted to the
>> Enterprise/Licensed-only version. Offers many more features than Ambari. In
>> terms of cluster administration, I found CM much easier to work with than
>> Ambari. Pretty much all aspects, from deploying new nodes to configuration
>> and troubleshooting, are much more refined than in Ambari.
>>
>> During the selection process, what I found was that both vendors are very
>> aggressive in their pitch. So much so that each pushes some FUD regarding
>> the competition.
>>
>
> Obviously some of it worked, given some of the statements earlier.
>
>
>
>>
>> HW uses HDFS 1.x + MRv2 while CDH uses HDFS 2.x + MRv1. HW claimed that
>> Cloudera's distro is so heavily patched away from the core Apache trunk
>> that it can cause severe data corruption issues. Yes, Cloudera has some 1500+
>> patches over Apache's Hadoop distro but (1) they aren't private patches.
>> You can pull the list and verify that yourself just as I did. (2) In our
>> testing and talking to other Cloudera customers, I couldn't find any issues
>> with data corruption. It is true though that HDFS 2.x is still in beta but
>> so is MRv2 that HW uses. I think both are stable and work well, depending
>> on what you need, but each uses that point to create FUD.
>>
>> HW also claimed that Impala, a new SQL engine that Cloudera is including in
>> their distro, is proprietary. Not true. The software is open source. But
>> if you want support for Impala then Cloudera will charge you separately per
>> node for Impala over and above what they charge per node for Hadoop support.
>>
>> In my experience, both products have plenty of issues when it comes to
>> compute engines (Hive, Pig etc.) and their cluster management software. HDFS
>> seems to be solid in both distros. So I wouldn't call either of them
>> trouble-free and neither is at the maturity level of other popular
>> enterprise products like, say, Oracle. That said, you have to keep in mind
>> that both vendors/products are successfully used by several customers so
>> again, it is more a question of what fits your needs.
>>
>> In the end, we chose to go with Cloudera mostly because of a more positive
>> experience with CM in terms of administration/operations and with their
>> pre-sales team when compared to HW. Again, that said, another team that we
>> closely work with chose HW for their cluster. I use both vendors/clusters
>> at work and neither has any significant issues.
>>
>>
>>
>>
>> On Sat, Sep 14, 2013 at 12:37 PM, Chris Mattmann <ma...@apache.org>wrote:
>>
>>> Here's the deal: folks can post questions to the list that aren't
>>> abusive, and simply asking what the difference is between different vendor
>>> implementations (downstream) of Apache Hadoop is not an inflammatory
>>> or abusive question.
>>>
>>> Stick to the facts. Discuss it here. Why should the Apache Hadoop
>>> PMC push off potentially useful questions that may have upstream
>>> implications to the Apache  Hadoop core and let all the innovation
>>> occur downstream?
>>>
>>> Have the conversations here if you'd like. I wouldn't turn anyone
>>> away..
>>>
>>> My 2c.
>>>
>>> Cheers,
>>> Chris
>>>
>>> ----Original Message-----
>>>
>>> From: Shahab Yunus <sh...@gmail.com>
>>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Date: Friday, September 13, 2013 10:48 AM
>>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>>
>>> >In my opinion, it is a wrong idea because:
>>> >
>>> >
>>> >1- Many of the participants here are employees of the very companies
>>> >that are under discussion. This puts these employees in a very
>>> >difficult position. It is very hard to come up with a correct response.
>>> >Comments can be misconstrued easily.
>>> >2- Also, when we talk about vendor distributions of the software, it is
>>> >no longer purely about open source. Now companies, with the related
>>> >corporate legal baggage, also get in the mix.
>>> >3- The discussion would cover not only positive things about each vendor
>>> >but also negatives. The latter type of discussion can get
>>> >unpleasant very easily.
>>> >
>>> >4- Somebody mentioned that this is a very lightly moderated platform and
>>> >thus this discussion should be allowed. I think this is one of the
>>> >reasons that it should not be, because people can say things casually,
>>> >without much thought, or without taking care of the context or the
>>> >possible interpretations, and get in trouble.
>>> >5- The risk here is not only that serious repercussions can occur (which
>>> >they very well can) but the greater risk is that it can cause
>>> >misunderstanding between individuals, industries and companies.
>>> >6- People here often reply quickly just to resolve or help with the
>>> >'technical' issue. Now they will have to take care how they frame the
>>> >response. Re: 4
>>> >
>>> >
>>> >I know some will feel that I have created a highly exaggerated scenario
>>> >above, but what I am trying to say is that it is a slippery slope. If we
>>> >allow this then this can go anywhere.
>>> >
>>> >
>>> >By the way, I do not work for any of these vendors.
>>> >
>>> >
>>> >More importantly, I am not saying that this discussion should not be had,
>>> >I am just saying that this is the wrong forum.
>>> >
>>> >
>>> >Just my 2 cents (or... this was rather a dollar.)
>>> >
>>> >
>>> >Regards,
>>> >Shahab
>>> >
>>> >
>>> >
>>> >
>>> >On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
>>> ><ma...@apache.org> wrote:
>>> >
>>> >Errr, what's wrong with discussing these types of issues on list?
>>> >
>>> >Nothing public here, and as long as it's kept to facts, this should
>>> >not be a problem and Apache is a fine place to have such discussions.
>>> >
>>> >My 2c.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >-----Original Message-----
>>> >From: Xuri Nagarin <se...@gmail.com>
>>> >Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> >Date: Thursday, September 12, 2013 4:39 PM
>>> >To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> >Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>> >
>>> >>I understand it can be a contentious issue, especially given that a lot of
>>> >>contributors to this list work for one or the other vendor or have some
>>> >>stake in any kind of evaluation. But I see no reason why users should
>>> >>not be able to compare notes and share experiences. Over time, genuine
>>> >>pain points or issues or claims will bubble up and should only help the
>>> >>community. Sure, there will be a few flame wars but this already isn't a
>>> >>very tightly moderated list.
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>>> >><ae...@maprtech.com> wrote:
>>> >>
>>> >>Raj,
>>> >>
>>> >>
>>> >>As others noted, this is not a great place for this discussion.  I'd
>>> >>suggest contacting the vendors you are interested in as I'm sure we'd all
>>> >>be happy to provide you more details.
>>> >>
>>> >>
>>> >>I don't know about the others, but for MapR, just send an email to
>>> >>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get back
>>> >>to you with more information.
>>> >>
>>> >>
>>> >>Best Regards,
>>> >>Aaron Eng
>>> >>
>>> >>
>>> >>
>>> >>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:
>>> >>
>>> >>
>>> >>Hi,
>>> >>
>>> >>We are trying to evaluate different implementations of Hadoop for our big
>>> >>data enterprise project.
>>> >>
>>> >>Can the forum members advise on what are the advantages and disadvantages
>>> >>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>>> >>
>>> >>Thanks in advance.
>>> >>
>>> >>Regards,
>>> >>Raj

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Xuri Nagarin <se...@gmail.com>.
I simply stated the process my team went through, but in hindsight, now
that I understand the Hadoop ecosystem better, I think yes, given that MapR
uses HDFS, we could simply use distcp to move data out. To be honest,
neither HW nor Cloudera made any claims to us about MapR. I think I found
some articles on the web that compared the three distros where it said MapR
is mostly proprietary.
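
For readers wondering what the distcp route mentioned above might look like,
here is a minimal sketch in Python that only assembles the command line. It
assumes both clusters are reachable through HDFS-compatible URIs; the host
names, port and paths are made up for illustration, and build_distcp_command
is a hypothetical helper, not part of Hadoop. The only Hadoop pieces used are
the "hadoop distcp" command and its "-update" option.

```python
import shlex
from typing import List


def build_distcp_command(src: str, dst: str, update: bool = True) -> List[str]:
    """Build the argument list for a `hadoop distcp` copy between two clusters.

    Hypothetical helper for illustration; it does not run anything itself.
    """
    cmd = ["hadoop", "distcp"]
    if update:
        # -update skips files that already exist unchanged at the destination,
        # which makes it practical to re-run an interrupted copy.
        cmd.append("-update")
    cmd += [src, dst]
    return cmd


if __name__ == "__main__":
    # Assumed NameNode hosts and paths, not taken from the thread.
    cmd = build_distcp_command(
        "hdfs://old-cluster-nn:8020/data/logs",
        "hdfs://new-cluster-nn:8020/data/logs",
    )
    # Print the command; to actually run it on a node with the Hadoop client
    # installed, pass the list to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the arguments as a list rather than a single string avoids shell
quoting problems if a path contains spaces.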

So if I were doing the evaluation again, I would probably include MapR.
Another factor is that most teams I know that use Hadoop are either
using CDH, HW or the Apache distro, so there is a bit of inertia in
evaluating something that you aren't sure is being used much by your
peers.

On Mon, Sep 16, 2013 at 8:37 PM, M. C. Srivas <mc...@gmail.com> wrote:

>
> So here's an example of marketing FUD at work.
>
> On Mon, Sep 16, 2013 at 3:10 PM, Xuri Nagarin <se...@gmail.com> wrote:
>
>> So I will try to answer the OP's question as best I can without deviating
>> too much into opinions and stick to facts. Disclaimer: I am not an employee
>> of either vendor or any partner of theirs.
>>
>> Context is important: My team's use case was general data exploration of
>> semi-structured log data and we had no typical data-warehouse type of
>> existing use cases. Also, ours is a small cluster (fewer than 30 nodes). In
>> terms of ops/maintenance, we only have one person. I point this out because
>> lots of Hadoop shops have a dedicated team for each: OS administration,
>> Hadoop admin, Hadoop developers. And they are very mature in terms of
>> their compute use cases. To my mind, these aspects can significantly impact
>> your vendor choices.
>>
>> MapR: My team simply did not consider them because of all the proprietary
>> code in there. We are trying to move from a monolithic proprietary product
>> and one of the criteria we set was - if we decided to move away from the
>> chosen hadoop vendor, can we easily unlock our data?
>>
>
> Unlock your data? How about distcp? Or just "cp"?
>
> The fact is there are 10x more standard ways to access your data in a
> MapR cluster versus a Cloudera or Hortonworks cluster.
>


Yes, Cloudera has proprietary CM but I really don't think HW has any
proprietary code. Can you point at any? In fact, even for Cloudera, other
than CM, what's proprietary? I ask not in a rhetorical way but to clear up
facts. I did not find any, but if you know of any proprietary code please
let everyone know.


>
> MapR is entirely open source, with proprietary add-ons, just like Cloudera
> or Hortonworks.
>
> The difference is MapR has innovated both above and below the Hadoop
> stack, while Cloudera and Horton have only done so above the stack. MapR's
> innovations have set the bar so high that its competition likes to spread
> FUD.
>
>
As a user/customer, the best suggestion I can make, since you are a MapR
employee, is to not focus on other distros and instead point out facts about
your product. The biggest turn-off in the purchase cycle for me was the way
both Cloudera and HW attack each other, a lot of the time with FUD. If they
simply presented facts, I think I am intelligent enough to tell the
differences. Trying to convince a potential customer by attacking the
competition sort of assumes that the customer isn't smart enough to figure
things out.




> [disclaimer: I work for MapR ]
>
>
>
>>  Hortonworks: Distro uses HDFS 1.x with MRv2. All open source. Cluster
>> management is via Ambari. Compared to Cloudera's CM, Ambari has very
>> rudimentary features. But you have to keep in mind that Ambari is only a
>> year old whereas CM has already been under development for several years.
>> This was a major selection factor for us because Ambari did not have all
>> the automation/feature-set that CM offers for a single
>> administrator/developer to easily maintain the cluster. Also, during the
>> trial period, Hortonworks' packaging format/structure apparently kept
>> changing, which made things a bit difficult to centrally deploy/administer.
>>
>> Cloudera: Distro uses HDFS 2.x with MRv1. All open source except cluster
>> management, which is via their proprietary Cloudera Manager tool. It is free
>> to use without certain features like auditing and cluster replication.
>> Maybe a few more features are restricted to the
>> Enterprise/Licensed-only version. Offers many more features than Ambari. In
>> terms of cluster administration, I found CM much easier to work with than
>> Ambari. Pretty much all aspects, from deploying new nodes to configuration
>> and troubleshooting, are much more refined than in Ambari.
>>
>> During the selection process, what I found was that both vendors are very
>> aggressive in their pitch. So much so that each pushes some FUD regarding
>> the competition.
>>
>
> Obviously some of it worked, given some of the statements earlier.
>
>
>
>>
>> HW uses HDFS 1.x + MRv2 while CDH uses HDFS 2.x + MRv1. HW claimed that
>> Cloudera's distro is so heavily patched away from the core Apache trunk
>> that it can cause severe data corruption issues. Yes, Cloudera has some 1500+
>> patches over Apache's Hadoop distro but (1) they aren't private patches.
>> You can pull the list and verify that yourself just as I did. (2) In our
>> testing and talking to other Cloudera customers, I couldn't find any issues
>> with data corruption. It is true though that HDFS 2.x is still in beta but
>> so is MRv2 that HW uses. I think both are stable and work well, depending
>> on what you need, but each uses that point to create FUD.
>>
>> HW also claimed that Impala, a new SQL engine that Cloudera is including in
>> their distro, is proprietary. Not true. The software is open source. But
>> if you want support for Impala then Cloudera will charge you separately per
>> node for Impala over and above what they charge per node for Hadoop support.
>>
>> In my experience, both products have plenty of issues when it comes to
>> compute engines (Hive, Pig etc.) and their cluster management software. HDFS
>> seems to be solid in both distros. So I wouldn't call either of them
>> trouble-free and neither is at the maturity level of other popular
>> enterprise products like, say, Oracle. That said, you have to keep in mind
>> that both vendors/products are successfully used by several customers so
>> again, it is more a question of what fits your needs.
>>
>> In the end, we chose to go with Cloudera mostly because of a more positive
>> experience with CM in terms of administration/operations and with their
>> pre-sales team when compared to HW. Again, that said, another team that we
>> closely work with chose HW for their cluster. I use both vendors/clusters
>> at work and neither has any significant issues.
>>
>>
>>
>>
>> On Sat, Sep 14, 2013 at 12:37 PM, Chris Mattmann <ma...@apache.org>wrote:
>>
>>> Here's the deal: folks can post questions to the list that aren't
>>> abusive, and simply asking what the difference is between different vendor
>>> implementations (downstream) of Apache Hadoop is not an inflammatory
>>> or abusive question.
>>>
>>> Stick to the facts. Discuss it here. Why should the Apache Hadoop
>>> PMC push off potentially useful questions that may have upstream
>>> implications to the Apache  Hadoop core and let all the innovation
>>> occur downstream?
>>>
>>> Have the conversations here if you'd like. I wouldn't turn anyone
>>> away..
>>>
>>> My 2c.
>>>
>>> Cheers,
>>> Chris
>>>
>>> ----Original Message-----
>>>
>>> From: Shahab Yunus <sh...@gmail.com>
>>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Date: Friday, September 13, 2013 10:48 AM
>>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>>
>>> >In my opinion, it is a wrong idea because:
>>> >
>>> >
>>> >1- Many of the participants here are employees of the very companies
>>> >that are under discussion. This puts these employees in a very
>>> >difficult position. It is very hard to come up with a correct response.
>>> >Comments can be misconstrued easily.
>>> >2- Also, when we talk about vendor distributions of the software, it is
>>> >no longer purely about open source. Now companies, with the related
>>> >corporate legal baggage, also get in the mix.
>>> >3- The discussion would cover not only positive things about each vendor
>>> >but also negatives. The latter type of discussion can get
>>> >unpleasant very easily.
>>> >
>>> >4- Somebody mentioned that this is a very lightly moderated platform and
>>> >thus this discussion should be allowed. I think this is one of the
>>> >reasons that it should not be, because people can say things casually,
>>> >without much thought, or without taking care of the context or the
>>> >possible interpretations, and get in trouble.
>>> >5- The risk here is not only that serious repercussions can occur (which
>>> >they very well can) but the greater risk is that it can cause
>>> >misunderstanding between individuals, industries and companies.
>>> >6- People here often reply quickly just to resolve or help with the
>>> >'technical' issue. Now they will have to take care how they frame the
>>> >response. Re: 4
>>> >
>>> >
>>> >I know some will feel that I have created a highly exaggerated scenario
>>> >above, but what I am trying to say is that it is a slippery slope. If we
>>> >allow this then this can go anywhere.
>>> >
>>> >
>>> >By the way, I do not work for any of these vendors.
>>> >
>>> >
>>> >More importantly, I am not saying that this discussion should not be had,
>>> >I am just saying that this is the wrong forum.
>>> >
>>> >
>>> >Just my 2 cents (or... this was rather a dollar.)
>>> >
>>> >
>>> >Regards,
>>> >Shahab
>>> >
>>> >
>>> >
>>> >
>>> >On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
>>> ><ma...@apache.org> wrote:
>>> >
>>> >Errr, what's wrong with discussing these types of issues on list?
>>> >
>>> >Nothing public here, and as long as it's kept to facts, this should
>>> >not be a problem and Apache is a fine place to have such discussions.
>>> >
>>> >My 2c.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >-----Original Message-----
>>> >From: Xuri Nagarin <se...@gmail.com>
>>> >Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> >Date: Thursday, September 12, 2013 4:39 PM
>>> >To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> >Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>> >
>>> >>I understand it can be a contentious issue, especially given that a lot of
>>> >>contributors to this list work for one or the other vendor or have some
>>> >>stake in any kind of evaluation. But I see no reason why users should
>>> >>not be able to compare notes and share experiences. Over time, genuine
>>> >>pain points or issues or claims will bubble up and should only help the
>>> >>community. Sure, there will be a few flame wars but this already isn't a
>>> >>very tightly moderated list.
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>>> >><ae...@maprtech.com> wrote:
>>> >>
>>> >>Raj,
>>> >>
>>> >>
>>> >>As others noted, this is not a great place for this discussion.  I'd
>>> >>suggest contacting the vendors you are interested in as I'm sure we'd all
>>> >>be happy to provide you more details.
>>> >>
>>> >>
>>> >>I don't know about the others, but for MapR, just send an email to
>>> >>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get back
>>> >>to you with more information.
>>> >>
>>> >>
>>> >>Best Regards,
>>> >>Aaron Eng
>>> >>
>>> >>
>>> >>
>>> >>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:
>>> >>
>>> >>
>>> >>Hi,
>>> >>
>>> >>We are trying to evaluate different implementations of Hadoop for our big
>>> >>data enterprise project.
>>> >>
>>> >>Can the forum members advise on what are the advantages and disadvantages
>>> >>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>>> >>
>>> >>Thanks in advance.
>>> >>
>>> >>Regards,
>>> >>Raj

Re: Cloudera Vs Hortonworks Vs MapR

Posted by "M. C. Srivas" <mc...@gmail.com>.
So here's an example of marketing FUD at work.

On Mon, Sep 16, 2013 at 3:10 PM, Xuri Nagarin <se...@gmail.com> wrote:

> So I will try to answer the OP's question best I can without deviating too
> much into opinions and stick to facts. Disclaimer: I am not an employee of
> either vendor or any partner of theirs.
>
> Context is important: My team's use case was general data exploration of
> semi-structured log data and we had no typical data-warehouse type of
> existing use cases. Also, ours is a small cluster (fewer than 30 nodes). In
> terms of ops/maintenance, we only have one person. I point this out because
> lots of Hadoop shops have a dedicated team for each - OS administration,
> Hadoop admin, Hadoop developers. And, they are very mature in terms of
> their compute use cases. To my mind, these aspects can significantly impact
> your vendor choices.
>
> MapR: My team simply did not consider them because of all the proprietary
> code in there. We are trying to move from a monolithic proprietary product
> and one of the criteria we set was - if we decided to move away from the
> chosen hadoop vendor, can we easily unlock our data?
>

Unlock your data? How about distcp? Or just "cp"?

The fact is there are 10x more standard ways to access your data in a MapR
cluster versus a Cloudera or Hortonworks cluster.

MapR is entirely open source, with proprietary add-ons, just like Cloudera
or Hortonworks.

The difference is MapR has innovated both above and below the Hadoop stack,
while Cloudera and Hortonworks have only done so above the stack. MapR's
innovations have set the bar so high that its competition likes to spread
FUD.

[disclaimer: I work for MapR ]



> HortonWorks: Distro uses HDFS 1.x with MRv2. All open source. Cluster
> management is via Ambari. Compared to Cloudera's CM, Ambari has very
> rudimentary features. But you have to keep in mind that Ambari is only a
> year old, whereas CM has already been under development for several years.
> This was a major selection factor for us because Ambari did not have all
> the automation/feature-set compared to CM for a single
> administrator/developer to easily maintain the cluster. Also, during the
> trial period, Hortonworks' packaging format/structure apparently kept
> changing, which made things a bit difficult to centrally deploy/administer.
>
> Cloudera: Distro uses HDFS 2.x with MRv1. All open source except cluster
> management which is via their proprietary Cloudera Manager tool. It is free
> for use without certain features like auditing and cluster replication.
> Maybe a few more features are restricted to the
> Enterprise/Licensed-only version. Offers many more features than Ambari. In
> terms of cluster administration, I found CM much easier to work with than
> Ambari. Pretty much all aspects, from deploying new nodes to configuration
> and troubleshooting, are much more refined than in Ambari.
>
> During the selection process, what I found was that both vendors are very
> aggressive in their pitch. So much so that each pushes some FUD regarding
> the competition.
>

Obviously some of it worked, given some of the statements earlier.



>
> HW uses HDFS 1.x + MRv2 while CDH uses HDFS 2.x + MRv1. HW claimed that
> Cloudera's distro is heavily patched away from the core Apache trunk, which
> can cause severe data corruption issues. Yes, Cloudera has some 1500+
> patches over Apache's Hadoop distro but (1) they aren't private patches.
> You can pull the list and verify that yourself just as I did. (2) In our
> testing and talking to other Cloudera customers, I couldn't find any issues
> with data corruption. It is true though that HDFS 2.x is still in beta but
> so is MRv2 that HW uses. I think both are stable and work well - depending
> on what you need but each uses that point to create FUD.
>
> HW also claimed that a new SQL engine that Cloudera is including in their
> distro - Impala - is proprietary. Not true. The software is open source. But
> if you want support for Impala then Cloudera will charge you separately per
> node for Impala over and above what they charge per node for Hadoop support.
>
> In my experience, both products have plenty of issues when it comes to
> compute engines - Hive, Pig, etc. - and their cluster management software. HDFS
> seems to be solid in both distros. So I wouldn't call either of them
> trouble-free and neither is at the maturity level of other popular
> enterprise products like, say, Oracle. That said, you have to keep in mind
> that both vendors/products are successfully used by several customers so
> again, it is more a question of what fits your needs.
>
> In the end, we chose to go with Cloudera mostly because of a more positive
> experience with CM in terms of administration/operations and their
> pre-sales team when compared to HW. Again, that said, another team that we
> closely work with chose HW for their cluster. I use both vendors/clusters
> at work and neither has any significant issues.
>
>
>
>
> On Sat, Sep 14, 2013 at 12:37 PM, Chris Mattmann <ma...@apache.org> wrote:
>
>> Here's the deal: folks can post questions to the list that aren't
>> abusive, and simply asking what the differences are between vendor
>> implementations (downstream) of Apache Hadoop is not an inflammatory
>> or abusive question.
>>
>> Stick to the facts. Discuss it here. Why should the Apache Hadoop
>> PMC push off potentially useful questions that may have upstream
>> implications to the Apache  Hadoop core and let all the innovation
>> occur downstream?
>>
>> Have the conversations here if you'd like. I wouldn't turn anyone
>> away..
>>
>> My 2c.
>>
>> Cheers,
>> Chris
>>
>> ----Original Message-----
>>
>> From: Shahab Yunus <sh...@gmail.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Date: Friday, September 13, 2013 10:48 AM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>
>> >I think, in my opinion, it is a wrong idea because:
>> >
>> >
>> >1- Many of the participants here are employees of the very companies
>> >that are under discussion. This puts these employees in a very
>> >difficult position. It is very hard to come up with a correct response.
>> >Comments can be misconstrued easily.
>> >2- Also, when we talk about vendor distributions of the software, it is
>> >no longer purely about open source. Now companies with the related
>> >corporate legal baggage also get in the mix.
>> >3- The discussion would cover not only positive things about each vendor
>> >but also negatives. The latter type of discussion can get
>> >unpleasant very easily.
>> >
>> >4- Somebody mentioned that this is a very lightly moderated platform and
>> >thus this discussion should be allowed. I think this is one of the
>> >reasons that it should not be, because people can say things casually,
>> >without much thought, or without taking
>> >care of the context or the possible interpretations and get in trouble.
>> >5- The risk here is not only that serious repercussions can occur (which
>> >they very well can) but the greater risk is that it can cause misunderstanding
>> >between individuals, industries and companies.
>> >6- People here often reply quickly just to resolve or help with the
>> >'technical' issue. Now they will have to take care how they frame the
>> >response. Re: 4
>> >
>> >
>> >I know some will feel that I have created a highly exaggerated scenario
>> >above, but what I am trying to say is that, it is a slippery slope. If we
>> >allow this then this can go anywhere.
>> >
>> >
>> >By the way, I do not work for any of these vendors.
>> >
>> >
>> >More importantly, I am not saying that this discussion should not be had,
>> >I am just saying that this is a wrong forum.
>> >
>> >
>> >Just my 2 cents (or,...this was rather a dollar.)
>> >
>> >
>> >Regards,
>> >Shahab
>> >
>> >
>> >
>> >
>> >On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
>> ><ma...@apache.org> wrote:
>> >
>> >Errr, what's wrong with discussing these types of issues on list?
>> >
>> >Nothing public here, and as long as it's kept to facts, this should
>> >not be a problem and Apache is a fine place to have such discussions.
>> >
>> >My 2c.
>> >
>> >
>> >
>> >
>> >
>> >-----Original Message-----
>> >From: Xuri Nagarin <se...@gmail.com>
>> >Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> >Date: Thursday, September 12, 2013 4:39 PM
>> >To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> >Subject: Re: Cloudera Vs Hortonworks Vs MapR
>> >
>> >>I understand it can be contentious issue especially given that a lot of
>> >>contributors to this list work for one or the other vendor or have some
>> >>stake in any kind of evaluation. But, I see no reason why users should
>> >>not be able to compare notes
>> >> and share experiences. Over time, genuine pain points or issues or
>> >>claims will bubble up and should only help the community. Sure, there
>> >>will be a few flame wars but this already isn't a very tightly moderated
>> >>list.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>> >><ae...@maprtech.com> wrote:
>> >>
>> >>Raj,
>> >>
>> >>
>> >>As others noted, this is not a great place for this discussion.  I'd
>> >>suggest contacting the vendors you are interested in as I'm sure we'd all
>> >>be happy to provide you more details.
>> >>
>> >>
>> >>I don't know about the others, but for MapR, just send an email to
>> >>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get back
>> >>to you with more information.
>> >>
>> >>
>> >>Best Regards,
>> >>Aaron Eng
>> >>
>> >>
>> >>
>> >>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:
>> >>
>> >>
>> >>Hi,
>> >>
>> >>We are trying to evaluate different implementations of Hadoop for our big
>> >>data enterprise project.
>> >>
>> >>Can the forum members advise on what are the advantages and disadvantages
>> >>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>> >>
>> >>Thanks in advance.
>> >>
>> >>Regards,
>> >>Raj

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Chris Embree <ce...@gmail.com>.
Our evaluation was similar except we did not consider the "management"
tools any vendor provided, as that's just as much lock-in as any proprietary
tool.  What if I want to trade vendors?  I have to re-tool to use their mgmt?
Nope, wrote our own.

Being in a large enterprise, we went with the "perceived" more stable
platform.  Draw your own conclusions.


On Mon, Sep 16, 2013 at 6:10 PM, Xuri Nagarin <se...@gmail.com> wrote:

> So I will try to answer the OP's question best I can without deviating too
> much into opinions and stick to facts. Disclaimer: I am not an employee of
> either vendor or any partner of theirs.
>
> Context is important: My team's use case was general data exploration of
> semi-structured log data and we had no typical data-warehouse type of
> existing use cases. Also, ours is a small cluster (fewer than 30 nodes). In
> terms of ops/maintenance, we only have one person. I point this out because
> lots of Hadoop shops have a dedicated team for each - OS administration,
> Hadoop admin, Hadoop developers. And, they are very mature in terms of
> their compute use cases. To my mind, these aspects can significantly impact
> your vendor choices.
>
> MapR: My team simply did not consider them because of all the proprietary
> code in there. We are trying to move from a monolithic proprietary product
> and one of the criteria we set was - if we decided to move away from the
> chosen hadoop vendor, can we easily unlock our data?
>
> HortonWorks: Distro uses HDFS 1.x with MRv2. All open source. Cluster
> management is via Ambari. Compared to Cloudera's CM, Ambari has very
> rudimentary features. But you have to keep in mind that Ambari is only a
> year old, whereas CM has already been under development for several years.
> This was a major selection factor for us because Ambari did not have all
> the automation/feature-set compared to CM for a single
> administrator/developer to easily maintain the cluster. Also, during the
> trial period, Hortonworks' packaging format/structure apparently kept
> changing, which made things a bit difficult to centrally deploy/administer.
>
> Cloudera: Distro uses HDFS 2.x with MRv1. All open source except cluster
> management which is via their proprietary Cloudera Manager tool. It is free
> for use without certain features like auditing and cluster replication.
> Maybe a few more features are restricted to the
> Enterprise/Licensed-only version. Offers many more features than Ambari. In
> terms of cluster administration, I found CM much easier to work with than
> Ambari. Pretty much all aspects, from deploying new nodes to configuration
> and troubleshooting, are much more refined than in Ambari.
>
> During the selection process, what I found was that both vendors are very
> aggressive in their pitch. So much so that each pushes some FUD regarding
> the competition.
>
> HW uses HDFS 1.x + MRv2 while CDH uses HDFS 2.x + MRv1. HW claimed that
> Cloudera's distro is heavily patched away from the core Apache trunk, which
> can cause severe data corruption issues. Yes, Cloudera has some 1500+
> patches over Apache's Hadoop distro but (1) they aren't private patches.
> You can pull the list and verify that yourself just as I did. (2) In our
> testing and talking to other Cloudera customers, I couldn't find any issues
> with data corruption. It is true though that HDFS 2.x is still in beta but
> so is MRv2 that HW uses. I think both are stable and work well - depending
> on what you need but each uses that point to create FUD.
>
> HW also claimed that a new SQL engine that Cloudera is including in their
> distro - Impala - is proprietary. Not true. The software is open source. But
> if you want support for Impala then Cloudera will charge you separately per
> node for Impala over and above what they charge per node for Hadoop support.
>
> In my experience, both products have plenty of issues when it comes to
> compute engines - Hive, Pig, etc. - and their cluster management software. HDFS
> seems to be solid in both distros. So I wouldn't call either of them
> trouble-free and neither is at the maturity level of other popular
> enterprise products like, say, Oracle. That said, you have to keep in mind
> that both vendors/products are successfully used by several customers so
> again, it is more a question of what fits your needs.
>
> In the end, we chose to go with Cloudera mostly because of a more positive
> experience with CM in terms of administration/operations and their
> pre-sales team when compared to HW. Again, that said, another team that we
> closely work with chose HW for their cluster. I use both vendors/clusters
> at work and neither has any significant issues.
>
>
>
>
> On Sat, Sep 14, 2013 at 12:37 PM, Chris Mattmann <ma...@apache.org> wrote:
>
>> Here's the deal: folks can post questions to the list that aren't
>> abusive, and simply asking what the differences are between vendor
>> implementations (downstream) of Apache Hadoop is not an inflammatory
>> or abusive question.
>>
>> Stick to the facts. Discuss it here. Why should the Apache Hadoop
>> PMC push off potentially useful questions that may have upstream
>> implications to the Apache  Hadoop core and let all the innovation
>> occur downstream?
>>
>> Have the conversations here if you'd like. I wouldn't turn anyone
>> away..
>>
>> My 2c.
>>
>> Cheers,
>> Chris
>>
>> ----Original Message-----
>>
>> From: Shahab Yunus <sh...@gmail.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Date: Friday, September 13, 2013 10:48 AM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>
>> >I think, in my opinion, it is a wrong idea because:
>> >
>> >
>> >1- Many of the participants here are employees of the very companies
>> >that are under discussion. This puts these employees in a very
>> >difficult position. It is very hard to come up with a correct response.
>> >Comments can be misconstrued easily.
>> >2- Also, when we talk about vendor distributions of the software, it is
>> >no longer purely about open source. Now companies with the related
>> >corporate legal baggage also get in the mix.
>> >3- The discussion would cover not only positive things about each vendor
>> >but also negatives. The latter type of discussion can get
>> >unpleasant very easily.
>> >
>> >4- Somebody mentioned that this is a very lightly moderated platform and
>> >thus this discussion should be allowed. I think this is one of the
>> >reasons that it should not be, because people can say things casually,
>> >without much thought, or without taking
>> >care of the context or the possible interpretations and get in trouble.
>> >5- The risk here is not only that serious repercussions can occur (which
>> >they very well can) but the greater risk is that it can cause misunderstanding
>> >between individuals, industries and companies.
>> >6- People here often reply quickly just to resolve or help with the
>> >'technical' issue. Now they will have to take care how they frame the
>> >response. Re: 4
>> >
>> >
>> >I know some will feel that I have created a highly exaggerated scenario
>> >above, but what I am trying to say is that, it is a slippery slope. If we
>> >allow this then this can go anywhere.
>> >
>> >
>> >By the way, I do not work for any of these vendors.
>> >
>> >
>> >More importantly, I am not saying that this discussion should not be had,
>> >I am just saying that this is a wrong forum.
>> >
>> >
>> >Just my 2 cents (or,...this was rather a dollar.)
>> >
>> >
>> >Regards,
>> >Shahab
>> >
>> >
>> >
>> >
>> >On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
>> ><ma...@apache.org> wrote:
>> >
>> >Errr, what's wrong with discussing these types of issues on list?
>> >
>> >Nothing public here, and as long as it's kept to facts, this should
>> >not be a problem and Apache is a fine place to have such discussions.
>> >
>> >My 2c.
>> >
>> >
>> >
>> >
>> >
>> >-----Original Message-----
>> >From: Xuri Nagarin <se...@gmail.com>
>> >Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> >Date: Thursday, September 12, 2013 4:39 PM
>> >To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> >Subject: Re: Cloudera Vs Hortonworks Vs MapR
>> >
>> >>I understand it can be contentious issue especially given that a lot of
>> >>contributors to this list work for one or the other vendor or have some
>> >>stake in any kind of evaluation. But, I see no reason why users should
>> >>not be able to compare notes
>> >> and share experiences. Over time, genuine pain points or issues or
>> >>claims will bubble up and should only help the community. Sure, there
>> >>will be a few flame wars but this already isn't a very tightly moderated
>> >>list.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>> >><ae...@maprtech.com> wrote:
>> >>
>> >>Raj,
>> >>
>> >>
>> >>As others noted, this is not a great place for this discussion.  I'd
>> >>suggest contacting the vendors you are interested in as I'm sure we'd all
>> >>be happy to provide you more details.
>> >>
>> >>
>> >>I don't know about the others, but for MapR, just send an email to
>> >>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get back
>> >>to you with more information.
>> >>
>> >>
>> >>Best Regards,
>> >>Aaron Eng
>> >>
>> >>
>> >>
>> >>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:
>> >>
>> >>
>> >>Hi,
>> >>
>> >>We are trying to evaluate different implementations of Hadoop for our big
>> >>data enterprise project.
>> >>
>> >>Can the forum members advise on what are the advantages and disadvantages
>> >>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>> >>
>> >>Thanks in advance.
>> >>
>> >>Regards,
>> >>Raj

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Chris Embree <ce...@gmail.com>.
Our evaluation was similar except we did not consider the "management"
tools any vendor provided, as that's just as much lock-in as any proprietary
tool.  What if I want to trade vendors?  I have to re-tool to use their mgmt?
Nope, wrote our own.

Being in a large enterprise, we went with the "perceived" more stable
platform.  Draw your own conclusions.


On Mon, Sep 16, 2013 at 6:10 PM, Xuri Nagarin <se...@gmail.com> wrote:

> So I will try to answer the OP's question best I can without deviating too
> much into opinions and stick to facts. Disclaimer: I am not an employee of
> either vendor or any partner of theirs.
>
> Context is important: My team's use case was general data exploration of
> semi-structured log data and we had no typical data-warehouse type of
> existing use cases. Also, ours is a small cluster (fewer than 30 nodes). In
> terms of ops/maintenance, we only have one person. I point this out because
> lots of Hadoop shops have a dedicated team for each - OS administration,
> Hadoop admin, Hadoop developers. And, they are very mature in terms of
> their compute use cases. To my mind, these aspects can significantly impact
> your vendor choices.
>
> MapR: My team simply did not consider them because of all the proprietary
> code in there. We are trying to move from a monolithic proprietary product
> and one of the criteria we set was - if we decided to move away from the
> chosen hadoop vendor, can we easily unlock our data?
>
> HortonWorks: Distro uses HDFS 1.x with MRv2. All open source. Cluster
> management is via Ambari. Compared to Cloudera's CM, Ambari has very
> rudimentary features. But you have to keep in mind that Ambari is only a
> year old whereas CM has already been under development for several years.
> This was a major selection factor for us because Ambari did not have all
> the automation/feature-set that CM offers for a single
> administrator/developer to easily maintain the cluster. Also, during the
> trial period, Hortonworks' packaging format/structure apparently kept
> changing which made things a bit difficult to centrally deploy/administer.
>
> Cloudera: Distro uses HDFS 2.x with MRv1. All open source except cluster
> management, which is via their proprietary Cloudera Manager tool. It is free
> to use, minus certain features such as auditing and cluster replication.
> A few more features may be restricted to the
> Enterprise/licensed-only version. It offers many more features than Ambari. In
> terms of cluster administration, I found CM much easier to work with than
> Ambari. Pretty much every aspect, from deploying new nodes to configuration
> and troubleshooting, is much more refined than in Ambari.
>
> During the selection process, what I found was that both vendors are very
> aggressive in their pitch. So much so that each pushes some FUD regarding
> the competition.
>
> HW uses HDFS 1.x + MRv2 while CDH uses HDFS 2.x + MRv1. HW claimed that
> Cloudera's distro is heavily patched and off-course from the core Apache
> trunk, which can cause severe data corruption issues. Yes, Cloudera has some
> 1500+ patches over Apache's Hadoop distro but (1) they aren't private patches.
> You can pull the list and verify that yourself just as I did. (2) In our
> testing and talking to other Cloudera customers, I couldn't find any issues
> with data corruption. It is true though that HDFS 2.x is still in beta but
> so is MRv2 that HW uses. I think both are stable and work well - depending
> on what you need but each uses that point to create FUD.
>
> HW also claimed that Impala, a new SQL engine that Cloudera is including in
> their distro, is proprietary. Not true. The software is open source. But
> if you want support for Impala then Cloudera will charge you separately per
> node for Impala over and above what they charge per node for Hadoop support.
>
> In my experience, both products have plenty of issues when it comes to
> compute engines - Hive, Pig, etc. - and their cluster management software. HDFS
> seems to be solid in both distros. So I wouldn't call either of them
> trouble-free and neither is at the maturity level of other popular
> enterprise products like say, Oracle. That said, you have to keep in mind
> that both vendors/products are successfully used by several customers so
> again, it is more a question of what fits your needs.
>
> In the end, we chose to go with Cloudera mostly because of a more positive
> experience with CM in terms of administration/operations and their
> pre-sales team when compared to HW. Again, that said, another team that we
> closely work with chose HW for their cluster. I use both vendors/clusters
> at work and neither has any significant issues.
>
>
>
>
> On Sat, Sep 14, 2013 at 12:37 PM, Chris Mattmann <ma...@apache.org> wrote:
>
>> Here's the deal, folks can post questions to the list that aren't
>> abusive and simply asking what the difference between different vendor
>> implementations (downstream) of Apache  Hadoop is not an inflammatory
>> or abusive question.
>>
>> Stick to the facts. Discuss it here. Why should the Apache Hadoop
>> PMC push off potentially useful questions that may have upstream
>> implications to the Apache  Hadoop core and let all the innovation
>> occur downstream?
>>
>> Have the conversations here if you'd like. I wouldn't turn anyone
>> away..
>>
>> My 2c.
>>
>> Cheers,
>> Chris
>>
>> ----Original Message-----
>>
>> From: Shahab Yunus <sh...@gmail.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Date: Friday, September 13, 2013 10:48 AM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>
>> >I think, in my opinion, it is a wrong idea because:
>> >
>> >
>> >1- Many of the participants here are employees of the very companies
>> >that are under discussion. This puts these employees in a very
>> >difficult position. It is very hard to come up with a correct response.
>> >Comments can be misconstrued easily.
>> >2- Also, when we talk about vendor distributions of the software, it is
>> >no longer purely about open source. Now companies with the related
>> >corporate legal baggage also get in the mix.
>> >3- The discussion would cover not only the positives of each vendor
>> >but also the negatives. The latter type of discussion can get
>> >unpleasant very easily.
>> >
>> >4- Somebody mentioned that this is a very lightly moderated platform and
>> >thus this discussion should be allowed. I think this is one of the
>> >reasons that it should not be, because people can say things casually,
>> >without much thought, or without taking care of the context or the
>> >possible interpretations, and get in trouble.
>> >5- The risk here is not only that serious repercussions can occur (which
>> >they very well can) but the greater risk is that it can cause
>> >misunderstanding between individuals, industries and companies.
>> >6- People here often reply quickly just to resolve or help with the
>> >'technical' issue. Now they will have to take care how they frame the
>> >response. Re: 4
>> >
>> >
>> >I know some will feel that I have created a highly exaggerated scenario
>> >above, but what I am trying to say is that, it is a slippery slope. If we
>> >allow this then this can go anywhere.
>> >
>> >
>> >By the way, I do not work for any of these vendors.
>> >
>> >
>> >More importantly, I am not saying that this discussion should not be had,
>> >I am just saying that this is a wrong forum.
>> >
>> >
>> >Just my 2 cents (or,...this was rather a dollar.)
>> >
>> >
>> >Regards,
>> >Shahab
>> >
>> >
>> >
>> >
>> >On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
>> ><ma...@apache.org> wrote:
>> >
>> >Errr, what's wrong with discussing these types of issues on list?
>> >
>> >Nothing public here, and as long as it's kept to facts, this should
>> >not be a problem and Apache is a fine place to have such discussions.
>> >
>> >My 2c.
>> >
>> >
>> >
>> >
>> >
>> >-----Original Message-----
>> >From: Xuri Nagarin <se...@gmail.com>
>> >Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> >Date: Thursday, September 12, 2013 4:39 PM
>> >To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> >Subject: Re: Cloudera Vs Hortonworks Vs MapR
>> >
>> >>I understand it can be a contentious issue, especially given that a lot of
>> >>contributors to this list work for one or the other vendor or have some
>> >>stake in any kind of evaluation. But I see no reason why users should
>> >>not be able to compare notes and share experiences. Over time, genuine
>> >>pain points or issues or claims will bubble up and should only help the
>> >>community. Sure, there will be a few flame wars but this already isn't a
>> >>very tightly moderated list.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>> >><ae...@maprtech.com> wrote:
>> >>
>> >>Raj,
>> >>
>> >>
>> >>As others noted, this is not a great place for this discussion.  I'd
>> >>suggest contacting the vendors you are interested in as I'm sure we'd all
>> >>be happy to provide you more details.
>> >>
>> >>
>> >>I don't know about the others, but for MapR, just send an email to
>> >>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get back
>> >>to you with more information.
>> >>
>> >>
>> >>Best Regards,
>> >>Aaron Eng
>> >>
>> >>
>> >>
>> >>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:
>> >>
>> >>
>> >>Hi,
>> >>
>> >>We are trying to evaluate different implementations of Hadoop for our big
>> >>data enterprise project.
>> >>
>> >>Can the forum members advise on what are the advantages and disadvantages
>> >>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>> >>
>> >>Thanks in advance.
>> >>
>> >>Regards,
>> >>Raj

Re: Cloudera Vs Hortonworks Vs MapR

Posted by "M. C. Srivas" <mc...@gmail.com>.
So here's an example of marketing FUD at work.

On Mon, Sep 16, 2013 at 3:10 PM, Xuri Nagarin <se...@gmail.com> wrote:

> So I will try to answer the OP's question best I can without deviating too
> much into opinions and stick to facts. Disclaimer: I am not an employee of
> either vendor or any partner of theirs.
>
> Context is important: My team's use case was general data exploration of
> semi-structured log data and we had no typical data-warehouse type of
> existing use cases. Also, ours is a small cluster (fewer than 30 nodes). In
> terms of ops/maintenance, we only have one person. I point this out because
> lots of Hadoop shops have a dedicated team for each - OS administration,
> Hadoop admin, Hadoop developers. And, they are very mature in terms of
> their compute use cases. To my mind, these aspects can significantly impact
> your vendor choices.
>
> MapR: My team simply did not consider them because of all the proprietary
> code in there. We are trying to move from a monolithic proprietary product
> and one of the criteria we set was - if we decided to move away from the
> chosen hadoop vendor, can we easily unlock our data?
>

Unlock your data? How about distcp? Or just "cp"?

The fact is there are 10x more standard ways to access your data in a MapR
cluster versus a Cloudera or Hortonworks cluster.

MapR is entirely open source, with proprietary add-ons, just like Cloudera
or Hortonworks.

The difference is MapR has innovated both above and below the Hadoop stack,
while Cloudera and Horton have only done so above the stack. MapR's
innovations have set the bar so high that its competition likes to spread
FUD.

[disclaimer: I work for MapR ]



> HortonWorks: Distro uses HDFS 1.x with MRv2. All open source. Cluster
> management is via Ambari. Compared to Cloudera's CM, Ambari has very
> rudimentary features. But you have to keep in mind that Ambari is only a
> year old whereas CM has already been under development for several years.
> This was a major selection factor for us because Ambari did not have all
> the automation/feature-set that CM offers for a single
> administrator/developer to easily maintain the cluster. Also, during the
> trial period, Hortonworks' packaging format/structure apparently kept
> changing which made things a bit difficult to centrally deploy/administer.
>
> Cloudera: Distro uses HDFS 2.x with MRv1. All open source except cluster
> management, which is via their proprietary Cloudera Manager tool. It is free
> to use, minus certain features such as auditing and cluster replication.
> A few more features may be restricted to the
> Enterprise/licensed-only version. It offers many more features than Ambari. In
> terms of cluster administration, I found CM much easier to work with than
> Ambari. Pretty much every aspect, from deploying new nodes to configuration
> and troubleshooting, is much more refined than in Ambari.
>
> During the selection process, what I found was that both vendors are very
> aggressive in their pitch. So much so that each pushes some FUD regarding
> the competition.
>

Obviously some of it worked, given some of the statements earlier.



>
> HW uses HDFS 1.x + MRv2 while CDH uses HDFS 2.x + MRv1. HW claimed that
> Cloudera's distro is heavily patched and off-course from the core Apache
> trunk, which can cause severe data corruption issues. Yes, Cloudera has some
> 1500+ patches over Apache's Hadoop distro but (1) they aren't private patches.
> You can pull the list and verify that yourself just as I did. (2) In our
> testing and talking to other Cloudera customers, I couldn't find any issues
> with data corruption. It is true though that HDFS 2.x is still in beta but
> so is MRv2 that HW uses. I think both are stable and work well - depending
> on what you need but each uses that point to create FUD.
>
> HW also claimed that Impala, a new SQL engine that Cloudera is including in
> their distro, is proprietary. Not true. The software is open source. But
> if you want support for Impala then Cloudera will charge you separately per
> node for Impala over and above what they charge per node for Hadoop support.
>
> In my experience, both products have plenty of issues when it comes to
> compute engines - Hive, Pig, etc. - and their cluster management software. HDFS
> seems to be solid in both distros. So I wouldn't call either of them
> trouble-free and neither is at the maturity level of other popular
> enterprise products like say, Oracle. That said, you have to keep in mind
> that both vendors/products are successfully used by several customers so
> again, it is more a question of what fits your needs.
>
> In the end, we chose to go with Cloudera mostly because of a more positive
> experience with CM in terms of administration/operations and their
> pre-sales team when compared to HW. Again, that said, another team that we
> closely work with chose HW for their cluster. I use both vendors/clusters
> at work and neither has any significant issues.

>> abusive and simply asking what the difference between different vendor
>> implementations (downstream) of Apache  Hadoop is not an inflammatory
>> or abusive question.
>>
>> Stick to the facts. Discuss it here. Why should the Apache Hadoop
>> PMC push off potentially useful questions that may have upstream
>> implications to the Apache  Hadoop core and let all the innovation
>> occur downstream?
>>
>> Have the conversations here if you'd like. I wouldn't turn anyone
>> away..
>>
>> My 2c.
>>
>> Cheers,
>> Chris
>>
>> ----Original Message-----
>>
>> From: Shahab Yunus <sh...@gmail.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Date: Friday, September 13, 2013 10:48 AM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>
>> >I think, in my opinion, it is a wrong idea because:
>> >
>> >
>> >1- Many of the participants here are employees for these very companies
>> >that are under discussion. This puts these respective employees in very
>> >difficult position. It is very hard to come with a correct response.
>> >Comments can be misconstrued easily.
>> >2- Also, when we talk about vendor distributions of the software, it is
>> >not longer purely about open source. Now companies with the related
>> >corporate legal baggage also gets in the mix.
>> >3- The discussion would be on not only positive things about each vendor
>> >but in fact negatives. The latter type of  discussion which can get
>> >unpleasant very easily.
>> >
>> >4- Somebody mentioned that, this is a very lightly moderated platform and
>> >thus this discussion should be allowed. I think this is one of the
>> >reasons that it should not be because, people can say things casually,
>> >without much thought, or without taking
>> > care of the context or the possible interpretations and get in trouble.
>> >5- The risk here is not only that serious repercussions can occur (which
>> >very well can) but the greater risk is that it can cause misunderstanding
>> >between individuals, industries and companies.
>> >6-People here lot of time reply quickly just to resolve or help the
>> >'technical' issue. Now they will have to take care how they frame the
>> >response. Re: 4
>> >
>> >
>> >I know some will feel that I have created a highly exaggerated scenario
>> >above, but what I am trying to say is that, it is a slippery slope. If we
>> >allow this then this can go anywhere.
>> >
>> >
>> >By the way, I do not work for any of these vendors.
>> >
>> >
>> >More importantly, I am not saying that this discussion should not be had,
>> >I am just saying that this is a wrong forum.
>> >
>> >
>> >Just my 2 cents (or,...this was rather a dollar.)
>> >
>> >
>> >Regards,
>> >Shahab
>> >
>> >
>> >
>> >
>> >On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
>> ><ma...@apache.org> wrote:
>> >
>> >Errr, what's wrong with discussing these types of issues on list?
>> >
>> >Nothing public here, and as long as it's kept to facts, this should
>> >not be a problem and Apache is a fine place to have such discussions.
>> >
>> >My 2c.
>> >
>> >
>> >
>> >
>> >
>> >-----Original Message-----
>> >From: Xuri Nagarin <se...@gmail.com>
>> >Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> >Date: Thursday, September 12, 2013 4:39 PM
>> >To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> >Subject: Re: Cloudera Vs Hortonworks Vs MapR
>> >
>> >>I understand it can be contentious issue especially given that a lot of
>> >>contributors to this list work for one or the other vendor or have some
>> >>stake in any kind of evaluation. But, I see no reason why users should
>> >>not be able to compare notes
>> >> and share experiences. Over time, genuine pain points or issues or
>> >>claims will bubble up and should only help the community. Sure, there
>> >>will be a few flame wars but this already isn't a very tightly moderated
>> >>list.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>> >><ae...@maprtech.com> wrote:
>> >>
>> >>Raj,
>> >>
>> >>
>> >>As others noted, this is not a great place for this discussion.  I'd
>> >>suggest contacting the vendors you are interested in as I'm sure we'd
>> all
>> >>be happy to provide you more details.
>> >>
>> >>
>> >>I don't know about the others, but for MapR, just send an email to
>> >>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
>> back
>> >>to you with more information.
>> >>
>> >>
>> >>Best Regards,
>> >>Aaron Eng
>> >>
>> >>
>> >>
>> >>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
>> wrote:
>> >>
>> >>
>> >>Hi,
>> >>
>> >>We are trying to evaluate different implementations of Hadoop for our
>> big
>> >>data enterprise project.
>> >>
>> >>Can the forum members advise on what are the advantages and
>> disadvantages
>> >>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>> >>
>> >>Thanks in advance.
>> >>
>> >>Regards,
>> >>Raj
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Xuri Nagarin <se...@gmail.com>.
So I will try to answer the OP's question as best I can, sticking to facts
without deviating too much into opinion. Disclaimer: I am not an employee of
either vendor or any partner of theirs.

Context is important: My team's use case was general data exploration of
semi-structured log data, and we had no existing data-warehouse type of
use cases. Also, ours is a small cluster (fewer than 30 nodes). In
terms of ops/maintenance, we only have one person. I point this out because
lots of Hadoop shops have a dedicated team for each role - OS administration,
Hadoop admin, Hadoop development - and are very mature in terms of
their compute use cases. To my mind, these aspects can significantly impact
your vendor choices.

MapR: My team simply did not consider them because of all the proprietary
code in there. We are trying to move away from a monolithic proprietary
product, and one of the criteria we set was: if we decide to move away from
the chosen Hadoop vendor, can we easily unlock our data?

Hortonworks: Distro uses HDFS 1.x with MRv2. All open source. Cluster
management is via Ambari. Compared to Cloudera's CM, Ambari has very
rudimentary features. But you have to keep in mind that Ambari is only a
year old, whereas CM has already been under development for several years.
This was a major selection factor for us because Ambari did not have all
the automation and features that CM offers for a single
administrator/developer to easily maintain the cluster. Also, during the
trial period, Hortonworks' packaging format/structure apparently kept
changing, which made things a bit difficult to centrally deploy/administer.

Cloudera: Distro uses HDFS 2.x with MRv1. All open source except cluster
management, which is via their proprietary Cloudera Manager tool. It is free
to use, minus certain features like auditing and cluster replication.
Maybe a few more features are restricted to the Enterprise/licensed-only
version. It offers many more features than Ambari. In terms of cluster
administration, I found CM much easier to work with than Ambari. Pretty much
every aspect, from deploying new nodes to configuration and troubleshooting,
is much more refined than in Ambari.

During the selection process, what I found was that both vendors are very
aggressive in their pitch. So much so that each pushes some FUD regarding
the competition.

HW uses HDFS 1.x + MRv2 while CDH uses HDFS 2.x + MRv1. HW claimed that
Cloudera's distro is so heavily patched that it has drifted off course from
the core Apache trunk, which can cause severe data corruption issues. Yes,
Cloudera has some 1500+ patches over Apache's Hadoop distro, but (1) they
aren't private patches. You can pull the list and verify that yourself, just
as I did. (2) In our testing and in talking to other Cloudera customers, I
couldn't find any issues with data corruption. It is true, though, that HDFS
2.x is still in beta, but so is MRv2 that HW uses. I think both are stable
and work well, depending on what you need, but each vendor uses that point to
create FUD.

HW also claimed that Impala, a new SQL engine that Cloudera is including in
their distro, is proprietary. Not true. The software is open source. But
if you want support for Impala, then Cloudera will charge you separately per
node for Impala, over and above what they charge per node for Hadoop support.

In my experience, both products have plenty of issues when it comes to
compute engines (Hive, Pig, etc.) and their cluster management software. HDFS
seems to be solid in both distros. So I wouldn't call either of them
trouble-free, and neither is at the maturity level of other popular
enterprise products like, say, Oracle. That said, you have to keep in mind
that both vendors/products are successfully used by several customers so,
again, it is more a question of what fits your needs.

In the end, we chose to go with Cloudera, mostly because of a more positive
experience with CM in terms of administration/operations and with their
pre-sales team when compared to HW. Again, that said, another team that we
closely work with chose HW for their cluster. I use both vendors/clusters
at work and neither has any significant issues.




On Sat, Sep 14, 2013 at 12:37 PM, Chris Mattmann <ma...@apache.org> wrote:

> Here's the deal, folks can post questions to the list that aren't
> abusive and simply asking what the difference between different vendor
> implementations (downstream) of Apache  Hadoop is not an inflammatory
> or abusive question.
>
> Stick to the facts. Discuss it here. Why should the Apache Hadoop
> PMC push off potentially useful questions that may have upstream
> implications to the Apache  Hadoop core and let all the innovation
> occur downstream?
>
> Have the conversations here if you'd like. I wouldn't turn anyone
> away..
>
> My 2c.
>
> Cheers,
> Chris
>
> ----Original Message-----
>
> From: Shahab Yunus <sh...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Friday, September 13, 2013 10:48 AM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>
> >I think, in my opinion, it is a wrong idea because:
> >
> >
> >1- Many of the participants here are employees for these very companies
> >that are under discussion. This puts these respective employees in very
> >difficult position. It is very hard to come with a correct response.
> >Comments can be misconstrued easily.
> >2- Also, when we talk about vendor distributions of the software, it is
> >not longer purely about open source. Now companies with the related
> >corporate legal baggage also gets in the mix.
> >3- The discussion would be on not only positive things about each vendor
> >but in fact negatives. The latter type of  discussion which can get
> >unpleasant very easily.
> >
> >4- Somebody mentioned that, this is a very lightly moderated platform and
> >thus this discussion should be allowed. I think this is one of the
> >reasons that it should not be because, people can say things casually,
> >without much thought, or without taking
> > care of the context or the possible interpretations and get in trouble.
> >5- The risk here is not only that serious repercussions can occur (which
> >very well can) but the greater risk is that it can cause misunderstanding
> >between individuals, industries and companies.
> >6-People here lot of time reply quickly just to resolve or help the
> >'technical' issue. Now they will have to take care how they frame the
> >response. Re: 4
> >
> >
> >I know some will feel that I have created a highly exaggerated scenario
> >above, but what I am trying to say is that, it is a slippery slope. If we
> >allow this then this can go anywhere.
> >
> >
> >By the way, I do not work for any of these vendors.
> >
> >
> >More importantly, I am not saying that this discussion should not be had,
> >I am just saying that this is a wrong forum.
> >
> >
> >Just my 2 cents (or,...this was rather a dollar.)
> >
> >
> >Regards,
> >Shahab
> >
> >
> >
> >
> >On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
> ><ma...@apache.org> wrote:
> >
> >Errr, what's wrong with discussing these types of issues on list?
> >
> >Nothing public here, and as long as it's kept to facts, this should
> >not be a problem and Apache is a fine place to have such discussions.
> >
> >My 2c.
> >
> >
> >
> >
> >
> >-----Original Message-----
> >From: Xuri Nagarin <se...@gmail.com>
> >Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> >Date: Thursday, September 12, 2013 4:39 PM
> >To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> >Subject: Re: Cloudera Vs Hortonworks Vs MapR
> >
> >>I understand it can be contentious issue especially given that a lot of
> >>contributors to this list work for one or the other vendor or have some
> >>stake in any kind of evaluation. But, I see no reason why users should
> >>not be able to compare notes
> >> and share experiences. Over time, genuine pain points or issues or
> >>claims will bubble up and should only help the community. Sure, there
> >>will be a few flame wars but this already isn't a very tightly moderated
> >>list.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
> >><ae...@maprtech.com> wrote:
> >>
> >>Raj,
> >>
> >>
> >>As others noted, this is not a great place for this discussion.  I'd
> >>suggest contacting the vendors you are interested in as I'm sure we'd all
> >>be happy to provide you more details.
> >>
> >>
> >>I don't know about the others, but for MapR, just send an email to
> >>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
> back
> >>to you with more information.
> >>
> >>
> >>Best Regards,
> >>Aaron Eng
> >>
> >>
> >>
> >>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
> wrote:
> >>
> >>
> >>Hi,
> >>
> >>We are trying to evaluate different implementations of Hadoop for our big
> >>data enterprise project.
> >>
> >>Can the forum members advise on what are the advantages and disadvantages
> >>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
> >>
> >>Thanks in advance.
> >>
> >>Regards,
> >>Raj
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >
> >
> >
> >
> >
> >
> >
>
>
>

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Xuri Nagarin <se...@gmail.com>.
So I will try to answer the OP's question best I can without deviating too
much into opinions and stick to facts. Disclaimer: I am not an employee of
either vendor or any partner of theirs.

Context is important: My team's use case was general data exploration of
semi-structured log data and we had no typical data-warehouse type of
existing use cases. Also, our's is a small (less than 30 nodes cluster). In
terms of ops/maintenance, we only have one person. I point this out because
lots of hadoop shops have dedicated team for each - OS administration,
Hadoop admin, Hadoop developers. And, they are very mature in terms of
their compute use cases. To my mind, these aspects can significantly impact
your vendor choices.

MapR: My team simply did not consider them because of all the proprietary
code in there. We are trying to move from a monolithic proprietary product
and one of the criteria we set was - if we decided to move away from the
chosen hadoop vendor, can we easily unlock our data?
HortonWorks: Distro uses HDFS 1.x with MRv2. All open source. Cluster
management is via Ambari. Compared to Cloudera's CM, Ambari has very
rudimentary features. But you have to keep in mind that Ambari is only an
year old where as CM already has been under development for several years.
This was a major selection factor for us because Ambari did not have all
the automation/feature-set compared to CM for a single
administrator/developer to easily maintain the cluster. Also, during the
trial period, Hortonwork's packing format/structure apparently kept
changing which made things a bit difficult to centrally deploy/administer.

Cloudera: Distro uses HDFS 2.x with MRv1. All open source except cluster
management which is via their proprietary Cloudera Manager tool. It is free
for use without certain feature like auditing and cluster replication
features. Maybe a few more features are restricted to
Enterprise/Licensed-only version. Offers much more features than Ambari. In
terms of cluster administration, I found CM much easy to work with than
Ambari. Pretty much all aspects from deploying new nodes to configuration
and troubleshooting is much more refined than Ambari.

During the selection process, what I found was that both vendors are very
aggressive in their pitch. So much so that each pushes some FUD regarding
the competition.

HW uses HDFS 1.x + MRv2 while CDH uses HDFS 2.x + MRv1. HW claimed that
Cloudera's distro is heavily patched off-course from the core Apache trunk
that can cause severe data corruption issues. Yes, Cloudera has some 1500+
patches over apache's Hadoop distro but (1) they aren't private patches.
You can pull the list and verify that yourself just as I did. (2) In our
testing and talking to other Cloudera customers, I couldn't find any issues
with data corruption. It is true though that HDFS 2.x is still in beta but
so is MRv2 that HW uses. I think both are stable and work well - depending
on what you need but each uses that point to create FUD.

HW also claimed that a new SQL engine that Cloudera's including in their
distro - Impala is proprietary. Not true. The software is open source. But
if you want support for Impala then Cloudera will charge you separately per
node for Impala over and above what they charge per node for Hadoop support.

In my experience, both products have plenty of issues when it comes to
compute engines - Hive, Pig etc and their cluster management software. HDFS
seem to be solid in both distros. So I wouldn't call either of them
trouble-free and neither is at the maturity level of other popular
enterprise products like say, Oracle. That said, you have to keep in mind
that both vendors/products are successfully used by several customers so
again, it is more a question of what fits your needs.

In the end, we chose to go with Cloudera mostly because a more positive
experience with CM in terms of administration/operations and their
pre-sales team when compared to HW. Again, that said, another team that we
closely work with chose HW for their cluster. I use both vendors/clusters
at work and neither has any significant issues.




On Sat, Sep 14, 2013 at 12:37 PM, Chris Mattmann <ma...@apache.org>wrote:

> Here's the deal, folks can post questions to the list that aren't
> abusive and simply asking what the difference between different vendor
> implementations (downstream) of Apache  Hadoop is not an inflammatory
> or abusive question.
>
> Stick to the facts. Discuss it here. Why should the Apache Hadoop
> PMC push off potentially useful questions that may have upstream
> implications to the Apache  Hadoop core and let all the innovation
> occur downstream?
>
> Have the conversations here if you'd like. I wouldn't turn anyone
> away..
>
> My 2c.
>
> Cheers,
> Chris
>
> ----Original Message-----
>
> From: Shahab Yunus <sh...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Friday, September 13, 2013 10:48 AM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>
> >I think, in my opinion, it is a wrong idea because:
> >
> >
> >1- Many of the participants here are employees for these very companies
> >that are under discussion. This puts these respective employees in very
> >difficult position. It is very hard to come with a correct response.
> >Comments can be misconstrued easily.
> >2- Also, when we talk about vendor distributions of the software, it is
> >not longer purely about open source. Now companies with the related
> >corporate legal baggage also gets in the mix.
> >3- The discussion would be on not only positive things about each vendor
> >but in fact negatives. The latter type of  discussion which can get
> >unpleasant very easily.
> >
> >4- Somebody mentioned that, this is a very lightly moderated platform and
> >thus this discussion should be allowed. I think this is one of the
> >reasons that it should not be because, people can say things casually,
> >without much thought, or without taking
> > care of the context or the possible interpretations and get in trouble.
> >5- The risk here is not only that serious repercussions can occur (which
> >very well can) but the greater risk is that it can cause misunderstanding
> >between individuals, industries and companies.
> >6-People here lot of time reply quickly just to resolve or help the
> >'technical' issue. Now they will have to take care how they frame the
> >response. Re: 4
> >
> >
> >I know some will feel that I have created a highly exaggerated scenario
> >above, but what I am trying to say is that, it is a slippery slope. If we
> >allow this then this can go anywhere.
> >
> >
> >By the way, I do not work for any of these vendors.
> >
> >
> >More importantly, I am not saying that this discussion should not be had,
> >I am just saying that this is a wrong forum.
> >
> >
> >Just my 2 cents (or,...this was rather a dollar.)
> >
> >
> >Regards,
> >Shahab
> >
> >
> >
> >
> >On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
> ><ma...@apache.org> wrote:
> >
> >Errr, what's wrong with discussing these types of issues on list?
> >
> >Nothing public here, and as long as it's kept to facts, this should
> >not be a problem and Apache is a fine place to have such discussions.
> >
> >My 2c.
> >
> >
> >
> >
> >
> >-----Original Message-----
> >From: Xuri Nagarin <se...@gmail.com>
> >Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> >Date: Thursday, September 12, 2013 4:39 PM
> >To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> >Subject: Re: Cloudera Vs Hortonworks Vs MapR
> >
> >>I understand it can be contentious issue especially given that a lot of
> >>contributors to this list work for one or the other vendor or have some
> >>stake in any kind of evaluation. But, I see no reason why users should
> >>not be able to compare notes
> >> and share experiences. Over time, genuine pain points or issues or
> >>claims will bubble up and should only help the community. Sure, there
> >>will be a few flame wars but this already isn't a very tightly moderated
> >>list.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
> >><ae...@maprtech.com> wrote:
> >>
> >>Raj,
> >>
> >>
> >>As others noted, this is not a great place for this discussion.  I'd
> >>suggest contacting the vendors you are interested in as I'm sure we'd all
> >>be happy to provide you more details.
> >>
> >>
> >>I don't know about the others, but for MapR, just send an email to
> >>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
> back
> >>to you with more information.
> >>
> >>
> >>Best Regards,
> >>Aaron Eng
> >>
> >>
> >>
> >>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
> wrote:
> >>
> >>
> >>Hi,
> >>
> >>We are trying to evaluate different implementations of Hadoop for our big
> >>data enterprise project.
> >>
> >>Can the forum members advise on what are the advantages and disadvantages
> >>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
> >>
> >>Thanks in advance.
> >>
> >>Regards,
> >>Raj

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Xuri Nagarin <se...@gmail.com>.
So I will try to answer the OP's question as best I can, sticking to facts
and without deviating too much into opinions. Disclaimer: I am not an
employee of either vendor or of any of their partners.

Context is important: My team's use case was general data exploration of
semi-structured log data, and we had no existing use cases of the typical
data-warehouse type. Also, ours is a small cluster (fewer than 30 nodes). In
terms of ops/maintenance, we only have one person. I point this out because
lots of Hadoop shops have a dedicated team for each of OS administration,
Hadoop administration and Hadoop development. They are also very mature in
terms of their compute use cases. To my mind, these aspects can
significantly impact your vendor choices.

MapR: My team simply did not consider them because of all the proprietary
code in there. We are trying to move away from a monolithic proprietary
product, and one of the criteria we set was: if we decided to move away from
the chosen Hadoop vendor, could we easily unlock our data?

Hortonworks: Distro uses HDFS 1.x with MRv2. All open source. Cluster
management is via Ambari. Compared to Cloudera's CM, Ambari has very
rudimentary features. But you have to keep in mind that Ambari is only a
year old, whereas CM has already been under development for several years.
This was a major selection factor for us because Ambari did not have the
automation and feature set that CM offers for a single
administrator/developer to easily maintain the cluster. Also, during the
trial period, Hortonworks' packaging format/structure apparently kept
changing, which made things a bit difficult to centrally deploy/administer.

Cloudera: Distro uses HDFS 2.x with MRv1. All open source except cluster
management, which is via their proprietary Cloudera Manager tool. It is free
to use without certain features such as auditing and cluster replication.
Maybe a few more features are restricted to the Enterprise/licensed-only
version. It offers many more features than Ambari. In terms of cluster
administration, I found CM much easier to work with than Ambari. Pretty much
all aspects, from deploying new nodes to configuration and troubleshooting,
are much more refined than in Ambari.
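
For anyone who wants the two distros side by side, the points above can be
written down as a small Python sketch. The class and field names here are
made up purely for illustration, and the values only restate what I observed
during our evaluation; they are not vendor-verified specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Distro:
    """One row of the comparison; field names are illustrative only."""
    name: str
    hdfs: str
    mapreduce: str
    manager: str
    manager_open_source: bool


# Values restate the observations in this message, not official vendor specs.
DISTROS = [
    Distro("Hortonworks", "1.x", "MRv2", "Ambari", True),
    Distro("Cloudera", "2.x", "MRv1", "Cloudera Manager", False),
]


def summarize(distros):
    """Return one human-readable summary line per distro."""
    lines = []
    for d in distros:
        licence = "open source" if d.manager_open_source else "proprietary"
        lines.append(
            f"{d.name}: HDFS {d.hdfs}, {d.mapreduce}, "
            f"managed by {d.manager} ({licence})"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(summarize(DISTROS)))
```

If you run your own evaluation, adding fields for whatever matters to your
team (support cost, data lock-in, packaging stability) is probably more
useful than the few columns shown here.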

During the selection process, what I found was that both vendors are very
aggressive in their pitch. So much so that each pushes some FUD regarding
the competition.

HW uses HDFS 1.x + MRv2 while CDH uses HDFS 2.x + MRv1. HW claimed that
Cloudera's distro is heavily patched and has drifted off course from the
core Apache trunk, which can cause severe data corruption issues. Yes,
Cloudera has some 1500+ patches over Apache's Hadoop distro but (1) they
aren't private patches. You can pull the list and verify that yourself just
as I did. (2) In our testing and in talking to other Cloudera customers, I
couldn't find any issues with data corruption. It is true though that HDFS
2.x is still in beta, but so is the MRv2 that HW uses. I think both are
stable and work well, depending on what you need, but each vendor uses that
point to create FUD.

HW also claimed that Impala, a new SQL engine that Cloudera is including in
their distro, is proprietary. Not true. The software is open source. But if
you want support for Impala then Cloudera will charge you separately per
node for Impala, over and above what they charge per node for Hadoop
support.

In my experience, both products have plenty of issues when it comes to
compute engines (Hive, Pig etc.) and their cluster management software. HDFS
seems to be solid in both distros. So I wouldn't call either of them
trouble-free, and neither is at the maturity level of other popular
enterprise products like, say, Oracle. That said, you have to keep in mind
that both vendors/products are successfully used by several customers so,
again, it is more a question of what fits your needs.

In the end, we chose to go with Cloudera, mostly because of a more positive
experience with CM in terms of administration/operations and with their
pre-sales team when compared to HW. Again, that said, another team that we
work closely with chose HW for their cluster. I use both vendors/clusters
at work and neither has any significant issues.




On Sat, Sep 14, 2013 at 12:37 PM, Chris Mattmann <ma...@apache.org> wrote:

> Here's the deal, folks can post questions to the list that aren't
> abusive and simply asking what the difference between different vendor
> implementations (downstream) of Apache  Hadoop is not an inflammatory
> or abusive question.
>
> Stick to the facts. Discuss it here. Why should the Apache Hadoop
> PMC push off potentially useful questions that may have upstream
> implications to the Apache  Hadoop core and let all the innovation
> occur downstream?
>
> Have the conversations here if you'd like. I wouldn't turn anyone
> away..
>
> My 2c.
>
> Cheers,
> Chris
>
> ----Original Message-----
>
> From: Shahab Yunus <sh...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Friday, September 13, 2013 10:48 AM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>
> >I think, in my opinion, it is a wrong idea because:
> >
> >
> >1- Many of the participants here are employees for these very companies
> >that are under discussion. This puts these respective employees in very
> >difficult position. It is very hard to come with a correct response.
> >Comments can be misconstrued easily.
> >2- Also, when we talk about vendor distributions of the software, it is
> >not longer purely about open source. Now companies with the related
> >corporate legal baggage also gets in the mix.
> >3- The discussion would be on not only positive things about each vendor
> >but in fact negatives. The latter type of  discussion which can get
> >unpleasant very easily.
> >
> >4- Somebody mentioned that, this is a very lightly moderated platform and
> >thus this discussion should be allowed. I think this is one of the
> >reasons that it should not be because, people can say things casually,
> >without much thought, or without taking
> > care of the context or the possible interpretations and get in trouble.
> >5- The risk here is not only that serious repercussions can occur (which
> >very well can) but the greater risk is that it can cause misunderstanding
> >between individuals, industries and companies.
> >6-People here lot of time reply quickly just to resolve or help the
> >'technical' issue. Now they will have to take care how they frame the
> >response. Re: 4
> >
> >
> >I know some will feel that I have created a highly exaggerated scenario
> >above, but what I am trying to say is that, it is a slippery slope. If we
> >allow this then this can go anywhere.
> >
> >
> >By the way, I do not work for any of these vendors.
> >
> >
> >More importantly, I am not saying that this discussion should not be had,
> >I am just saying that this is a wrong forum.
> >
> >
> >Just my 2 cents (or,...this was rather a dollar.)
> >
> >
> >Regards,
> >Shahab
> >
> >
> >
> >
> >On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
> ><ma...@apache.org> wrote:
> >
> >Errr, what's wrong with discussing these types of issues on list?
> >
> >Nothing public here, and as long as it's kept to facts, this should
> >not be a problem and Apache is a fine place to have such discussions.
> >
> >My 2c.
> >
> >
> >
> >
> >
> >-----Original Message-----
> >From: Xuri Nagarin <se...@gmail.com>
> >Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> >Date: Thursday, September 12, 2013 4:39 PM
> >To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> >Subject: Re: Cloudera Vs Hortonworks Vs MapR
> >
> >>I understand it can be contentious issue especially given that a lot of
> >>contributors to this list work for one or the other vendor or have some
> >>stake in any kind of evaluation. But, I see no reason why users should
> >>not be able to compare notes
> >> and share experiences. Over time, genuine pain points or issues or
> >>claims will bubble up and should only help the community. Sure, there
> >>will be a few flame wars but this already isn't a very tightly moderated
> >>list.
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
> >><ae...@maprtech.com> wrote:
> >>
> >>Raj,
> >>
> >>
> >>As others noted, this is not a great place for this discussion.  I'd
> >>suggest contacting the vendors you are interested in as I'm sure we'd all
> >>be happy to provide you more details.
> >>
> >>
> >>I don't know about the others, but for MapR, just send an email to
> >>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
> back
> >>to you with more information.
> >>
> >>
> >>Best Regards,
> >>Aaron Eng
> >>
> >>
> >>
> >>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
> wrote:
> >>
> >>
> >>Hi,
> >>
> >>We are trying to evaluate different implementations of Hadoop for our big
> >>data enterprise project.
> >>
> >>Can the forum members advise on what are the advantages and disadvantages
> >>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
> >>
> >>Thanks in advance.
> >>
> >>Regards,
> >>Raj

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Xuri Nagarin <se...@gmail.com>.
The argument against is essentially: let's not talk about it because it is
messy, and there are vested interests who may or may not contribute to the
best of their knowledge and could end up harming the community.

The pro argument I would make is: make your arguments in favour of or
against a distro/product (good, bad, biased or whatever). People are smart
and they can read and figure out what's relevant for them and what's not. I
doubt anyone would make their decision based on a discussion on this list.
What a discussion would provide is guidance for people evaluating their
choices about what parameters to compare on and what use cases might be
useful to gauge a good fit. In general, you see products on Amazon etc.
being debated/reviewed by users. You do not see a similar discussion around
enterprise products. Why not? In my experience, that absence leads to
knowledge/ideas not being shared, and that leads to bad products. Again,
more generally, as I get older, I err on the side of allowing speech instead
of limiting it. The great gift of digital technology is that you can tune
out whatever you do not want to hear/read.

That said, I do agree that this should be spun off onto a different forum.
We should reserve a forum (maybe this one) for purely technical support
issues. Also, I noticed there isn't a hadoop-admins list that discusses
all the operational issues in managing a cluster.




On Fri, Sep 13, 2013 at 11:01 AM, Adam Muise <am...@hortonworks.com> wrote:

> I would just throw in an additional point on top of Shahab's excellent
> summary.
>
> To evaluate a distribution requires more than just the technical aspects
> of that distribution. Even if we kept the discussion to the mailing list's
> technical Hadoop usage focus, any company/organization looking to use a
> distro is going to have to consider the costs, support, platform, partner
> ecosystem, market share, company strategy, etc. Even if everyone behaved,
> none of the topics I mentioned are appropriate for a Hadoop user mailing
> list so you could not make an informed decision from a user list discussion
> and you might as well give the vendors a call.
>
> Thanks,
>
> Adam
>
>
> On Fri, Sep 13, 2013 at 1:48 PM, Shahab Yunus <sh...@gmail.com> wrote:
>
>> I think, in my opinion, it is a wrong idea because:
>>
>> 1- Many of the participants here are employees for these very companies
>> that are under discussion. This puts these respective employees in very
>> difficult position. It is very hard to come with a correct response.
>> Comments can be misconstrued easily.
>> 2- Also, when we talk about vendor distributions of the software, it is
>> not longer purely about open source. Now companies with the related
>> corporate legal baggage also gets in the mix.
>> 3- The discussion would be on not only positive things about each vendor
>> but in fact negatives. The latter type of  discussion which can get
>> unpleasant very easily.
>> 4- Somebody mentioned that, this is a very lightly moderated platform and
>> thus this discussion should be allowed. I think this is one of the reasons
>> that it should not be because, people can say things casually, without much
>> thought, or without taking care of the context or the possible
>> interpretations and get in trouble.
>> 5- The risk here is not only that serious repercussions can occur (which
>> very well can) but the greater risk is that it can cause misunderstanding
>> between individuals, industries and companies.
>> 6-People here lot of time reply quickly just to resolve or help the
>> 'technical' issue. Now they will have to take care how they frame the
>> response. Re: 4
>>
>> I know some will feel that I have created a highly exaggerated scenario
>> above, but what I am trying to say is that, it is a slippery slope. If we
>> allow this then this can go anywhere.
>>
>> By the way, I do not work for any of these vendors.
>>
>> More importantly, I am not saying that this discussion should not be had,
>> I am just saying that this is a wrong forum.
>>
>> Just my 2 cents (or,...this was rather a dollar.)
>>
>> Regards,
>> Shahab
>>
>>
>> On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann <ma...@apache.org> wrote:
>>
>>> Errr, what's wrong with discussing these types of issues on list?
>>>
>>> Nothing public here, and as long as it's kept to facts, this should
>>> not be a problem and Apache is a fine place to have such discussions.
>>>
>>> My 2c.
>>>
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Xuri Nagarin <se...@gmail.com>
>>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Date: Thursday, September 12, 2013 4:39 PM
>>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>>
>>> >I understand it can be contentious issue especially given that a lot of
>>> >contributors to this list work for one or the other vendor or have some
>>> >stake in any kind of evaluation. But, I see no reason why users should
>>> >not be able to compare notes
>>> > and share experiences. Over time, genuine pain points or issues or
>>> >claims will bubble up and should only help the community. Sure, there
>>> >will be a few flame wars but this already isn't a very tightly moderated
>>> >list.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>>> ><ae...@maprtech.com> wrote:
>>> >
>>> >Raj,
>>> >
>>> >
>>> >As others noted, this is not a great place for this discussion.  I'd
>>> >suggest contacting the vendors you are interested in as I'm sure we'd
>>> all
>>> >be happy to provide you more details.
>>> >
>>> >
>>> >I don't know about the others, but for MapR, just send an email to
>>> >sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
>>> back
>>> >to you with more information.
>>> >
>>> >
>>> >Best Regards,
>>> >Aaron Eng
>>> >
>>> >
>>> >
>>> >On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
>>> wrote:
>>> >
>>> >
>>> >Hi,
>>> >
>>> >We are trying to evaluate different implementations of Hadoop for our
>>> big
>>> >data enterprise project.
>>> >
>>> >Can the forum members advise on what are the advantages and
>>> disadvantages
>>> >of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>>> >
>>> >Thanks in advance.
>>> >
>>> >Regards,
>>> >Raj
>
> --
> *Adam Muise*
> Solution Engineer
> *Hortonworks*
> amuise@hortonworks.com
> 416-417-4037
>

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Xuri Nagarin <se...@gmail.com>.
The argument against is essentially: let's not talk about it because it is
messy, and there are vested interests who may or may not contribute to the
best of their knowledge and could end up harming the community.

The pro argument I would make is: make your arguments in favour of or
against a distro/product (good, bad, biased or whatever). People are smart
and they can read and figure out what's relevant for them and what's not. I
doubt anyone would make their decision based on a discussion on this list.
What a discussion would provide is guidance for people evaluating their
choices about what parameters to compare on and what use cases might be
useful to gauge a good fit. In general, you see products on Amazon etc.
being debated/reviewed by users. You do not see a similar discussion around
enterprise products. Why not? In my experience, that absence leads to
knowledge/ideas not being shared, and that leads to bad products. Again,
more generally, as I get older, I err on the side of allowing speech instead
of limiting it. The great gift of digital technology is that you can tune
out whatever you do not want to hear/read.

That said, I do agree that this should be spun off onto a different forum.
We should reserve a forum (maybe this one) for purely technical support
issues. Also, I noticed there isn't a hadoop-admins list that discusses
all the operational issues in managing a cluster.




On Fri, Sep 13, 2013 at 11:01 AM, Adam Muise <am...@hortonworks.com> wrote:

> I would just throw in an additional point on top of Shahab's excellent
> summary.
>
> To evaluate a distribution requires more than just the technical aspects
> of that distribution. Even if we kept the discussion to the mailing list's
> technical Hadoop usage focus, any company/organization looking to use a
> distro is going to have to consider the costs, support, platform, partner
> ecosystem, market share, company strategy, etc. Even if everyone behaved,
> none of the topics I mentioned are appropriate for a Hadoop user mailing
> list so you could not make an informed decision from a user list discussion
> and you might as well give the vendors a call.
>
> Thanks,
>
> Adam
>
>
> On Fri, Sep 13, 2013 at 1:48 PM, Shahab Yunus <sh...@gmail.com> wrote:
>
>> I think, in my opinion, it is a wrong idea because:
>>
>> 1- Many of the participants here are employees for these very companies
>> that are under discussion. This puts these respective employees in very
>> difficult position. It is very hard to come with a correct response.
>> Comments can be misconstrued easily.
>> 2- Also, when we talk about vendor distributions of the software, it is
>> not longer purely about open source. Now companies with the related
>> corporate legal baggage also gets in the mix.
>> 3- The discussion would be on not only positive things about each vendor
>> but in fact negatives. The latter type of  discussion which can get
>> unpleasant very easily.
>> 4- Somebody mentioned that, this is a very lightly moderated platform and
>> thus this discussion should be allowed. I think this is one of the reasons
>> that it should not be because, people can say things casually, without much
>> thought, or without taking care of the context or the possible
>> interpretations and get in trouble.
>> 5- The risk here is not only that serious repercussions can occur (which
>> very well can) but the greater risk is that it can cause misunderstanding
>> between individuals, industries and companies.
>> 6-People here lot of time reply quickly just to resolve or help the
>> 'technical' issue. Now they will have to take care how they frame the
>> response. Re: 4
>>
>> I know some will feel that I have created a highly exaggerated scenario
>> above, but what I am trying to say is that, it is a slippery slope. If we
>> allow this then this can go anywhere.
>>
>> By the way, I do not work for any of these vendors.
>>
>> More importantly, I am not saying that this discussion should not be had,
>> I am just saying that this is a wrong forum.
>>
>> Just my 2 cents (or,...this was rather a dollar.)
>>
>> Regards,
>> Shahab
>>
>>
>> On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann <ma...@apache.org> wrote:
>>
>>> Errr, what's wrong with discussing these types of issues on list?
>>>
>>> Nothing public here, and as long as it's kept to facts, this should
>>> not be a problem and Apache is a fine place to have such discussions.
>>>
>>> My 2c.
>>>
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Xuri Nagarin <se...@gmail.com>
>>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Date: Thursday, September 12, 2013 4:39 PM
>>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>>
>>> >I understand it can be contentious issue especially given that a lot of
>>> >contributors to this list work for one or the other vendor or have some
>>> >stake in any kind of evaluation. But, I see no reason why users should
>>> >not be able to compare notes
>>> > and share experiences. Over time, genuine pain points or issues or
>>> >claims will bubble up and should only help the community. Sure, there
>>> >will be a few flame wars but this already isn't a very tightly moderated
>>> >list.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>>> ><ae...@maprtech.com> wrote:
>>> >
>>> >Raj,
>>> >
>>> >
>>> >As others noted, this is not a great place for this discussion.  I'd
>>> >suggest contacting the vendors you are interested in as I'm sure we'd
>>> all
>>> >be happy to provide you more details.
>>> >
>>> >
>>> >I don't know about the others, but for MapR, just send an email to
>>> >sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
>>> back
>>> >to you with more information.
>>> >
>>> >
>>> >Best Regards,
>>> >Aaron Eng
>>> >
>>> >
>>> >
>>> >On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
>>> wrote:
>>> >
>>> >
>>> >Hi,
>>> >
>>> >We are trying to evaluate different implementations of Hadoop for our
>>> big
>>> >data enterprise project.
>>> >
>>> >Can the forum members advise on what are the advantages and
>>> disadvantages
>>> >of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>>> >
>>> >Thanks in advance.
>>> >
>>> >Regards,
>>> >Raj
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>
>
>
> --
> *Adam Muise*
> Solution Engineer
> *Hortonworks*
> amuise@hortonworks.com
> 416-417-4037
>
> Hortonworks - Develops, Distributes and Supports Enterprise Apache Hadoop.<http://hortonworks.com/>
>
> Hortonworks Virtual Sandbox <http://hortonworks.com/sandbox>
>
> Hadoop: Disruptive Possibilities by Jeff Needham<http://hortonworks.com/resources/?did=72&cat=1>
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
>

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Xuri Nagarin <se...@gmail.com>.
The argument against is essentially: let's not talk about it because it
is messy and there are vested interests that may or may not contribute to
the best of their knowledge and end up harming the community.

The argument in favour, as I would make it: make your case for or
against a distro/product - good, bad, biased or whatever - people are smart
and they can read and figure out what's relevant for them and what's not. I
doubt anyone would make their decision based on a discussion on this list.
What a discussion would provide is guidance for people evaluating their
choices about what parameters to compare on and what use cases might be
useful to gauge a good fit. In general, you see products on Amazon etc.
being debated/reviewed by users. You do not see a similar discussion around
enterprise products. Why not? In my experience, it leads to not-sharing of
knowledge/ideas and that leads to bad products. Again, more generally, as I
get older, I err on the side of allowing speech instead of limiting it. The
great gift of digital technology is you can tune out whatever you do not
want to hear/read.

That said, I do agree that this should be spun off onto a different forum.
We should reserve a forum (maybe this one) for purely technical support
issues. Also, I noticed there isn't a list for hadoop-admins that discusses
all the operational issues in managing a cluster.




On Fri, Sep 13, 2013 at 11:01 AM, Adam Muise <am...@hortonworks.com> wrote:

> I would just through an additional point on top of Shahab's excellent
> summary.
>
> To evaluate a distribution requires more than just the technical aspects
> of that distribution. Even if we kept the discussion to the mailing list's
> technical Hadoop usage focus, any company/organization looking to use a
> distro is going to have to consider the costs, support, platform, partner
> ecosystem, market share, company strategy, etc. Even if everyone behaved,
> none of the topics I mentioned are appropriate for a Hadoop user mailing
> list so you could not make an informed decision from a user list discussion
> and you might as well give the vendors a call.
>
> Thanks,
>
> Adam
>
>
> On Fri, Sep 13, 2013 at 1:48 PM, Shahab Yunus <sh...@gmail.com>wrote:
>
>> I think, in my opinion, it is a wrong idea because:
>>
>> 1- Many of the participants here are employees for these very companies
>> that are under discussion. This puts these respective employees in very
>> difficult position. It is very hard to come with a correct response.
>> Comments can be misconstrued easily.
>> 2- Also, when we talk about vendor distributions of the software, it is
>> not longer purely about open source. Now companies with the related
>> corporate legal baggage also gets in the mix.
>> 3- The discussion would be on not only positive things about each vendor
>> but in fact negatives. The latter type of  discussion which can get
>> unpleasant very easily.
>> 4- Somebody mentioned that, this is a very lightly moderated platform and
>> thus this discussion should be allowed. I think this is one of the reasons
>> that it should not be because, people can say things casually, without much
>> thought, or without taking care of the context or the possible
>> interpretations and get in trouble.
>> 5- The risk here is not only that serious repercussions can occur (which
>> very well can) but the greater risk is that it can cause misunderstanding
>> between individuals, industries and companies.
>> 6-People here lot of time reply quickly just to resolve or help the
>> 'technical' issue. Now they will have to take care how they frame the
>> response. Re: 4
>>
>> I know some will feel that I have created a highly exaggerated scenario
>> above, but what I am trying to say is that, it is a slippery slope. If we
>> allow this then this can go anywhere.
>>
>> By the way, I do not work for any of these vendors.
>>
>> More importantly, I am not saying that this discussion should not be had,
>> I am just saying that this is a wrong forum.
>>
>> Just my 2 cents (or,...this was rather a dollar.)
>>
>> Regards,
>> Shahab
>>
>>
>> On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann <ma...@apache.org>wrote:
>>
>>> Errr, what's wrong with discussing these types of issues on list?
>>>
>>> Nothing public here, and as long as it's kept to facts, this should
>>> not be a problem and Apache is a fine place to have such discussions.
>>>
>>> My 2c.
>>>
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Xuri Nagarin <se...@gmail.com>
>>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Date: Thursday, September 12, 2013 4:39 PM
>>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>>
>>> >I understand it can be contentious issue especially given that a lot of
>>> >contributors to this list work for one or the other vendor or have some
>>> >stake in any kind of evaluation. But, I see no reason why users should
>>> >not be able to compare notes
>>> > and share experiences. Over time, genuine pain points or issues or
>>> >claims will bubble up and should only help the community. Sure, there
>>> >will be a few flame wars but this already isn't a very tightly moderated
>>> >list.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>>> ><ae...@maprtech.com> wrote:
>>> >
>>> >Raj,
>>> >
>>> >
>>> >As others noted, this is not a great place for this discussion.  I'd
>>> >suggest contacting the vendors you are interested in as I'm sure we'd
>>> all
>>> >be happy to provide you more details.
>>> >
>>> >
>>> >I don't know about the others, but for MapR, just send an email to
>>> >sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
>>> back
>>> >to you with more information.
>>> >
>>> >
>>> >Best Regards,
>>> >Aaron Eng
>>> >
>>> >
>>> >
>>> >On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
>>> wrote:
>>> >
>>> >
>>> >Hi,
>>> >
>>> >We are trying to evaluate different implementations of Hadoop for our
>>> big
>>> >data enterprise project.
>>> >
>>> >Can the forum members advise on what are the advantages and
>>> disadvantages
>>> >of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>>> >
>>> >Thanks in advance.
>>> >
>>> >Regards,
>>> >Raj
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>
>
>
> --
> *Adam Muise*
> Solution Engineer
> *Hortonworks*
> amuise@hortonworks.com
> 416-417-4037
>
> Hortonworks - Develops, Distributes and Supports Enterprise Apache Hadoop.<http://hortonworks.com/>
>
> Hortonworks Virtual Sandbox <http://hortonworks.com/sandbox>
>
> Hadoop: Disruptive Possibilities by Jeff Needham<http://hortonworks.com/resources/?did=72&cat=1>
>

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Adam Muise <am...@hortonworks.com>.
I would just throw in an additional point on top of Shahab's excellent
summary.

To evaluate a distribution requires more than just the technical aspects of
that distribution. Even if we kept the discussion to the mailing list's
technical Hadoop usage focus, any company/organization looking to use a
distro is going to have to consider the costs, support, platform, partner
ecosystem, market share, company strategy, etc. Even if everyone behaved,
none of the topics I mentioned are appropriate for a Hadoop user mailing
list so you could not make an informed decision from a user list discussion
and you might as well give the vendors a call.

Thanks,

Adam


On Fri, Sep 13, 2013 at 1:48 PM, Shahab Yunus <sh...@gmail.com> wrote:

> I think, in my opinion, it is a wrong idea because:
>
> 1- Many of the participants here are employees for these very companies
> that are under discussion. This puts these respective employees in very
> difficult position. It is very hard to come with a correct response.
> Comments can be misconstrued easily.
> 2- Also, when we talk about vendor distributions of the software, it is
> not longer purely about open source. Now companies with the related
> corporate legal baggage also gets in the mix.
> 3- The discussion would be on not only positive things about each vendor
> but in fact negatives. The latter type of  discussion which can get
> unpleasant very easily.
> 4- Somebody mentioned that, this is a very lightly moderated platform and
> thus this discussion should be allowed. I think this is one of the reasons
> that it should not be because, people can say things casually, without much
> thought, or without taking care of the context or the possible
> interpretations and get in trouble.
> 5- The risk here is not only that serious repercussions can occur (which
> very well can) but the greater risk is that it can cause misunderstanding
> between individuals, industries and companies.
> 6-People here lot of time reply quickly just to resolve or help the
> 'technical' issue. Now they will have to take care how they frame the
> response. Re: 4
>
> I know some will feel that I have created a highly exaggerated scenario
> above, but what I am trying to say is that, it is a slippery slope. If we
> allow this then this can go anywhere.
>
> By the way, I do not work for any of these vendors.
>
> More importantly, I am not saying that this discussion should not be had,
> I am just saying that this is a wrong forum.
>
> Just my 2 cents (or,...this was rather a dollar.)
>
> Regards,
> Shahab
>
>
> On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann <ma...@apache.org>wrote:
>
>> Errr, what's wrong with discussing these types of issues on list?
>>
>> Nothing public here, and as long as it's kept to facts, this should
>> not be a problem and Apache is a fine place to have such discussions.
>>
>> My 2c.
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: Xuri Nagarin <se...@gmail.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Date: Thursday, September 12, 2013 4:39 PM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>
>> >I understand it can be contentious issue especially given that a lot of
>> >contributors to this list work for one or the other vendor or have some
>> >stake in any kind of evaluation. But, I see no reason why users should
>> >not be able to compare notes
>> > and share experiences. Over time, genuine pain points or issues or
>> >claims will bubble up and should only help the community. Sure, there
>> >will be a few flame wars but this already isn't a very tightly moderated
>> >list.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>> ><ae...@maprtech.com> wrote:
>> >
>> >Raj,
>> >
>> >
>> >As others noted, this is not a great place for this discussion.  I'd
>> >suggest contacting the vendors you are interested in as I'm sure we'd all
>> >be happy to provide you more details.
>> >
>> >
>> >I don't know about the others, but for MapR, just send an email to
>> >sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
>> back
>> >to you with more information.
>> >
>> >
>> >Best Regards,
>> >Aaron Eng
>> >
>> >
>> >
>> >On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
>> wrote:
>> >
>> >
>> >Hi,
>> >
>> >We are trying to evaluate different implementations of Hadoop for our big
>> >data enterprise project.
>> >
>> >Can the forum members advise on what are the advantages and disadvantages
>> >of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>> >
>> >Thanks in advance.
>> >
>> >Regards,
>> >Raj
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>


-- 
*Adam Muise*
Solution Engineer
*Hortonworks*
amuise@hortonworks.com
416-417-4037

Hortonworks - Develops, Distributes and Supports Enterprise Apache
Hadoop.<http://hortonworks.com/>

Hortonworks Virtual Sandbox <http://hortonworks.com/sandbox>

Hadoop: Disruptive Possibilities by Jeff
Needham<http://hortonworks.com/resources/?did=72&cat=1>


Re: Cloudera Vs Hortonworks Vs MapR

Posted by Suresh Srinivas <su...@hortonworks.com>.
Shahab,

I agree with your arguments. Really well put. The only thing I would add is -
we do not want sales/marketing folks getting involved in these kinds of
threads and polluting them with sales pitches and unsubstantiated claims,
turning this into a forum for marketing. This can also have community
repercussions, as you have rightly pointed out.

Wearing my own Hadoop PMC hat: we do put out Apache releases regularly. Bigtop
also provides excellent stack packaging. In this forum my wish is
to see discussions around those rather than vendor-related ones. There are
already many outside forums for that.

Regards,
Suresh


On Fri, Sep 13, 2013 at 10:48 AM, Shahab Yunus <sh...@gmail.com> wrote:

> I think, in my opinion, it is a wrong idea because:
>
> 1- Many of the participants here are employees for these very companies
> that are under discussion. This puts these respective employees in very
> difficult position. It is very hard to come with a correct response.
> Comments can be misconstrued easily.
> 2- Also, when we talk about vendor distributions of the software, it is
> not longer purely about open source. Now companies with the related
> corporate legal baggage also gets in the mix.
> 3- The discussion would be on not only positive things about each vendor
> but in fact negatives. The latter type of  discussion which can get
> unpleasant very easily.
> 4- Somebody mentioned that, this is a very lightly moderated platform and
> thus this discussion should be allowed. I think this is one of the reasons
> that it should not be because, people can say things casually, without much
> thought, or without taking care of the context or the possible
> interpretations and get in trouble.
> 5- The risk here is not only that serious repercussions can occur (which
> very well can) but the greater risk is that it can cause misunderstanding
> between individuals, industries and companies.
> 6-People here lot of time reply quickly just to resolve or help the
> 'technical' issue. Now they will have to take care how they frame the
> response. Re: 4
>
> I know some will feel that I have created a highly exaggerated scenario
> above, but what I am trying to say is that, it is a slippery slope. If we
> allow this then this can go anywhere.
>
> By the way, I do not work for any of these vendors.
>
> More importantly, I am not saying that this discussion should not be had,
> I am just saying that this is a wrong forum.
>
> Just my 2 cents (or,...this was rather a dollar.)
>
> Regards,
> Shahab
>
>
> On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann <ma...@apache.org>wrote:
>
>> Errr, what's wrong with discussing these types of issues on list?
>>
>> Nothing public here, and as long as it's kept to facts, this should
>> not be a problem and Apache is a fine place to have such discussions.
>>
>> My 2c.
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: Xuri Nagarin <se...@gmail.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Date: Thursday, September 12, 2013 4:39 PM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>
>> >I understand it can be contentious issue especially given that a lot of
>> >contributors to this list work for one or the other vendor or have some
>> >stake in any kind of evaluation. But, I see no reason why users should
>> >not be able to compare notes
>> > and share experiences. Over time, genuine pain points or issues or
>> >claims will bubble up and should only help the community. Sure, there
>> >will be a few flame wars but this already isn't a very tightly moderated
>> >list.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>> ><ae...@maprtech.com> wrote:
>> >
>> >Raj,
>> >
>> >
>> >As others noted, this is not a great place for this discussion.  I'd
>> >suggest contacting the vendors you are interested in as I'm sure we'd all
>> >be happy to provide you more details.
>> >
>> >
>> >I don't know about the others, but for MapR, just send an email to
>> >sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
>> back
>> >to you with more information.
>> >
>> >
>> >Best Regards,
>> >Aaron Eng
>> >
>> >
>> >
>> >On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
>> wrote:
>> >
>> >
>> >Hi,
>> >
>> >We are trying to evaluate different implementations of Hadoop for our big
>> >data enterprise project.
>> >
>> >Can the forum members advise on what are the advantages and disadvantages
>> >of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>> >
>> >Thanks in advance.
>> >
>> >Regards,
>> >Raj
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>


-- 
http://hortonworks.com/download/


Re: Cloudera Vs Hortonworks Vs MapR

Posted by Suresh Srinivas <su...@hortonworks.com>.
Shahab,

I agree with your arguments. Really well put. The only thing I would add is
that we do not want sales/marketing folks getting involved in these kinds of
threads and polluting them with sales pitches and unsubstantiated claims,
turning the list into a forum for marketing. This can also have community
repercussions, as you have rightly pointed out.

Wearing my own Hadoop PMC hat: we do put out Apache releases regularly, and
Bigtop also provides excellent stack packaging. In this forum my wish is
to see discussions around those rather than vendor-related ones. There are
already many outside forums for this.

Regards,
Suresh


On Fri, Sep 13, 2013 at 10:48 AM, Shahab Yunus <sh...@gmail.com> wrote:

> I think, in my opinion, it is a wrong idea because:
>
> 1- Many of the participants here are employees for these very companies
> that are under discussion. This puts these respective employees in very
> difficult position. It is very hard to come with a correct response.
> Comments can be misconstrued easily.
> 2- Also, when we talk about vendor distributions of the software, it is
> not longer purely about open source. Now companies with the related
> corporate legal baggage also gets in the mix.
> 3- The discussion would be on not only positive things about each vendor
> but in fact negatives. The latter type of  discussion which can get
> unpleasant very easily.
> 4- Somebody mentioned that, this is a very lightly moderated platform and
> thus this discussion should be allowed. I think this is one of the reasons
> that it should not be because, people can say things casually, without much
> thought, or without taking care of the context or the possible
> interpretations and get in trouble.
> 5- The risk here is not only that serious repercussions can occur (which
> very well can) but the greater risk is that it can cause misunderstanding
> between individuals, industries and companies.
> 6-People here lot of time reply quickly just to resolve or help the
> 'technical' issue. Now they will have to take care how they frame the
> response. Re: 4
>
> I know some will feel that I have created a highly exaggerated scenario
> above, but what I am trying to say is that, it is a slippery slope. If we
> allow this then this can go anywhere.
>
> By the way, I do not work for any of these vendors.
>
> More importantly, I am not saying that this discussion should not be had,
> I am just saying that this is a wrong forum.
>
> Just my 2 cents (or,...this was rather a dollar.)
>
> Regards,
> Shahab
>
>
> On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann <ma...@apache.org>wrote:
>
>> Errr, what's wrong with discussing these types of issues on list?
>>
>> Nothing public here, and as long as it's kept to facts, this should
>> not be a problem and Apache is a fine place to have such discussions.
>>
>> My 2c.
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: Xuri Nagarin <se...@gmail.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Date: Thursday, September 12, 2013 4:39 PM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>
>> >I understand it can be contentious issue especially given that a lot of
>> >contributors to this list work for one or the other vendor or have some
>> >stake in any kind of evaluation. But, I see no reason why users should
>> >not be able to compare notes
>> > and share experiences. Over time, genuine pain points or issues or
>> >claims will bubble up and should only help the community. Sure, there
>> >will be a few flame wars but this already isn't a very tightly moderated
>> >list.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>> ><ae...@maprtech.com> wrote:
>> >
>> >Raj,
>> >
>> >
>> >As others noted, this is not a great place for this discussion.  I'd
>> >suggest contacting the vendors you are interested in as I'm sure we'd all
>> >be happy to provide you more details.
>> >
>> >
>> >I don't know about the others, but for MapR, just send an email to
>> >sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
>> back
>> >to you with more information.
>> >
>> >
>> >Best Regards,
>> >Aaron Eng
>> >
>> >
>> >
>> >On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
>> wrote:
>> >
>> >
>> >Hi,
>> >
>> >We are trying to evaluate different implementations of Hadoop for our big
>> >data enterprise project.
>> >
>> >Can the forum members advise on what are the advantages and disadvantages
>> >of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>> >
>> >Thanks in advance.
>> >
>> >Regards,
>> >Raj
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>


-- 
http://hortonworks.com/download/


Re: Cloudera Vs Hortonworks Vs MapR

Posted by Adam Muise <am...@hortonworks.com>.
I would just throw in an additional point on top of Shahab's excellent
summary.

Evaluating a distribution requires more than just the technical aspects of
that distribution. Even if we kept the discussion to the mailing list's
technical Hadoop usage focus, any company/organization looking to use a
distro is going to have to consider the costs, support, platform, partner
ecosystem, market share, company strategy, etc. Even if everyone behaved,
none of the topics I mentioned are appropriate for a Hadoop user mailing
list, so you could not make an informed decision from a user list discussion
and you might as well give the vendors a call.

Thanks,

Adam


On Fri, Sep 13, 2013 at 1:48 PM, Shahab Yunus <sh...@gmail.com> wrote:

> I think, in my opinion, it is a wrong idea because:
>
> 1- Many of the participants here are employees for these very companies
> that are under discussion. This puts these respective employees in very
> difficult position. It is very hard to come with a correct response.
> Comments can be misconstrued easily.
> 2- Also, when we talk about vendor distributions of the software, it is
> not longer purely about open source. Now companies with the related
> corporate legal baggage also gets in the mix.
> 3- The discussion would be on not only positive things about each vendor
> but in fact negatives. The latter type of  discussion which can get
> unpleasant very easily.
> 4- Somebody mentioned that, this is a very lightly moderated platform and
> thus this discussion should be allowed. I think this is one of the reasons
> that it should not be because, people can say things casually, without much
> thought, or without taking care of the context or the possible
> interpretations and get in trouble.
> 5- The risk here is not only that serious repercussions can occur (which
> very well can) but the greater risk is that it can cause misunderstanding
> between individuals, industries and companies.
> 6-People here lot of time reply quickly just to resolve or help the
> 'technical' issue. Now they will have to take care how they frame the
> response. Re: 4
>
> I know some will feel that I have created a highly exaggerated scenario
> above, but what I am trying to say is that, it is a slippery slope. If we
> allow this then this can go anywhere.
>
> By the way, I do not work for any of these vendors.
>
> More importantly, I am not saying that this discussion should not be had,
> I am just saying that this is a wrong forum.
>
> Just my 2 cents (or,...this was rather a dollar.)
>
> Regards,
> Shahab
>
>
> On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann <ma...@apache.org>wrote:
>
>> Errr, what's wrong with discussing these types of issues on list?
>>
>> Nothing public here, and as long as it's kept to facts, this should
>> not be a problem and Apache is a fine place to have such discussions.
>>
>> My 2c.
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: Xuri Nagarin <se...@gmail.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Date: Thursday, September 12, 2013 4:39 PM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>>
>> >I understand it can be contentious issue especially given that a lot of
>> >contributors to this list work for one or the other vendor or have some
>> >stake in any kind of evaluation. But, I see no reason why users should
>> >not be able to compare notes
>> > and share experiences. Over time, genuine pain points or issues or
>> >claims will bubble up and should only help the community. Sure, there
>> >will be a few flame wars but this already isn't a very tightly moderated
>> >list.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>> ><ae...@maprtech.com> wrote:
>> >
>> >Raj,
>> >
>> >
>> >As others noted, this is not a great place for this discussion.  I'd
>> >suggest contacting the vendors you are interested in as I'm sure we'd all
>> >be happy to provide you more details.
>> >
>> >
>> >I don't know about the others, but for MapR, just send an email to
>> >sales@mapr.com <ma...@mapr.com> and I'm sure someone will get
>> back
>> >to you with more information.
>> >
>> >
>> >Best Regards,
>> >Aaron Eng
>> >
>> >
>> >
>> >On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com>
>> wrote:
>> >
>> >
>> >Hi,
>> >
>> >We are trying to evaluate different implementations of Hadoop for our big
>> >data enterprise project.
>> >
>> >Can the forum members advise on what are the advantages and disadvantages
>> >of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>> >
>> >Thanks in advance.
>> >
>> >Regards,
>> >Raj
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>


-- 
*Adam Muise*
Solution Engineer
*Hortonworks*
amuise@hortonworks.com
416-417-4037

Hortonworks - Develops, Distributes and Supports Enterprise Apache
Hadoop.<http://hortonworks.com/>

Hortonworks Virtual Sandbox <http://hortonworks.com/sandbox>

Hadoop: Disruptive Possibilities by Jeff
Needham<http://hortonworks.com/resources/?did=72&cat=1>


Re: Cloudera Vs Hortonworks Vs MapR

Posted by Chris Mattmann <ma...@apache.org>.
Here's the deal: folks can post questions to the list that aren't
abusive, and simply asking what the differences are between different vendor
implementations (downstream) of Apache Hadoop is not an inflammatory
or abusive question.

Stick to the facts. Discuss it here. Why should the Apache Hadoop
PMC push off potentially useful questions that may have upstream
implications to the Apache Hadoop core and let all the innovation
occur downstream?

Have the conversations here if you'd like. I wouldn't turn anyone
away.

My 2c.

Cheers,
Chris

-----Original Message-----

From: Shahab Yunus <sh...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Friday, September 13, 2013 10:48 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Cloudera Vs Hortonworks Vs MapR

>I think, in my opinion, it is a wrong idea because:
>
>
>1- Many of the participants here are employees for these very companies
>that are under discussion. This puts these respective employees in very
>difficult position. It is very hard to come with a correct response.
>Comments can be misconstrued easily.
>2- Also, when we talk about vendor distributions of the software, it is
>not longer purely about open source. Now companies with the related
>corporate legal baggage also gets in the mix.
>3- The discussion would be on not only positive things about each vendor
>but in fact negatives. The latter type of  discussion which can get
>unpleasant very easily.
>
>4- Somebody mentioned that, this is a very lightly moderated platform and
>thus this discussion should be allowed. I think this is one of the
>reasons that it should not be because, people can say things casually,
>without much thought, or without taking
> care of the context or the possible interpretations and get in trouble.
>5- The risk here is not only that serious repercussions can occur (which
>very well can) but the greater risk is that it can cause misunderstanding
>between individuals, industries and companies.
>6-People here lot of time reply quickly just to resolve or help the
>'technical' issue. Now they will have to take care how they frame the
>response. Re: 4
>
>
>I know some will feel that I have created a highly exaggerated scenario
>above, but what I am trying to say is that, it is a slippery slope. If we
>allow this then this can go anywhere.
>
>
>By the way, I do not work for any of these vendors.
>
>
>More importantly, I am not saying that this discussion should not be had,
>I am just saying that this is a wrong forum.
>
>
>Just my 2 cents (or,...this was rather a dollar.)
>
>
>Regards,
>Shahab
>
>
>
>
>On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
><ma...@apache.org> wrote:
>
>Errr, what's wrong with discussing these types of issues on list?
>
>Nothing public here, and as long as it's kept to facts, this should
>not be a problem and Apache is a fine place to have such discussions.
>
>My 2c.
>
>
>
>
>
>-----Original Message-----
>From: Xuri Nagarin <se...@gmail.com>
>Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>Date: Thursday, September 12, 2013 4:39 PM
>To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>Subject: Re: Cloudera Vs Hortonworks Vs MapR
>
>>I understand it can be contentious issue especially given that a lot of
>>contributors to this list work for one or the other vendor or have some
>>stake in any kind of evaluation. But, I see no reason why users should
>>not be able to compare notes
>> and share experiences. Over time, genuine pain points or issues or
>>claims will bubble up and should only help the community. Sure, there
>>will be a few flame wars but this already isn't a very tightly moderated
>>list.
>>
>>
>>
>>
>>
>>
>>
>>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>><ae...@maprtech.com> wrote:
>>
>>Raj,
>>
>>
>>As others noted, this is not a great place for this discussion.  I'd
>>suggest contacting the vendors you are interested in as I'm sure we'd all
>>be happy to provide you more details.
>>
>>
>>I don't know about the others, but for MapR, just send an email to
>>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get back
>>to you with more information.
>>
>>
>>Best Regards,
>>Aaron Eng
>>
>>
>>
>>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:
>>
>>
>>Hi,
>>
>>We are trying to evaluate different implementations of Hadoop for our big
>>data enterprise project.
>>
>>Can the forum members advise on what are the advantages and disadvantages
>>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>>
>>Thanks in advance.
>>
>>Regards,
>>Raj
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
>
>
>
>
>



Re: Cloudera Vs Hortonworks Vs MapR

Posted by Chris Mattmann <ma...@apache.org>.
Here's the deal, folks can post questions to the list that aren't
abusive and simply asking what the difference between different vendor
implementations (downstream) of Apache Hadoop is not an inflammatory
or abusive question.

Stick to the facts. Discuss it here. Why should the Apache Hadoop
PMC push off potentially useful questions that may have upstream
implications to the Apache Hadoop core and let all the innovation
occur downstream?

Have the conversations here if you'd like. I wouldn't turn anyone
away.

My 2c.

Cheers,
Chris

-----Original Message-----

From: Shahab Yunus <sh...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Friday, September 13, 2013 10:48 AM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Cloudera Vs Hortonworks Vs MapR

>In my opinion, it is the wrong idea because:
>
>
>1- Many of the participants here are employees of the very companies
>that are under discussion. This puts these employees in a very
>difficult position. It is very hard to come up with a correct response.
>Comments can be misconstrued easily.
>2- Also, when we talk about vendor distributions of the software, it is
>no longer purely about open source. Now companies with the related
>corporate legal baggage also get in the mix.
>3- The discussion would cover not only the positives of each vendor
>but also the negatives. The latter type of discussion can get
>unpleasant very easily.
>
>4- Somebody mentioned that this is a very lightly moderated platform and
>thus this discussion should be allowed. I think this is one of the
>reasons that it should not be, because people can say things casually,
>without much thought, or without taking care of the context or the
>possible interpretations, and get in trouble.
>5- The risk here is not only that serious repercussions can occur (which
>they very well can); the greater risk is that it can cause misunderstanding
>between individuals, industries and companies.
>6- People here often reply quickly just to resolve or help with the
>'technical' issue. Now they will have to take care how they frame the
>response. Re: 4
>
>
>I know some will feel that I have created a highly exaggerated scenario
>above, but what I am trying to say is that, it is a slippery slope. If we
>allow this then this can go anywhere.
>
>
>By the way, I do not work for any of these vendors.
>
>
>More importantly, I am not saying that this discussion should not be had,
>I am just saying that this is the wrong forum.
>
>
>Just my 2 cents (or,...this was rather a dollar.)
>
>
>Regards,
>Shahab
>
>
>
>
>On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann
><ma...@apache.org> wrote:
>
>Errr, what's wrong with discussing these types of issues on list?
>
>Nothing public here, and as long as it's kept to facts, this should
>not be a problem and Apache is a fine place to have such discussions.
>
>My 2c.
>
>
>
>
>
>-----Original Message-----
>From: Xuri Nagarin <se...@gmail.com>
>Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>Date: Thursday, September 12, 2013 4:39 PM
>To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>Subject: Re: Cloudera Vs Hortonworks Vs MapR
>
>>I understand it can be contentious issue especially given that a lot of
>>contributors to this list work for one or the other vendor or have some
>>stake in any kind of evaluation. But, I see no reason why users should
>>not be able to compare notes
>> and share experiences. Over time, genuine pain points or issues or
>>claims will bubble up and should only help the community. Sure, there
>>will be a few flame wars but this already isn't a very tightly moderated
>>list.
>>
>>
>>
>>
>>
>>
>>
>>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>><ae...@maprtech.com> wrote:
>>
>>Raj,
>>
>>
>>As others noted, this is not a great place for this discussion.  I'd
>>suggest contacting the vendors you are interested in as I'm sure we'd all
>>be happy to provide you more details.
>>
>>
>>I don't know about the others, but for MapR, just send an email to
>>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get back
>>to you with more information.
>>
>>
>>Best Regards,
>>Aaron Eng
>>
>>
>>
>>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:
>>
>>
>>Hi,
>>
>>We are trying to evaluate different implementations of Hadoop for our big
>>data enterprise project.
>>
>>Can the forum members advise on what are the advantages and disadvantages
>>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>>
>>Thanks in advance.
>>
>>Regards,
>>Raj
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
>
>
>
>
>



Re: Cloudera Vs Hortonworks Vs MapR

Posted by Shahab Yunus <sh...@gmail.com>.
In my opinion, it is the wrong idea because:

1- Many of the participants here are employees of the very companies
that are under discussion. This puts these employees in a very
difficult position. It is very hard to come up with a correct response.
Comments can be misconstrued easily.
2- Also, when we talk about vendor distributions of the software, it is no
longer purely about open source. Now companies with the related corporate
legal baggage also get in the mix.
3- The discussion would cover not only the positives of each vendor
but also the negatives. The latter type of discussion can get
unpleasant very easily.
4- Somebody mentioned that this is a very lightly moderated platform and
thus this discussion should be allowed. I think this is one of the reasons
that it should not be, because people can say things casually, without much
thought, or without taking care of the context or the possible
interpretations, and get in trouble.
5- The risk here is not only that serious repercussions can occur (which
they very well can); the greater risk is that it can cause misunderstanding
between individuals, industries and companies.
6- People here often reply quickly just to resolve or help with the
'technical' issue. Now they will have to take care how they frame the
response. Re: 4

I know some will feel that I have created a highly exaggerated scenario
above, but what I am trying to say is that, it is a slippery slope. If we
allow this then this can go anywhere.

By the way, I do not work for any of these vendors.

More importantly, I am not saying that this discussion should not be had, I
am just saying that this is the wrong forum.

Just my 2 cents (or,...this was rather a dollar.)

Regards,
Shahab


On Fri, Sep 13, 2013 at 1:50 AM, Chris Mattmann <ma...@apache.org> wrote:

> Errr, what's wrong with discussing these types of issues on list?
>
> Nothing public here, and as long as it's kept to facts, this should
> not be a problem and Apache is a fine place to have such discussions.
>
> My 2c.
>
>
>
>
>
> -----Original Message-----
> From: Xuri Nagarin <se...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Thursday, September 12, 2013 4:39 PM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>
> >I understand it can be contentious issue especially given that a lot of
> >contributors to this list work for one or the other vendor or have some
> >stake in any kind of evaluation. But, I see no reason why users should
> >not be able to compare notes
> > and share experiences. Over time, genuine pain points or issues or
> >claims will bubble up and should only help the community. Sure, there
> >will be a few flame wars but this already isn't a very tightly moderated
> >list.
> >
> >
> >
> >
> >
> >
> >
> >On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
> ><ae...@maprtech.com> wrote:
> >
> >Raj,
> >
> >
> >As others noted, this is not a great place for this discussion.  I'd
> >suggest contacting the vendors you are interested in as I'm sure we'd all
> >be happy to provide you more details.
> >
> >
> >I don't know about the others, but for MapR, just send an email to
> >sales@mapr.com <ma...@mapr.com> and I'm sure someone will get back
> >to you with more information.
> >
> >
> >Best Regards,
> >Aaron Eng
> >
> >
> >
> >On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:
> >
> >
> >Hi,
> >
> >We are trying to evaluate different implementations of Hadoop for our big
> >data enterprise project.
> >
> >Can the forum members advise on what are the advantages and disadvantages
> >of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
> >
> >Thanks in advance.
> >
> >Regards,
> >Raj
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
>
>

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Chris Embree <ce...@gmail.com>.
The only problem is the degeneration of the discussion.  See the
years-long threads around vi vs. emacs, Windows vs. Linux, Java vs.
C/Python/Perl/Ruby.


On 9/13/13, Chris Mattmann <ma...@apache.org> wrote:
> Errr, what's wrong with discussing these types of issues on list?
>
> Nothing public here, and as long as it's kept to facts, this should
> not be a problem and Apache is a fine place to have such discussions.
>
> My 2c.
>
>
>
>
>
> -----Original Message-----
> From: Xuri Nagarin <se...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Thursday, September 12, 2013 4:39 PM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Re: Cloudera Vs Hortonworks Vs MapR
>
>>I understand it can be contentious issue especially given that a lot of
>>contributors to this list work for one or the other vendor or have some
>>stake in any kind of evaluation. But, I see no reason why users should
>>not be able to compare notes
>> and share experiences. Over time, genuine pain points or issues or
>>claims will bubble up and should only help the community. Sure, there
>>will be a few flame wars but this already isn't a very tightly moderated
>>list.
>>
>>
>>
>>
>>
>>
>>
>>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
>><ae...@maprtech.com> wrote:
>>
>>Raj,
>>
>>
>>As others noted, this is not a great place for this discussion.  I'd
>>suggest contacting the vendors you are interested in as I'm sure we'd all
>>be happy to provide you more details.
>>
>>
>>I don't know about the others, but for MapR, just send an email to
>>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get back
>>to you with more information.
>>
>>
>>Best Regards,
>>Aaron Eng
>>
>>
>>
>>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:
>>
>>
>>Hi,
>>
>>We are trying to evaluate different implementations of Hadoop for our big
>>data enterprise project.
>>
>>Can the forum members advise on what are the advantages and disadvantages
>>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>>
>>Thanks in advance.
>>
>>Regards,
>>Raj
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
>

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Chris Mattmann <ma...@apache.org>.
Errr, what's wrong with discussing these types of issues on list?

Nothing public here, and as long as it's kept to facts, this should
not be a problem and Apache is a fine place to have such discussions.

My 2c.





-----Original Message-----
From: Xuri Nagarin <se...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Thursday, September 12, 2013 4:39 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Cloudera Vs Hortonworks Vs MapR

>I understand it can be a contentious issue especially given that a lot of
>contributors to this list work for one or the other vendor or have some
>stake in any kind of evaluation. But, I see no reason why users should
>not be able to compare notes and share experiences. Over time, genuine
>pain points or issues or claims will bubble up and should only help the
>community. Sure, there will be a few flame wars but this already isn't a
>very tightly moderated list.
>
>On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng
><ae...@maprtech.com> wrote:
>
>Raj,
>
>
>As others noted, this is not a great place for this discussion.  I'd
>suggest contacting the vendors you are interested in as I'm sure we'd all
>be happy to provide you more details.
>
>
>I don't know about the others, but for MapR, just send an email to
>sales@mapr.com <ma...@mapr.com> and I'm sure someone will get back
>to you with more information.
>
>
>Best Regards,
>Aaron Eng
>
>
>
>On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:
>
>
>Hi,
>
>We are trying to evaluate different implementations of Hadoop for our big
>data enterprise project.
>
>Can the forum members advise on what are the advantages and disadvantages
>of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>
>Thanks in advance.
>
>Regards,
>Raj



Re: Cloudera Vs Hortonworks Vs MapR

Posted by Xuri Nagarin <se...@gmail.com>.
I understand it can be a contentious issue especially given that a lot of
contributors to this list work for one or the other vendor or have some
stake in any kind of evaluation. But, I see no reason why users should not
be able to compare notes and share experiences. Over time, genuine pain
points or issues or claims will bubble up and should only help the
community. Sure, there will be a few flame wars but this already isn't a
very tightly moderated list.




On Thu, Sep 12, 2013 at 11:14 AM, Aaron Eng <ae...@maprtech.com> wrote:

> Raj,
>
> As others noted, this is not a great place for this discussion.  I'd
> suggest contacting the vendors you are interested in as I'm sure we'd all
> be happy to provide you more details.
>
> I don't know about the others, but for MapR, just send an email to
> sales@mapr.com and I'm sure someone will get back to you with more
> information.
>
> Best Regards,
> Aaron Eng
>
>
> On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:
>
>> Hi,
>>
>> We are trying to evaluate different implementations of Hadoop for our big
>> data enterprise project.
>>
>> Can the forum members advise on what are the advantages and disadvantages
>> of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>>
>> Thanks in advance.
>>
>> Regards,
>> Raj
>
>
>

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Aaron Eng <ae...@maprtech.com>.
Raj,

As others noted, this is not a great place for this discussion.  I'd
suggest contacting the vendors you are interested in as I'm sure we'd all
be happy to provide you more details.

I don't know about the others, but for MapR, just send an email to
sales@mapr.com and I'm sure someone will get back to you with more
information.

Best Regards,
Aaron Eng


On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:

> Hi,
>
> We are trying to evaluate different implementations of Hadoop for our big
> data enterprise project.
>
> Can the forum members advise on what are the advantages and disadvantages
> of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>
> Thanks in advance.
>
> Regards,
> Raj

Re: Cloudera Vs Hortonworks Vs MapR

Posted by Suresh Srinivas <su...@hortonworks.com>.
Raj,

You can also use Apache Hadoop releases. Bigtop does a fine job as well of
putting together a consumable Hadoop stack.

As regards vendor solutions, this is not the right forum. There are
other forums for this. Please refrain from this type of discussion on
the Apache forum.

Regards,
Suresh


On Thu, Sep 12, 2013 at 10:19 AM, Hadoop Raj <ha...@yahoo.com> wrote:

> Hi,
>
> We are trying to evaluate different implementations of Hadoop for our big
> data enterprise project.
>
> Can the forum members advise on what are the advantages and disadvantages
> of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
>
> Thanks in advance.
>
> Regards,
> Raj




-- 
http://hortonworks.com/download/


Re: Cloudera Vs Hortonworks Vs MapR

Posted by Marco Shaw <ma...@gmail.com>.
Hi

I don't think this is the appropriate place to discuss this.

This list should be a vendor-neutral service.

You are encouraged to do your own research or look through the popular search engines for others who may have already done such an analysis based on their special requirements.

Marco

On Thu, Sep 12, 2013 at 2:22 PM, Hadoop Raj <ha...@yahoo.com> wrote:

> Hi,
> We are trying to evaluate different implementations of Hadoop for our big data enterprise project.
> Can the forum members advise on what are the advantages and disadvantages of each implementation i.e. Cloudera Vs Hortonworks Vs MapR.
> Thanks in advance.
> Regards,
> Raj
