Posted to common-dev@hadoop.apache.org by Smith Stan <sm...@gmail.com> on 2009/10/03 01:02:56 UTC

Hey Cloudera can you help us In beating Google Yahoo Facebook?

Hey Cloudera genius guys .

I read this

Via Cloudera, Hadoop is currently used by most of the giants in the
space including Google, Yahoo, Facebook (we wrote about Facebook’s use
of Cloudera here), Amazon, AOL, Baidu and more.

On.
http://www.techcrunch.com/2009/10/01/hadoop-clusters-get-a-monitoring-client-with-cloudera-desktop/

if this is true can you guys help us beat Y G and F.

Is it true that Google uses hadoop?
Is it true that the above-mentioned giants use Hadoop via Cloudera?

Thanks,
Stan S

Re: Hey Cloudera can you help us In beating Google Yahoo Facebook?

Posted by Stefan Groschupf <sg...@101tec.com>.
Hi Ted,

I'm sure Stan meant that in a satirical way.
The TechCrunch article gives the impression that Hadoop was developed by the
Cloudera boys and that all the big companies, including Y!, use their
distribution.

:-)

Stefan



On Oct 2, 2009, at 4:36 PM, Ted Dunning wrote:

> On Fri, Oct 2, 2009 at 4:02 PM, Smith Stan <sm...@gmail.com> wrote:
>
>> if this is true can you guys help us beat Y G and F.
>
> What do you mean beat Yahoo, Google and Facebook?
>
>> Is it true that Google uses hadoop?
>
> Yes.  Mostly for educational purposes, not internal production.
>
>> Is it true that the above-mentioned giants use Hadoop via Cloudera?
>
> Yahoo sponsored most of the writing of Hadoop and does not use
> Cloudera's distribution.
>
> Facebook sponsored the writing of Hive and probably still runs their
> own version of Hadoop.
>
> Why do you care if they use Cloudera's distribution?
>
> -- 
> Ted Dunning, CTO
> DeepDyve

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Hadoop training and consulting
http://www.scaleunlimited.com
http://www.101tec.com




Re: Hey Cloudera can you help us In beating Google Yahoo Facebook?

Posted by Ted Dunning <te...@gmail.com>.
On Fri, Oct 2, 2009 at 4:02 PM, Smith Stan <sm...@gmail.com> wrote:

> if this is true can you guys help us beat Y G and F.
>

What do you mean beat Yahoo, Google and Facebook?

> Is it true that Google uses hadoop?
>

Yes.  Mostly for educational purposes, not internal production.


> Is it true that the above-mentioned giants use Hadoop via Cloudera?
>

Yahoo sponsored most of the writing of Hadoop and does not use Cloudera's
distribution.

Facebook sponsored the writing of Hive and probably still runs their own
version of Hadoop.

Why do you care if they use Cloudera's distribution?

-- 
Ted Dunning, CTO
DeepDyve

Re: Hey Cloudera can you help us In beating Google Yahoo Facebook?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Stan,

First, this is not the Cloudera mailing list and this is not a dev question.

Also, AFAIK, Google uses Hadoop only to interface with people outside
the company, since MapReduce works the same way.
I think this article is wrong in saying that Google, Yahoo! and
Facebook are using Hadoop via Cloudera, and I'm 99% sure of that. They
all have enough expertise not to depend on a support contract,
and Y! even has its own distro of Hadoop (though not supported the way
Cloudera's is). Maybe Leena Rao thought that Cloudera were the only
ones developing Hadoop and took the biggest names from the PoweredBy
page.

J-D

On Fri, Oct 2, 2009 at 7:02 PM, Smith Stan <sm...@gmail.com> wrote:
> Hey Cloudera genius guys .
>
> I read this
>
> Via Cloudera, Hadoop is currently used by most of the giants in the
> space including Google, Yahoo, Facebook (we wrote about Facebook’s use
> of Cloudera here), Amazon, AOL, Baidu and more.
>
> On.
> http://www.techcrunch.com/2009/10/01/hadoop-clusters-get-a-monitoring-client-with-cloudera-desktop/
>
> if this is true can you guys help us beat Y G and F.
>
> Is it true that Google uses hadoop?
> Is it true that the above-mentioned giants use Hadoop via Cloudera?
>
> Thanks,
> Stan S
>

Re: Hey Cloudera can you help us In beating Google Yahoo Facebook?

Posted by Amr Awadallah <aa...@cloudera.com>.
  As other folks said, if you need to communicate with Cloudera then 
please use info ait cloudera d0t com, this is not the right forum for that.

  That said, some blog reporters make mistakes like this all the time, 
despite all of our efforts to properly educate them about the space. 
We'll reach out and ask that they post a correction to the last 
paragraph but I can't promise that it will happen. I want to make it 
clear on this public forum that Cloudera's intention is certainly *not* 
to belittle the contribution of Yahoo to Apache Hadoop (or Facebook for 
that matter), we all know that without their backing Hadoop probably 
wouldn't be as successful as it is today. Finally, all this press is 
good for Hadoop, and it will, hopefully, lead to more companies using it 
which will only serve to strengthen the platform and grow it even more.

-- amr

Steve Loughran wrote:
> Smith Stan wrote:
>> Hey Cloudera genius guys .
>
> Sorry, not cloudera. I speak for myself.
>
>> I read this
>>
>> Via Cloudera, Hadoop is currently used by most of the giants in the
>> space including Google, Yahoo, Facebook (we wrote about Facebook’s use
>> of Cloudera here), Amazon, AOL, Baidu and more.
>
> I would be doubtful that any on that list use the Cloudera distro,
> because once you manage a cluster to the extent that you create your own
> RPMs for PXE-preboot and kickstart install, you know what you are
> doing and will be worrying more about the power budget of your
> datacentre -as measured in megawatts-, and whether your off-site
> replication plan is copying data to other facilities on different
> earthquake fault lines, rather than how hadoop-site.xml works.
>
>>
>> On.
>> http://www.techcrunch.com/2009/10/01/hadoop-clusters-get-a-monitoring-client-with-cloudera-desktop/ 
>>
>>
>> if this is true can you guys help us beat Y G and F.
>
> This is not much different from saying these companies all use TCP/IP,
> HTTP, MySQL and Linux, therefore a Linux server running Apache and
> mysqld will help you to beat them.
>
> Hadoop is a tool for very large datasets; it works best if you can group
> and scan them independently.
>
> * If you do not know what you are doing, it will not help
> * if you do not have a sufficiently large dataset, it is not worth the 
> effort
> * if you haven't outgrown an RDBMS, stick with the database
> * Cloudera are offering to help with running/using hadoop, but they 
> aren't going to code your datamining algorithms for you.
>
> see also: 
> http://teddziuba.com/2008/04/im-going-to-scale-my-foot-up-y.html
>
> -Steve

Re: Hey Cloudera can you help us In beating Google Yahoo Facebook?

Posted by Steve Loughran <st...@apache.org>.
Smith Stan wrote:
> Hey Cloudera genius guys .

Sorry, not cloudera. I speak for myself.

> I read this
> 
> Via Cloudera, Hadoop is currently used by most of the giants in the
> space including Google, Yahoo, Facebook (we wrote about Facebook’s use
> of Cloudera here), Amazon, AOL, Baidu and more.

I would be doubtful that any on that list use the Cloudera distro,
because once you manage a cluster to the extent that you create your own RPMs
for PXE-preboot and kickstart install, you know what you are doing
and will be worrying more about the power budget of your datacentre -as
measured in megawatts-, and whether your off-site replication plan is
copying data to other facilities on different earthquake fault lines,
rather than how hadoop-site.xml works.

> 
> On.
> http://www.techcrunch.com/2009/10/01/hadoop-clusters-get-a-monitoring-client-with-cloudera-desktop/
> 
> if this is true can you guys help us beat Y G and F.

This is not much different from saying these companies all use TCP/IP,
HTTP, MySQL and Linux, therefore a Linux server running Apache and
mysqld will help you to beat them.

Hadoop is a tool for very large datasets; it works best if you can group
and scan them independently.

* If you do not know what you are doing, it will not help
* if you do not have a sufficiently large dataset, it is not worth the 
effort
* if you haven't outgrown an RDBMS, stick with the database
* Cloudera are offering to help with running/using hadoop, but they 
aren't going to code your datamining algorithms for you.

see also: http://teddziuba.com/2008/04/im-going-to-scale-my-foot-up-y.html

-Steve