Posted to user@hadoop.apache.org by mani kandan <ma...@gmail.com> on 2014/08/12 22:12:24 UTC

Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Which distribution are you people using? Cloudera vs Hortonworks vs
BigInsights?

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>.
Is this up to date?

http://www.mapr.com/products/product-overview/overview


Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData

From: Aaron Eng 
Sent: Tuesday, August 12, 2014 4:31 PM
To: user@hadoop.apache.org 
Subject: Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

On that note, 2 is also misleading/incomplete. You might want to explain which specific features you are referencing so the original poster can figure out if those features are relevant. The inverse of 2 is also true: things like consistent snapshots and full random read/write over NFS are in MapR and not in HDFS.




On Tue, Aug 12, 2014 at 2:10 PM, Kai Voigt <k...@123.org> wrote:

  3. seems a biased and incomplete statement. 

  Cloudera’s distribution CDH is fully open source. The proprietary "stuff" you refer to is most likely Cloudera Manager, an additional tool to make deployment, configuration and monitoring easy.

  Nobody is required to use it to run a Hadoop cluster.

  Kai (a Cloudera Employee)

  On 12.08.2014 at 21:56, Adaryl Bob Wakefield, MBA <ad...@hotmail.com> wrote:


    Hortonworks. Here is my reasoning:
    1. Hortonworks is 100% open source.
    2. MapR has stuff on their roadmap that Hortonworks has already accomplished and has moved on to other things.
    3. Cloudera has proprietary stuff in their stack. No.
    4. Hortonworks makes training super accessible and there is a community around it.
    5. Who the heck is BigInsights? (Which should tell you something.)

    Adaryl "Bob" Wakefield, MBA
    Principal
    Mass Street Analytics
    913.938.6685
    www.linkedin.com/in/bobwakefieldmba
    Twitter: @BobLovesData

    From: mani kandan 
    Sent: Tuesday, August 12, 2014 3:12 PM
    To: user@hadoop.apache.org 
    Subject: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

    Which distribution are you people using? Cloudera vs Hortonworks vs BigInsights?



------------------------------------------------------------------------------
  Kai Voigt Am Germaniahafen 1 k@123.org
  24143 Kiel +49 160 96683050
  Germany @KaiVoigt


Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by Jay Vyas <ja...@gmail.com>.
Also, consider Apache Bigtop. It is the Apache upstream Hadoop initiative, and it comes with smoke tests and Puppet recipes for setting up your own Hadoop distro from scratch.

IMHO, if you are learning or building your own tooling around Hadoop, Bigtop is ideal. If you are interested in purchasing support, then the vendor distros are a good gateway.

> On Aug 12, 2014, at 5:31 PM, Aaron Eng <ae...@maprtech.com> wrote:
> 
> On that note, 2 is also misleading/incomplete. You might want to explain which specific features you are referencing so the original poster can figure out if those features are relevant. The inverse of 2 is also true: things like consistent snapshots and full random read/write over NFS are in MapR and not in HDFS.
> 
> 
>> On Tue, Aug 12, 2014 at 2:10 PM, Kai Voigt <k...@123.org> wrote:
>> 3. seems a biased and incomplete statement.
>> 
>> Cloudera’s distribution CDH is fully open source. The proprietary "stuff" you refer to is most likely Cloudera Manager, an additional tool to make deployment, configuration and monitoring easy.
>> 
>> Nobody is required to use it to run a Hadoop cluster.
>> 
>> Kai (a Cloudera Employee)
>> 
>>> On 12.08.2014 at 21:56, Adaryl Bob Wakefield, MBA <ad...@hotmail.com> wrote:
>>> 
>>> Hortonworks. Here is my reasoning:
>>> 1. Hortonworks is 100% open source.
>>> 2. MapR has stuff on their roadmap that Hortonworks has already accomplished and has moved on to other things.
>>> 3. Cloudera has proprietary stuff in their stack. No.
>>> 4. Hortonworks makes training super accessible and there is a community around it.
>>> 5. Who the heck is BigInsights? (Which should tell you something.)
>>>  
>>> Adaryl "Bob" Wakefield, MBA
>>> Principal
>>> Mass Street Analytics
>>> 913.938.6685
>>> www.linkedin.com/in/bobwakefieldmba
>>> Twitter: @BobLovesData
>>>  
>>> From: mani kandan
>>> Sent: Tuesday, August 12, 2014 3:12 PM
>>> To: user@hadoop.apache.org
>>> Subject: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?
>>>  
>>> Which distribution are you people using? Cloudera vs Hortonworks vs BigInsights?
>>> 
>> 
>> Kai Voigt			Am Germaniahafen 1			k@123.org
>> 					24143 Kiel					+49 160 96683050
>> 					Germany						@KaiVoigt
> 

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by Aaron Eng <ae...@maprtech.com>.
On that note, 2 is also misleading/incomplete. You might want to explain
which specific features you are referencing so the original poster can
figure out if those features are relevant. The inverse of 2 is also true:
things like consistent snapshots and full random read/write over NFS are in
MapR and not in HDFS.


On Tue, Aug 12, 2014 at 2:10 PM, Kai Voigt <k...@123.org> wrote:

> 3. seems a biased and incomplete statement.
>
> Cloudera’s distribution CDH is fully open source. The proprietary "stuff"
> you refer to is most likely Cloudera Manager, an additional tool to make
> deployment, configuration and monitoring easy.
>
> Nobody is required to use it to run a Hadoop cluster.
>
> Kai (a Cloudera Employee)
>
> On 12.08.2014 at 21:56, Adaryl Bob Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>   Hortonworks. Here is my reasoning:
> 1. Hortonworks is 100% open source.
> 2. MapR has stuff on their roadmap that Hortonworks has already
> accomplished and has moved on to other things.
> 3. Cloudera has proprietary stuff in their stack. No.
> 4. Hortonworks makes training super accessible and there is a community
> around it.
> 5. Who the heck is BigInsights? (Which should tell you something.)
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>
>  *From:* mani kandan <ma...@gmail.com>
> *Sent:* Tuesday, August 12, 2014 3:12 PM
> *To:* user@hadoop.apache.org
> *Subject:* Started learning Hadoop. Which distribution is best for native
> install in pseudo distributed mode?
>
>
> Which distribution are you people using? Cloudera vs Hortonworks vs
> BigInsights?
>
>
> ------------------------------
> *Kai Voigt* Am Germaniahafen 1 k@123.org
> 24143 Kiel +49 160 96683050
> Germany @KaiVoigt
>
>

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>.
You fell into my trap, sir. I was hoping someone would clear that up. :)

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData

From: Kai Voigt 
Sent: Tuesday, August 12, 2014 4:10 PM
To: user@hadoop.apache.org 
Subject: Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

3. seems a biased and incomplete statement. 

Cloudera’s distribution CDH is fully open source. The proprietary "stuff" you refer to is most likely Cloudera Manager, an additional tool to make deployment, configuration and monitoring easy.

Nobody is required to use it to run a Hadoop cluster.

Kai (a Cloudera Employee)

On 12.08.2014 at 21:56, Adaryl Bob Wakefield, MBA <ad...@hotmail.com> wrote:


  Hortonworks. Here is my reasoning:
  1. Hortonworks is 100% open source.
  2. MapR has stuff on their roadmap that Hortonworks has already accomplished and has moved on to other things.
  3. Cloudera has proprietary stuff in their stack. No.
  4. Hortonworks makes training super accessible and there is a community around it.
  5. Who the heck is BigInsights? (Which should tell you something.)

  Adaryl "Bob" Wakefield, MBA
  Principal
  Mass Street Analytics
  913.938.6685
  www.linkedin.com/in/bobwakefieldmba
  Twitter: @BobLovesData

  From: mani kandan 
  Sent: Tuesday, August 12, 2014 3:12 PM
  To: user@hadoop.apache.org 
  Subject: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

  Which distribution are you people using? Cloudera vs Hortonworks vs BigInsights?



--------------------------------------------------------------------------------
Kai Voigt Am Germaniahafen 1 k@123.org
24143 Kiel +49 160 96683050
Germany @KaiVoigt

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by Aaron Eng <ae...@maprtech.com>.
On that note, 2 is also misleading/incomplete.  You might want to explain
which specific features you are referencing so the original poster can
figure out if those features are relevant.  The inverse of 2 is also true,
things like consistent snapshots and full random read/write over NFS are in
MapR and not in HDFS.


On Tue, Aug 12, 2014 at 2:10 PM, Kai Voigt <k...@123.org> wrote:

> 3. seems a biased and incomplete statement.
>
> Cloudera’s distribution CDH is fully open source. The proprietary „stuff"
> you refer to is most likely Cloudera Manager, an additional tool to make
> deployment, configuration and monitoring easy.
>
> Nobody is required to use it to run a Hadoop cluster.
>
> Kai (a Cloudera Employee)
>
> On 12.08.2014 at 21:56, Adaryl Bob Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>   Hortonworks. Here is my reasoning:
> 1. Hortonworks is 100% open source.
> 2. MapR has stuff on their roadmap that Hortonworks has already
> accomplished and has moved on to other things.
> 3. Cloudera has proprietary stuff in their stack. No.
> 4. Hortonworks makes training super accessible and there is a community
> around it.
> 5. Who the heck is BigInsights? (Which should tell you something.)
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>
>  *From:* mani kandan <ma...@gmail.com>
> *Sent:* Tuesday, August 12, 2014 3:12 PM
> *To:* user@hadoop.apache.org
> *Subject:* Started learning Hadoop. Which distribution is best for native
> install in pseudo distributed mode?
>
>
> Which distribution are you people using? Cloudera vs Hortonworks vs
> Biginsights?
>
>
> ------------------------------
> *Kai Voigt* Am Germaniahafen 1 k@123.org
> 24143 Kiel +49 160 96683050
> Germany @KaiVoigt
>
>

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by Kai Voigt <k...@123.org>.
3. seems a biased and incomplete statement.

Cloudera’s distribution CDH is fully open source. The proprietary "stuff" you refer to is most likely Cloudera Manager, an additional tool to make deployment, configuration and monitoring easy.

Nobody is required to use it to run a Hadoop cluster.

Kai (a Cloudera Employee)

On 12.08.2014 at 21:56, Adaryl Bob Wakefield, MBA <ad...@hotmail.com> wrote:

> Hortonworks. Here is my reasoning:
> 1. Hortonworks is 100% open source.
> 2. MapR has stuff on their roadmap that Hortonworks has already accomplished and has moved on to other things.
> 3. Cloudera has proprietary stuff in their stack. No.
> 4. Hortonworks makes training super accessible and there is a community around it.
> 5. Who the heck is BigInsights? (Which should tell you something.)
>  
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
>  
> From: mani kandan
> Sent: Tuesday, August 12, 2014 3:12 PM
> To: user@hadoop.apache.org
> Subject: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?
>  
> Which distribution are you people using? Cloudera vs Hortonworks vs Biginsights?
> 

Kai Voigt			Am Germaniahafen 1			k@123.org
					24143 Kiel					+49 160 96683050
					Germany						@KaiVoigt


Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>.
Hortonworks. Here is my reasoning:
1. Hortonworks is 100% open source.
2. MapR has stuff on their roadmap that Hortonworks has already accomplished and has moved on to other things.
3. Cloudera has proprietary stuff in their stack. No.
4. Hortonworks makes training super accessible and there is a community around it.
5. Who the heck is BigInsights? (Which should tell you something.)

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData

From: mani kandan 
Sent: Tuesday, August 12, 2014 3:12 PM
To: user@hadoop.apache.org 
Subject: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Which distribution are you people using? Cloudera vs Hortonworks vs Biginsights? 

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by Mohan Radhakrishnan <ra...@gmail.com>.
Actually, there was another thread about using MR for ML, but I didn't see
many responses. I use Octave or R for this, but it would be useful to know
how this is solved using Hadoop. The closest community with an interest
in this could be H2O, but as I understand it, they have implemented MR for
their engine to solve these problems. So we may be able to look at
their code, but that could be tedious.

Mohan


On Thu, Aug 14, 2014 at 3:35 PM, Kai Wähner <me...@gmail.com> wrote:

> For a beginner, it depends on what you want to learn. Do you want to
> program MapReduce, just run some SQL queries against Hadoop, or install,
> deploy and monitor a Hadoop cluster?
>
> This article might help you make a good decision:
> "spoilt for choice - how to choose the right Hadoop distribution"
> http://www.infoq.com/articles/BigDataPlatform
>
> Kai
>
> Sent from my iPhone
>
> > On 14.08.2014, at 11:58, Chris MacKenzie <
> studio@chrismackenziephotography.co.uk> wrote:
> >
> > Hi,
> >
> > I have been using Hadoop since Christmas loosely and from May for a
> > Software engineering MSc at Heriot Watt University in Edinburgh, Scotland.
> > I have written a genetic sequence alignment algorithm.
> >
> > I have installed Hadoop in various places including a 32 node cluster and
> > am using eclipse kepler sr 2 as an IDE.
> >
> > My current Hadoop version is 2.4.1 which I download as a tar from the
> > apache mirror servers.
> >
> > It's been a tough learning curve, but that has made the learning all the
> > more valuable.
> >
> > I believe using the straight Hadoop version has given insights that
> > proprietary builds wouldn't have. There are so many confusing issues that
> > crop up, it's easy to attach importance to trying to fix an error
> > which masks another. With the proprietary versions it would be easy to
> > attach blame where it's not that build or this build's fault.
> >
> > Go with your heart but be prepared to work to solve the problems you
> > encounter.
> >
> > Buy Tom White's book, it isn't perfect and a couple of years out of date
> > but it gives you enough detail and structure to build an impression you
> > can work from. The downloadable source code is a great help when trying to
> > get started.
> >
> > Good luck.
> >
> >
> > Regards,
> >
> > Chris MacKenzie
> > telephone: 0131 332 6967
> > email: studio@chrismackenziephotography.co.uk
> > corporate: www.chrismackenziephotography.co.uk
> > <http://www.chrismackenziephotography.co.uk/>
> > <http://plus.google.com/+ChrismackenziephotographyCoUk/posts>
> > <http://www.linkedin.com/in/chrismackenziephotography/>
> >
> >
> >
> >
> >
> >
> > From:  "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>
> > Reply-To:  <us...@hadoop.apache.org>
> > Date:  Thursday, 14 August 2014 01:13
> > To:  <us...@hadoop.apache.org>
> > Subject:  Re: Started learning Hadoop. Which distribution is best for
> > native install in pseudo distributed mode?
> >
> >
> > He didn't ask for the best and nobody framed up their answer like that. He
> > asked what people were using. Out of the 10 responses only four of them
> > actually answered his question.
> >
> > I've been studying Hadoop for two months straight. Quite frankly, I wish
> > more people would ask for community input and what does what and how.
> >
> > Adaryl "Bob" Wakefield, MBA
> > Principal
> > Mass Street Analytics
> > 913.938.6685
> > www.linkedin.com/in/bobwakefieldmba
> > Twitter: @BobLovesData
> >
> > From: Kilaru, Sambaiah <ma...@intuit.com>
> > Sent: Wednesday, August 13, 2014 1:10 PM
> > To: user@hadoop.apache.org
> > Subject: Re: Started learning Hadoop. Which distribution is best for
> > native install in pseudo distributed mode?
> >
> >
> >
> >
> > Enough wars are going on over which is best. Choose one of them and try to
> > learn it; it is not the case that x is better or y is better.
> > It is up to you.
> >
> > Thanks,
> > Sam
> >
> > From: Sebastiano Di Paola <se...@gmail.com>
> > Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> > Date: Wednesday, August 13, 2014 at 6:28 PM
> > To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> > Subject: Re: Started learning Hadoop. Which distribution is best for
> > native install in pseudo distributed mode?
> >
> >
> > Hi,
> > I'm a newbie too and I'm not using any particular distribution. I just
> > download the components I need / want to try for my deployment and use
> > them.
> >
> > It's a slow process but allows me to better understand what I'm
> > doing under the hood.
> >
> > Regards,
> > Seba
> >
> >
> >
> > On Tue, Aug 12, 2014 at 10:12 PM, mani kandan <ma...@gmail.com>
> wrote:
> >
> >  Which distribution are you people using? Cloudera vs Hortonworks vs
> >  Biginsights?
> >
> >
> >
> >
> >
> >
>

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by Kai Wähner <me...@gmail.com>.
For a beginner, it depends on what you want to learn. Do you want to program MapReduce, just run some SQL queries against Hadoop, or install, deploy and monitor a Hadoop cluster?

This article might help you make a good decision:
"spoilt for choice - how to choose the right Hadoop distribution"
http://www.infoq.com/articles/BigDataPlatform 

Kai

Sent from my iPhone

> On 14.08.2014, at 11:58, Chris MacKenzie <st...@chrismackenziephotography.co.uk> wrote:
> 
> Hi,
> 
> I have been using Hadoop since Christmas loosely and from May for a
> Software engineering MSc at Heriot Watt University in Edinburgh, Scotland.
> I have written a genetic sequence alignment algorithm.
> 
> I have installed Hadoop in various places including a 32 node cluster and
> am using eclipse kepler sr 2 as an IDE.
> 
> My current Hadoop version is 2.4.1 which I download as a tar from the
> apache mirror servers.
> 
> It's been a tough learning curve, but that has made the learning all the
> more valuable.
> 
> I believe using the straight Hadoop version has given insights that
> proprietary builds wouldn't have. There are so many confusing issues that
> crop up, it's easy to attach importance to trying to fix an error
> which masks another. With the proprietary versions it would be easy to
> attach blame where it's not that build or this build's fault.
> 
> Go with your heart but be prepared to work to solve the problems you
> encounter.
> 
> Buy Tom White's book, it isn't perfect and a couple of years out of date
> but it gives you enough detail and structure to build an impression you
> can work from. The downloadable source code is a great help when trying to
> get started.
> 
> Good luck.
> 
> 
> Regards,
> 
> Chris MacKenzie
> telephone: 0131 332 6967
> email: studio@chrismackenziephotography.co.uk
> corporate: www.chrismackenziephotography.co.uk
> <http://www.chrismackenziephotography.co.uk/>
> <http://plus.google.com/+ChrismackenziephotographyCoUk/posts>
> <http://www.linkedin.com/in/chrismackenziephotography/>
> 
> 
> 
> 
> 
> 
> From:  "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>
> Reply-To:  <us...@hadoop.apache.org>
> Date:  Thursday, 14 August 2014 01:13
> To:  <us...@hadoop.apache.org>
> Subject:  Re: Started learning Hadoop. Which distribution is best for
> native install in pseudo distributed mode?
> 
> 
> He didn't ask for the best and nobody framed up their answer like that. He
> asked what people were using. Out of the 10 responses only four of them
> actually answered his question.
> 
> I've been studying Hadoop for two months straight. Quite frankly, I wish
> more people would ask for community input and what does what and how.
> 
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics
> 913.938.6685
> www.linkedin.com/in/bobwakefieldmba
> Twitter: @BobLovesData
> 
> From: Kilaru, Sambaiah <ma...@intuit.com>
> Sent: Wednesday, August 13, 2014 1:10 PM
> To: user@hadoop.apache.org
> Subject: Re: Started learning Hadoop. Which distribution is best for
> native install in pseudo distributed mode?
> 
> 
> 
> 
> Enough wars are going on over which is best. Choose one and try to
> learn it; there is no sense in which x is better or y is better.
> It is up to you.
> 
> Thanks,
> Sam
> 
> From: Sebastiano Di Paola <se...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Wednesday, August 13, 2014 at 6:28 PM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Re: Started learning Hadoop. Which distribution is best for
> native install in pseudo distributed mode?
> 
> 
> Hi,
> I'm a newbie too and I'm not using any particular distribution. I just
> download the components I need / want to try for my deployment and use
> them.
> 
> It's a slow process but allows me to better understand what I'm
> doing under the hood.
> 
> Regards,
> Seba
> 
> 
> 
> On Tue, Aug 12, 2014 at 10:12 PM, mani kandan <ma...@gmail.com> wrote:
> 
>  Which distribution are you people using? Cloudera vs Hortonworks vs
>  Biginsights? 
> 
> 
> 
> 
> 
> 

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by Chris MacKenzie <st...@chrismackenziephotography.co.uk>.
Hi,

I have been using Hadoop loosely since Christmas, and since May for a
Software Engineering MSc at Heriot-Watt University in Edinburgh, Scotland.
I have written a genetic sequence alignment algorithm.

I have installed Hadoop in various places, including a 32-node cluster, and
am using Eclipse Kepler SR2 as an IDE.

My current Hadoop version is 2.4.1, which I downloaded as a tarball from the
Apache mirror servers.

It's been a tough learning curve, but that has made the learning all the
more valuable.

I believe using the straight Hadoop version has given insights that
proprietary builds wouldn't have. There are so many confusing issues that
crop up that it's easy to attach importance to trying to fix an error
which masks another. With the proprietary versions it would be easy to
attach blame where it's not that build's or this build's fault.

Go with your heart but be prepared to work to solve the problems you
encounter.

Buy Tom White's book. It isn't perfect and is a couple of years out of date,
but it gives you enough detail and structure to build an impression you
can work from. The downloadable source code is a great help when trying to
get started.

Good luck.


Regards,

Chris MacKenzie
telephone: 0131 332 6967
email: studio@chrismackenziephotography.co.uk
corporate: www.chrismackenziephotography.co.uk
<http://www.chrismackenziephotography.co.uk/>
<http://plus.google.com/+ChrismackenziephotographyCoUk/posts>
<http://www.linkedin.com/in/chrismackenziephotography/>






From:  "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>
Reply-To:  <us...@hadoop.apache.org>
Date:  Thursday, 14 August 2014 01:13
To:  <us...@hadoop.apache.org>
Subject:  Re: Started learning Hadoop. Which distribution is best for
native install in pseudo distributed mode?


He didn't ask for the best and nobody framed up their answer like that. He
asked what people were using. Out of the 10 responses only four of them
actually answered his question.

I've been studying Hadoop for two months straight. Quite frankly, I wish
more people would ask for community input and what does what and how.

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Kilaru, Sambaiah <ma...@intuit.com>
Sent: Wednesday, August 13, 2014 1:10 PM
To: user@hadoop.apache.org
Subject: Re: Started learning Hadoop. Which distribution is best for
native install in pseudo distributed mode?


 

Enough wars are going on over which is best. Choose one and try to
learn it; there is no sense in which x is better or y is better.
It is up to you.
 
Thanks,
Sam
 
From: Sebastiano Di Paola <se...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Wednesday, August 13, 2014 at 6:28 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Started learning Hadoop. Which distribution is best for
native install in pseudo distributed mode?

 
Hi,
I'm a newbie too and I'm not using any particular distribution. I just
download the components I need / want to try for my deployment and use
them.

It's a slow process but allows me to better understand what I'm
doing under the hood.

Regards,
Seba



On Tue, Aug 12, 2014 at 10:12 PM, mani kandan <ma...@gmail.com> wrote:

  Which distribution are you people using? Cloudera vs Hortonworks vs
  Biginsights? 



 



Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>.
He didn’t ask for the best and nobody framed up their answer like that. He asked what people were using. Out of the 10 responses only four of them actually answered his question.

I’ve been studying Hadoop for two months straight. Quite frankly, I wish more people would ask for community input and what does what and how.

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData

From: Kilaru, Sambaiah 
Sent: Wednesday, August 13, 2014 1:10 PM
To: user@hadoop.apache.org 
Subject: Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Enough wars are going on over which is best. Choose one and try to learn it; there is no sense in which x is better or y is better.
It is up to you.

Thanks,
Sam

From: Sebastiano Di Paola <se...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Wednesday, August 13, 2014 at 6:28 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?


Hi,
I'm a newbie too and I'm not using any particular distribution. I just download the components I need / want to try for my deployment and use them.

It's a slow process but allows me to better understand what I'm doing under the hood.

Regards,
Seba




On Tue, Aug 12, 2014 at 10:12 PM, mani kandan <ma...@gmail.com> wrote:

  Which distribution are you people using? Cloudera vs Hortonworks vs Biginsights? 


Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by "Kilaru, Sambaiah" <Sa...@intuit.com>.
There are enough wars going on over which is best. Choose one and try to learn it; there is no sense in which x is better or y is better.
It is up to you.

Thanks,
Sam

From: Sebastiano Di Paola <se...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Wednesday, August 13, 2014 at 6:28 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Hi,
I'm a newbie too and I'm not using any particular distribution. I just download the components I need or want to try for my deployment and use them.
It's a slow process but allows me to better understand what I'm doing under the hood.
Regards,
Seba


On Tue, Aug 12, 2014 at 10:12 PM, mani kandan <ma...@gmail.com> wrote:

Which distribution are you people using? Cloudera vs Hortonworks vs Biginsights?


Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by Sebastiano Di Paola <se...@gmail.com>.
Hi,
I'm a newbie too and I'm not using any particular distribution. I just
download the components I need or want to try for my deployment and use them.
It's a slow process but allows me to better understand what I'm doing under
the hood.
Regards,
Seba


On Tue, Aug 12, 2014 at 10:12 PM, mani kandan <ma...@gmail.com> wrote:

> Which distribution are you people using? Cloudera vs Hortonworks vs
> Biginsights?
>

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by Andre Kelpe <ak...@concurrentinc.com>.
Why don't you just use the apache tarball? We even have that automated, if
vagrant is your thing:
https://github.com/Cascading/vagrant-cascading-hadoop-cluster

- André


On Tue, Aug 12, 2014 at 10:12 PM, mani kandan <ma...@gmail.com> wrote:

> Which distribution are you people using? Cloudera vs Hortonworks vs
> Biginsights?
>



-- 
André Kelpe
andre@concurrentinc.com
http://concurrentinc.com

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Posted by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>.
Hortonworks. Here is my reasoning:
1. Hortonworks is 100% open source.
2. MapR has stuff on their roadmap that Hortonworks has already accomplished and has moved on to other things.
3. Cloudera has proprietary stuff in their stack. No.
4. Hortonworks makes training super accessible and there is a community around it.
5. Who the heck is BigInsights? (Which should tell you something.)

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData

From: mani kandan 
Sent: Tuesday, August 12, 2014 3:12 PM
To: user@hadoop.apache.org 
Subject: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

Which distribution are you people using? Cloudera vs Hortonworks vs Biginsights? 
