Posted to mapreduce-user@hadoop.apache.org by Krish Donald <go...@gmail.com> on 2015/03/06 00:41:10 UTC

t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Hi,

I am new to AWS and would like to set up a Hadoop cluster with 6-7 nodes
using Cloudera Manager.

Is a t2.micro on AWS enough for setting up a Hadoop cluster?
I would like to use the free service for now.

Please advise.

Thanks
Krish

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Alexander Pivovarov <ap...@gmail.com>.
OK, how can we easily put all Hadoop host names and IPs into /etc/hosts
on all machines?
Do you have a script? Or do I need to go to each machine manually, get its
IP, put it in /etc/hosts, and then distribute /etc/hosts to all machines?
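
For reference, the distribution step asked about here could be sketched
roughly as below. This is only an illustration, not something anyone in the
thread posted: the node list format (one "IP NAME" pair per line), the
function names, the /tmp/hosts.new staging path, and the ec2-user login are
all assumptions. It defaults to a dry run that only prints the commands.

```shell
#!/bin/bash
# Sketch: push one shared hosts file to every node in a cluster.
# Assumed node list format (hypothetical): one "IP NAME" pair per line.

# Print the command when DRY_RUN=1 (the default); run it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    # Detach stdin so ssh/scp cannot swallow the rest of the node list.
    "$@" < /dev/null
  fi
}

# Copy hosts_file to each node in nodes_file, then install it as /etc/hosts.
push_hosts() {
  local nodes_file=$1 hosts_file=$2 ip name
  while read -r ip name; do
    # Skip blank lines and comment lines.
    case "$ip" in ''|'#'*) continue ;; esac
    run scp "$hosts_file" "${SSH_USER:-ec2-user}@${ip}:/tmp/hosts.new"
    run ssh "${SSH_USER:-ec2-user}@${ip}" "sudo cp /tmp/hosts.new /etc/hosts"
  done < "$nodes_file"
}
```

Calling `push_hosts nodes.txt cluster-hosts` prints the scp and ssh commands
for review; `DRY_RUN=0 push_hosts nodes.txt cluster-hosts` would run them.
Because this overwrites /etc/hosts as a whole, the shared file would also
need to carry the usual localhost line.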

Don't you think a one-time effort to configure freedns is easier?
The freedns solution works with AWS spot instances as well.

You need to create a snapshot after you configure freedns, Hadoop, etc. on
a particular box.
Next time you need a machine, you can go to your saved snapshots and
create a spot instance from it.


On Thu, Mar 5, 2015 at 6:54 PM, max scalf <or...@gmail.com> wrote:

> Unfortunately, without DNS you have to rely on /etc/hosts, so put in an
> entry for all your nodes (nn, snn, dn1, dn2, etc.) in the /etc/hosts file
> on all nodes. I have tested that with Hortonworks (using Ambari) and
> Cloudera Manager, and I am quite sure it will work for MapR.
>
> On Thu, Mar 5, 2015 at 8:47 PM, Alexander Pivovarov <ap...@gmail.com>
> wrote:
>
>> What about DNS?
>> If you have 2 computers (nn and dn), how does nn know dn's IP?
>>
>> The script puts only this computer's IP into /etc/hosts.
>>
>> On Thu, Mar 5, 2015 at 6:39 PM, max scalf <or...@gmail.com> wrote:
>>
>>> Here is an easy way to assign a static name to your EC2 instance.
>>> When you launch an EC2 instance from the AWS console and get to the
>>> point of selecting the VPC and IP address, there is a section labeled
>>> "USER DATA". Put the script below in it with the appropriate host name
>>> (change CHANGE_HOST_NAME_HERE to whatever you want) and that should get
>>> you a static name.
>>>
>>> #!/bin/bash
>>>
>>> HOSTNAME_TAG=CHANGE_HOST_NAME_HERE
>>> # Persist the hostname across reboots.
>>> cat > /etc/sysconfig/network << EOF
>>> NETWORKING=yes
>>> NETWORKING_IPV6=no
>>> HOSTNAME=${HOSTNAME_TAG}
>>> EOF
>>>
>>> # Look up this instance's private IP from the EC2 metadata service.
>>> IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
>>> echo "${IP} ${HOSTNAME_TAG}.localhost ${HOSTNAME_TAG}" >> /etc/hosts
>>>
>>> # Apply the hostname to the running system, then restart networking.
>>> echo "${HOSTNAME_TAG}" > /proc/sys/kernel/hostname
>>> service network restart
>>>
>>>
>>> Also note that I was able to do this on a couple of spot instances for
>>> a cheap price. The only catch is that once you shut one down or someone
>>> outbids you, you lose that instance, but it is easy and cheap to play
>>> around with. I have used a couple of m3.medium instances for my NN/SNN
>>> and a couple of them for data nodes.
>>>
>>> On Thu, Mar 5, 2015 at 7:19 PM, Jonathan Aquilina <
>>> jaquilina@eagleeyet.net> wrote:
>>>
>>>>  I don't know how you would do that, to be honest. With EMR you have
>>>> distinctions between master, core, and task nodes. If you need to change
>>>> configuration, you just SSH into the EMR master node.
>>>>
>>>>
>>>>
>>>> ---
>>>> Regards,
>>>> Jonathan Aquilina
>>>> Founder Eagle Eye T
>>>>
>>>>  On 2015-03-06 02:11, Alexander Pivovarov wrote:
>>>>
>>>> What is the easiest way to assign names to AWS EC2 computers?
>>>> I guess a computer needs a static hostname and DNS name before it can
>>>> be used in a Hadoop cluster.
>>>> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net>
>>>> wrote:
>>>>
>>>>>  When I started with EMR it was a lot of testing and trial and error.
>>>>> Hue is already supported as something that can be installed from the AWS
>>>>> console. What I need to know is whether you need this cluster on all the
>>>>> time or whether this is going to be what Amazon calls a transient
>>>>> cluster, meaning you fire it up, run the job, and tear it back down.
>>>>>
>>>>>
>>>>>
>>>>> ---
>>>>> Regards,
>>>>> Jonathan Aquilina
>>>>> Founder Eagle Eye T
>>>>>
>>>>>  On 2015-03-06 01:10, Krish Donald wrote:
>>>>>
>>>>>  Thanks Jonathan,
>>>>>
>>>>> I will try to explore the EMR option as well.
>>>>> Can you please let me know the configuration you used?
>>>>> Can you also recommend one for me?
>>>>> I would like to set up a Hadoop cluster using Cloudera Manager and then
>>>>> do the following:
>>>>>
>>>>> setup kerberos
>>>>> setup federation
>>>>> setup monitoring
>>>>> setup hadr
>>>>> backup and recovery
>>>>> authorization using sentry
>>>>> backup and recovery of individual components
>>>>> performance tuning
>>>>> upgrade of cdh
>>>>> upgrade of CM
>>>>> Hue User Administration
>>>>> Spark
>>>>> Solr
>>>>>
>>>>>
>>>>> Thanks
>>>>> Krish
>>>>>
>>>>>
>>>>> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <
>>>>> jaquilina@eagleeyet.net> wrote:
>>>>>
>>>>>>  Krish, EMR won't cost you much. With all the testing and data we ran
>>>>>> through the test systems, as well as the large amount of data when
>>>>>> everything was read, we paid about 15.00 USD. I honestly do not think
>>>>>> that the t2.micro specs would be enough, as Java can be pretty RAM
>>>>>> hungry.
>>>>>>
>>>>>>
>>>>>>
>>>>>> ---
>>>>>> Regards,
>>>>>> Jonathan Aquilina
>>>>>> Founder Eagle Eye T
>>>>>>
>>>>>>   On 2015-03-06 00:41, Krish Donald wrote:
>>>>>>
>>>>>>  Hi,
>>>>>>
>>>>>> I am new to AWS and would like to set up a Hadoop cluster with 6-7
>>>>>> nodes using Cloudera Manager.
>>>>>>
>>>>>> Is a t2.micro on AWS enough for setting up a Hadoop cluster?
>>>>>> I would like to use the free service for now.
>>>>>>
>>>>>> Please advise.
>>>>>>
>>>>>> Thanks
>>>>>> Krish
>>>>>>
>>>>>>
>>>
>>
>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by max scalf <or...@gmail.com>.
Unfortunately, without DNS you have to rely on /etc/hosts, so put in an
entry for all your nodes (nn, snn, dn1, dn2, etc.) in the /etc/hosts file
on all nodes. I have tested that with Hortonworks (using Ambari) and
Cloudera Manager, and I am quite sure it will work for MapR.
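
As a rough illustration of keeping those entries in sync, the sketch below
rewrites a marked block inside a hosts file from a list of "IP NAME" pairs,
so it can be rerun whenever a node is added without duplicating lines. The
marker text, the function names, the input format, and the example domain
are assumptions made for this sketch and are not part of Ambari or
Cloudera Manager.

```shell
#!/bin/bash
# Sketch: keep a managed block of cluster entries in a hosts file.
# Assumed input format (hypothetical): one "IP NAME" pair per line on stdin.

BEGIN_MARK="# BEGIN hadoop-cluster"
END_MARK="# END hadoop-cluster"

# Turn "IP NAME" lines into hosts lines of the form "IP NAME.DOMAIN NAME".
render_entries() {
  local domain=$1 ip name
  while read -r ip name; do
    # Skip blank lines and comment lines.
    case "$ip" in ''|'#'*) continue ;; esac
    printf '%s %s.%s %s\n' "$ip" "$name" "$domain" "$name"
  done
}

# Replace any previous managed block in hosts_file with entries built
# from stdin; lines outside the block are left untouched.
update_hosts() {
  local hosts_file=$1 domain=$2 tmp
  tmp=$(mktemp) || return 1
  # Drop the old block (markers included) and keep everything else.
  sed "/^${BEGIN_MARK}\$/,/^${END_MARK}\$/d" "$hosts_file" > "$tmp"
  {
    echo "$BEGIN_MARK"
    render_entries "$domain"
    echo "$END_MARK"
  } >> "$tmp"
  # Write back in place so the file keeps its owner and permissions.
  cat "$tmp" > "$hosts_file" && rm -f "$tmp"
}
```

Usage would look like `update_hosts /etc/hosts cluster.internal < nodes.txt`,
run as root on each node, where nodes.txt lists nn, snn, dn1, dn2 and so on.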

>>>>
>>>>>  krish EMR wont cost you much with all the testing and data we ran
>>>>> through the test systems as well as the large amont of data when everythign
>>>>> was read we paid about 15.00 USD. I honestly do not think that the specs
>>>>> there would be enough as java can be pretty ram hungry.
>>>>>
>>>>>
>>>>>
>>>>> ---
>>>>> Regards,
>>>>> Jonathan Aquilina
>>>>> Founder Eagle Eye T
>>>>>
>>>>>   On 2015-03-06 00:41, Krish Donald wrote:
>>>>>
>>>>>  Hi,
>>>>>
>>>>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>>>>> manager for 6-7 nodes.
>>>>>
>>>>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>>>>> I would like to use free service as of now.
>>>>>
>>>>> Please advise.
>>>>>
>>>>> Thanks
>>>>> Krish
>>>>>
>>>>>
>>
>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Alexander Pivovarov <ap...@gmail.com>.
What about DNS?
If you have 2 computers (nn and dn), how does the nn know the dn's IP?

The script only puts this computer's own IP into /etc/hosts.

On Thu, Mar 5, 2015 at 6:39 PM, max scalf <or...@gmail.com> wrote:

> Here is a easy way to go about assigning static name to your ec2
> instance.  When you get the launch an EC2-instance from aws console when
> you get to the point of selecting VPC, ip address screen there is a screen
> that says "USER DATA"...put the below in with appropriate host name(change
> CHANGE_HOST_NAME_HERE to whatever you want) and that should be able to get
> you static name.
>
> #!/bin/bash
>
> HOSTNAME_TAG=CHANGE_HOST_NAME_HERE
> cat > /etc/sysconfig/network << EOF
> NETWORKING=yes
> NETWORKING_IPV6=no
> HOSTNAME=${HOSTNAME_TAG}
> EOF
>
> IP=$(curl http://169.254.169.254/latest/meta-data/local-ipv4)
> echo "${IP} ${HOSTNAME_TAG}.localhost ${HOSTNAME_TAG}" >> /etc/hosts
>
> echo ${HOSTNAME_TAG} > /proc/sys/kernel/hostname
> service network restart
>
>
> Also note i was able to do this on couple of spot instance for cheap
> price, only thing is once you shut it down or someone outbids you, you
> loose that instance but its easy/cheap to play around with.... and i have
> used couple of m3.medium for my NN/SNN and couple of them for data nodes...
>
> On Thu, Mar 5, 2015 at 7:19 PM, Jonathan Aquilina <jaquilina@eagleeyet.net
> > wrote:
>
>>  I dont know how you would do that to be honest. With EMR you have
>> destinctions master core and task nodes. If you need to change
>> configuration you just ssh into the EMR master node.
>>
>>
>>
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>>
>>  On 2015-03-06 02:11, Alexander Pivovarov wrote:
>>
>> What is the easiest way to assign names to aws ec2 computers?
>> I guess computer need static hostname and dns name before it can be used
>> in hadoop cluster.
>> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net>
>> wrote:
>>
>>>  When I started with EMR it was alot of testing and trial and error.
>>> HUE is already supported as something that can be installed from the AWS
>>> console. What I need to know is if you need this cluster on all the time or
>>> this is goign ot be what amazon call a transient cluster. Meaning you fire
>>> it up run the job and tear it back down.
>>>
>>>
>>>
>>> ---
>>> Regards,
>>> Jonathan Aquilina
>>> Founder Eagle Eye T
>>>
>>>  On 2015-03-06 01:10, Krish Donald wrote:
>>>
>>>  Thanks Jonathan,
>>>
>>> I will try to explore EMR option also.
>>> Can you please let me know the configuration which you have used it?
>>> Can you please recommend for me also?
>>> I would like to setup Hadoop cluster using cloudera manager and then
>>> would like to do below things:
>>>
>>> setup kerberos
>>> setup federation
>>> setup monitoring
>>> setup hadr
>>> backup and recovery
>>> authorization using sentry
>>> backup and recovery of individual componenets
>>> performamce tuning
>>> upgrade of cdh
>>> upgrade of CM
>>> Hue User Administration
>>> Spark
>>> Solr
>>>
>>>
>>> Thanks
>>> Krish
>>>
>>>
>>> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <
>>> jaquilina@eagleeyet.net> wrote:
>>>
>>>>  krish EMR wont cost you much with all the testing and data we ran
>>>> through the test systems as well as the large amont of data when everythign
>>>> was read we paid about 15.00 USD. I honestly do not think that the specs
>>>> there would be enough as java can be pretty ram hungry.
>>>>
>>>>
>>>>
>>>> ---
>>>> Regards,
>>>> Jonathan Aquilina
>>>> Founder Eagle Eye T
>>>>
>>>>   On 2015-03-06 00:41, Krish Donald wrote:
>>>>
>>>>  Hi,
>>>>
>>>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>>>> manager for 6-7 nodes.
>>>>
>>>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>>>> I would like to use free service as of now.
>>>>
>>>> Please advise.
>>>>
>>>> Thanks
>>>> Krish
>>>>
>>>>
>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by max scalf <or...@gmail.com>.
Here is an easy way to assign a static name to your EC2 instance.
When you launch an EC2 instance from the AWS console and get to the
screen where you select the VPC and IP address, there is a field labelled
"USER DATA". Put the script below in it with the appropriate host name (change
CHANGE_HOST_NAME_HERE to whatever you want) and that should get
you a static name.

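#!/bin/bash

# Name this instance should use; edit before launch.
HOSTNAME_TAG=CHANGE_HOST_NAME_HERE

# Persist the hostname across reboots (RHEL/CentOS-style network config).
cat > /etc/sysconfig/network << EOF
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=${HOSTNAME_TAG}
EOF

# Look up this instance's private IP from the EC2 metadata service.
IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
# Map the private IP to the new name so the host can resolve itself.
echo "${IP} ${HOSTNAME_TAG}.localhost ${HOSTNAME_TAG}" >> /etc/hosts

# Apply the hostname to the running kernel, then restart networking.
echo "${HOSTNAME_TAG}" > /proc/sys/kernel/hostname
service network restart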


Also note that I was able to do this on a couple of spot instances for a cheap
price. The only catch is that once you shut one down or someone outbids you, you
lose that instance, but it is easy and cheap to play around with. I have used a
couple of m3.medium instances for my NN/SNN and a couple more for data nodes.
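
For more than a couple of nodes, editing CHANGE_HOST_NAME_HERE by hand for each
launch gets tedious. A minimal sketch of stamping out one user-data file per
node from the script above, assuming it has been saved as a template file. The
function name render_user_data and the file names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the user-data template with the placeholder
# CHANGE_HOST_NAME_HERE replaced by the given node name.
render_user_data() {
  local template="$1" name="$2"
  # Accept only letters, digits and hyphens, so the name is a plausible
  # hostname and cannot break the sed expression below.
  case "$name" in
    ''|*[!a-zA-Z0-9-]*) echo "invalid host name: $name" >&2; return 1 ;;
  esac
  sed "s/CHANGE_HOST_NAME_HERE/${name}/g" "$template"
}
```

For example, "render_user_data user-data.tmpl dn1 > user-data-dn1.sh" writes the
script for dn1. The result can be pasted into the console field, or (as far as
I know) passed to "aws ec2 run-instances" through its --user-data option using
a file:// path.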

On Thu, Mar 5, 2015 at 7:19 PM, Jonathan Aquilina <ja...@eagleeyet.net>
wrote:

>  I dont know how you would do that to be honest. With EMR you have
> destinctions master core and task nodes. If you need to change
> configuration you just ssh into the EMR master node.
>
>
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
>  On 2015-03-06 02:11, Alexander Pivovarov wrote:
>
> What is the easiest way to assign names to aws ec2 computers?
> I guess computer need static hostname and dns name before it can be used
> in hadoop cluster.
> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net>
> wrote:
>
>>  When I started with EMR it was alot of testing and trial and error. HUE
>> is already supported as something that can be installed from the AWS
>> console. What I need to know is if you need this cluster on all the time or
>> this is goign ot be what amazon call a transient cluster. Meaning you fire
>> it up run the job and tear it back down.
>>
>>
>>
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>>
>>  On 2015-03-06 01:10, Krish Donald wrote:
>>
>>  Thanks Jonathan,
>>
>> I will try to explore EMR option also.
>> Can you please let me know the configuration which you have used it?
>> Can you please recommend for me also?
>> I would like to setup Hadoop cluster using cloudera manager and then
>> would like to do below things:
>>
>> setup kerberos
>> setup federation
>> setup monitoring
>> setup hadr
>> backup and recovery
>> authorization using sentry
>> backup and recovery of individual componenets
>> performamce tuning
>> upgrade of cdh
>> upgrade of CM
>> Hue User Administration
>> Spark
>> Solr
>>
>>
>> Thanks
>> Krish
>>
>>
>> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <
>> jaquilina@eagleeyet.net> wrote:
>>
>>>  krish EMR wont cost you much with all the testing and data we ran
>>> through the test systems as well as the large amont of data when everythign
>>> was read we paid about 15.00 USD. I honestly do not think that the specs
>>> there would be enough as java can be pretty ram hungry.
>>>
>>>
>>>
>>> ---
>>> Regards,
>>> Jonathan Aquilina
>>> Founder Eagle Eye T
>>>
>>>   On 2015-03-06 00:41, Krish Donald wrote:
>>>
>>>  Hi,
>>>
>>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>>> manager for 6-7 nodes.
>>>
>>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>>> I would like to use free service as of now.
>>>
>>> Please advise.
>>>
>>> Thanks
>>> Krish
>>>
>>>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by daemeon reiydelle <da...@gmail.com>.
Do a reverse lookup and use the name you find. There are a few areas
of Hadoop that require reverse name lookup, but in general just
create the relevant entries in /etc/hosts (shared across the cluster,
e.g. via Ansible if there are more than just a few nodes).

Not hard.
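
A minimal sketch of a sanity check to run on each node after the shared entries
are in place: it lists the expected node names that have no entry in a hosts
file. The function name missing_hosts and the node names are assumptions for
illustration.

```shell
#!/bin/bash
# Hypothetical helper: print every expected name that does not appear as a
# host name (field 2 or later) on a non-comment line of the hosts file.
missing_hosts() {
  local hosts_file="$1" name
  shift
  for name in "$@"; do
    awk -v n="$name" '
      /^[[:space:]]*#/ { next }                       # ignore comment lines
      { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
      END { exit (found ? 0 : 1) }
    ' "$hosts_file" || echo "$name"
  done
}
```

Running "missing_hosts /etc/hosts nn snn dn1 dn2" prints nothing when every node
is listed. Reverse lookup for a single address can be spot-checked with
"getent hosts <ip>".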


On Thu, Mar 5, 2015 at 6:35 PM, Alexander Pivovarov
<ap...@gmail.com> wrote:
> I found the following solution to this problem
>
> I registered 2 subdomains  (public and local) for each computer on
> https://freedns.afraid.org/subdomain/
> e.g.
> myhadoop-nn.crabdance.com
> myhadoop-nn-local.crabdance.com
>
> Then I added a cron job which sends HTTP requests to update the public and
> local IPs on the freedns server.
> Hint: the public IP is detected automatically;
> the IP address for the local name can be set using the request parameter
> &address=10.x.x.x   (don't forget to escape the &).
>
> As a result my nn computer has 2 DNS names with the currently assigned IP
> addresses, e.g.
> myhadoop-nn.crabdance.com  54.203.181.177
> myhadoop-nn-local.crabdance.com   10.220.149.103
>
> In the Hadoop configuration I can use the local machine names;
> to access my cluster from outside AWS I can use the public names.
>
> Just curious whether AWS provides an easier way to name EC2 computers?
>
> On Thu, Mar 5, 2015 at 5:19 PM, Jonathan Aquilina <ja...@eagleeyet.net>
> wrote:
>>
>> I dont know how you would do that to be honest. With EMR you have
>> destinctions master core and task nodes. If you need to change configuration
>> you just ssh into the EMR master node.
>>
>>
>>
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>>
>> On 2015-03-06 02:11, Alexander Pivovarov wrote:
>>
>> What is the easiest way to assign names to aws ec2 computers?
>> I guess computer need static hostname and dns name before it can be used
>> in hadoop cluster.
>>
>> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net>
>> wrote:
>>>
>>> When I started with EMR it was alot of testing and trial and error. HUE
>>> is already supported as something that can be installed from the AWS
>>> console. What I need to know is if you need this cluster on all the time or
>>> this is goign ot be what amazon call a transient cluster. Meaning you fire
>>> it up run the job and tear it back down.
>>>
>>>
>>>
>>> ---
>>> Regards,
>>> Jonathan Aquilina
>>> Founder Eagle Eye T
>>>
>>> On 2015-03-06 01:10, Krish Donald wrote:
>>>
>>> Thanks Jonathan,
>>>
>>> I will try to explore EMR option also.
>>> Can you please let me know the configuration which you have used it?
>>> Can you please recommend for me also?
>>> I would like to setup Hadoop cluster using cloudera manager and then
>>> would like to do below things:
>>>
>>> setup kerberos
>>> setup federation
>>> setup monitoring
>>> setup hadr
>>> backup and recovery
>>> authorization using sentry
>>> backup and recovery of individual componenets
>>> performamce tuning
>>> upgrade of cdh
>>> upgrade of CM
>>> Hue User Administration
>>> Spark
>>> Solr
>>>
>>>
>>> Thanks
>>> Krish
>>>
>>>
>>> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina
>>> <ja...@eagleeyet.net> wrote:
>>>>
>>>> krish EMR wont cost you much with all the testing and data we ran
>>>> through the test systems as well as the large amont of data when everythign
>>>> was read we paid about 15.00 USD. I honestly do not think that the specs
>>>> there would be enough as java can be pretty ram hungry.
>>>>
>>>>
>>>>
>>>> ---
>>>> Regards,
>>>> Jonathan Aquilina
>>>> Founder Eagle Eye T
>>>>
>>>> On 2015-03-06 00:41, Krish Donald wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>>>> manager for 6-7 nodes.
>>>>
>>>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>>>> I would like to use free service as of now.
>>>>
>>>> Please advise.
>>>>
>>>> Thanks
>>>> Krish
>
>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Jonathan Aquilina <ja...@eagleeyet.net>.
 

When I was testing I used the default setup: 1 master node, 2 core nodes and
no task nodes. I would spin up the cluster and then terminate it. The term for
that is a transient cluster.

When the big data needed to be crunched I changed the setup a bit.
An important note: EMR has a limit of 20 nodes, be they core or task;
a request can be submitted to lift that limit.

When actually live I had 1 master node, 3 core nodes (which have HDFS
storage) and 10 task nodes. All instances were of size m3.large.
I ran another batch of data for 2013 through EMR with this setup in 31 min;
that is just the run time, not including cluster spin-up time.

One thing to note: you do not need to use HDFS storage, as that can and
will drive up the cost quickly, and there is a chance of data
corruption or even data loss if a core node crashes. I have been using
Amazon S3 and pulling the data from there. The biggest advantage is that
you can spin up multiple clusters and share the same data to be
processed that way. Using HDFS has its perks too, but costs can
increase drastically as well.
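
A rough way to reason about the bill for a transient cluster is node count
times billed hours times hourly price per node. The sketch below does only that
arithmetic; the function name estimate_cost is hypothetical and the price in
the example is a made-up placeholder, not a quoted AWS rate.

```shell
#!/bin/bash
# Hypothetical helper: estimate_cost NODES BILLED_HOURS PRICE_PER_NODE_HOUR
# Prints nodes * hours * price with two decimals. Pass hours already rounded
# the way your bill rounds them (e.g. 31 minutes may be billed as 1 hour).
estimate_cost() {
  awk -v n="$1" -v h="$2" -v p="$3" 'BEGIN { printf "%.2f\n", n * h * p }'
}
```

For the live setup above (1 master + 3 core + 10 task = 14 nodes) and one billed
hour at a placeholder price of 0.10 per node-hour, "estimate_cost 14 1 0.10"
prints 1.40. Storage and data transfer are not included.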

---
Regards,
Jonathan Aquilina
Founder Eagle Eye T

On 2015-03-07 09:54, tesmai4@gmail.com wrote: 

> Dear Jonathan,
> 
> Would you please describe the process of running EMR-based Hadoop for $15.00? I tried, and my costs were rocketing, like $60 for one hour.
> 
> Regards
> 
> On 05/03/2015 23:57, Jonathan Aquilina wrote: 
> 
> Krish, EMR won't cost you much. With all the testing and data we ran through the test systems, as well as the large amount of data when everything was read, we paid about 15.00 USD. I honestly do not think that the specs there would be enough, as Java can be pretty RAM hungry.
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 00:41, Krish Donald wrote: 
> 
> Hi, 
> 
> I am new to AWS and would like to setup Hadoop cluster using cloudera manager for 6-7 nodes. 
> 
> t2.micro on AWS; Is it enough for setting up Hadoop cluster ? 
> I would like to use free service as of now. 
> 
> Please advise. 
> 
> Thanks 
> Krish
 

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by "tesmai4@gmail.com" <te...@gmail.com>.
 Dear Jonathan,

Would you please describe the process of running EMR-based Hadoop for
$15.00? I tried, and my costs were rocketing, like $60 for one hour.

Regards


On 05/03/2015 23:57, Jonathan Aquilina wrote:

Krish, EMR won't cost you much. With all the testing and data we ran through
the test systems, as well as the large amount of data when everything was
read, we paid about 15.00 USD. I honestly do not think that the specs there
would be enough, as Java can be pretty RAM hungry.



---
Regards,
Jonathan Aquilina
Founder Eagle Eye T

 On 2015-03-06 00:41, Krish Donald wrote:

 Hi,

I am new to AWS and would like to setup Hadoop cluster using cloudera
manager for 6-7 nodes.

t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
I would like to use free service as of now.

Please advise.

Thanks
Krish

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by max scalf <or...@gmail.com>.
@jonathan,

I totally agree that this is reinventing the wheel, but think about the
folks who want to do this setup from scratch to better understand Hadoop,
or those who are going to do admin-related work; hence the need to set it
up from scratch.

@alexander,

Yes, you are right that freedns is a one-time effort, but for me it was
easy enough because I gave each host a static hostname through the user
data script and also a static IP address. Once that was done, this is how
I pushed out /etc/hosts: on, let's say, the master node, I edited
/etc/hosts and put all my other nodes' info in there. Next, set up SSH
(we have to do this anyway for the Hadoop install). Once SSH is set up,
create a new file called hosts.txt, put all your hostnames in it, and run
a for loop like this:

# hosts.txt: one hostname per line
for host in $(cat hosts.txt); do
    scp /etc/hosts "root@${host}:/etc/hosts"
done
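
The loop above assumes /etc/hosts on the master has already been filled
in by hand. A minimal sketch of generating those entries from a node list
instead is shown here; the file name nodes.txt, its "IP HOSTNAME" layout,
and both function names are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: build /etc/hosts entries from a node list, then push the file out.
# nodes.txt (hypothetical): one "IP HOSTNAME" pair per line; blank lines and
# lines starting with '#' are ignored.

render_hosts() {
    # Print "IP<TAB>HOSTNAME" for every non-empty, non-comment line.
    local nodes_file=$1
    awk '!/^[ \t]*(#|$)/ { printf "%s\t%s\n", $1, $2 }' "$nodes_file"
}

push_hosts() {
    # Copy the master's /etc/hosts to every node; report hosts that fail.
    local nodes_file=$1 host
    for host in $(render_hosts "$nodes_file" | cut -f2); do
        scp /etc/hosts "root@${host}:/etc/hosts" \
            || echo "push failed: ${host}" >&2
    done
}

# Typical use on the master node, as root:
#   render_hosts nodes.txt >> /etc/hosts
#   push_hosts nodes.txt
```

Keeping the IPs and names in one file means the same list drives both the
/etc/hosts entries and the distribution loop, so a new node is one added
line followed by a re-run.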

When I was first getting started on HDP, I used the link below, which
helped me. It pushes out the /etc/hosts file and also does other stuff.
Check it out:

http://sacharya.com/deploying-multinode-hadoop-20-cluster-using-apache-ambari/




On Fri, Mar 6, 2015 at 12:43 AM, Jonathan Aquilina <ja...@eagleeyet.net>
wrote:

>  The only limitation I know of is how many nodes you can have and how
> many instances of that particular size the underlying host can support. You
> can load Hive in EMR, and any other features of the cluster are managed
> at the master node level, as you have SSH access there.
>
> What are the advantages of 2.6 over 2.4, for example?
>
> I just feel you guys are reinventing the wheel when Amazon already caters
> for Hadoop, granted it might not be 2.6.
>
>
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
>  On 2015-03-06 07:31, Alexander Pivovarov wrote:
>
>    I think EMR has its own limitations,
>
> e.g. I want to set up Hadoop 2.6.0 with Kerberos + hive-1.2.0 to test my
> Hive patch.
>  How can EMR help me? It supports Hadoop up to 2.4.0 (not even 2.4.1).
>
> http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-hadoop-version.html
>
>
>
>
>
>
> On Thu, Mar 5, 2015 at 9:51 PM, Jonathan Aquilina <jaquilina@eagleeyet.net
> > wrote:
>
>>  Hi guys, I know you want to keep costs down, but why go through all
>> the effort of setting up EC2 instances? When you deploy EMR, it takes care
>> of provisioning and setting up the EC2 instances for you. All configuration
>> for the entire cluster is then done on the master node of that cluster, and
>> setting up additional software is all done through the EMR console.
>> We were doing some geospatial calculations and we loaded a third-party JAR
>> file called esri into the EMR cluster. I then had to pass a small bootstrap
>> action (script) to have it distribute esri to the entire cluster.
>>
>> Why are you guys reinventing the wheel?
>>
>>
>>
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>>
>>   On 2015-03-06 03:35, Alexander Pivovarov wrote:
>>
>>    I found the following solution to this problem
>>
>> I registered 2 subdomains  (public and local) for each computer on
>> https://freedns.afraid.org/subdomain/
>> e.g.
>> myhadoop-nn.crabdance.com
>> myhadoop-nn-local.crabdance.com
>>
>> Then I added a cron job that sends HTTP requests to update the public and
>> local IPs on the freedns server.
>> Hint: the public IP is detected automatically;
>> the IP address for the local name can be set using the request parameter &address=10.x.x.x
>> (don't forget to escape the &).
>>
>> as a result my nn computer has 2 DNS names with currently assigned ip
>> addresses , e.g.
>> myhadoop-nn.crabdance.com  54.203.181.177
>> myhadoop-nn-local.crabdance.com   10.220.149.103
>>
>> In the Hadoop configuration I can use the local machine names;
>> to access my cluster from outside AWS I can use the public names.
>>
>> Just curious whether AWS provides an easier way to name EC2 computers?
>>
>> On Thu, Mar 5, 2015 at 5:19 PM, Jonathan Aquilina <
>> jaquilina@eagleeyet.net> wrote:
>>
>>>  I don't know how you would do that, to be honest. With EMR you have
>>> distinct master, core, and task nodes. If you need to change
>>> configuration, you just SSH into the EMR master node.
>>>
>>>
>>>
>>> ---
>>> Regards,
>>> Jonathan Aquilina
>>> Founder Eagle Eye T
>>>
>>>   On 2015-03-06 02:11, Alexander Pivovarov wrote:
>>>
>>> What is the easiest way to assign names to AWS EC2 computers?
>>> I guess a computer needs a static hostname and DNS name before it can be
>>> used in a Hadoop cluster.
>>> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net>
>>> wrote:
>>>
>>>>  When I started with EMR, it was a lot of testing and trial and error.
>>>> HUE is already supported as something that can be installed from the AWS
>>>> console. What I need to know is whether you need this cluster on all the
>>>> time, or whether this is going to be what Amazon calls a transient cluster,
>>>> meaning you fire it up, run the job, and tear it back down.
>>>>
>>>>
>>>>
>>>> ---
>>>> Regards,
>>>> Jonathan Aquilina
>>>> Founder Eagle Eye T
>>>>
>>>>  On 2015-03-06 01:10, Krish Donald wrote:
>>>>
>>>>  Thanks Jonathan,
>>>>
>>>> I will try to explore EMR option also.
>>>> Can you please let me know the configuration you used?
>>>> Can you please recommend one for me as well?
>>>> I would like to set up a Hadoop cluster using Cloudera Manager and then
>>>> would like to do the things below:
>>>>
>>>> setup kerberos
>>>> setup federation
>>>> setup monitoring
>>>> setup hadr
>>>> backup and recovery
>>>> authorization using sentry
>>>> backup and recovery of individual components
>>>> performance tuning
>>>> upgrade of cdh
>>>> upgrade of CM
>>>> Hue User Administration
>>>> Spark
>>>> Solr
>>>>
>>>>
>>>> Thanks
>>>> Krish
>>>>
>>>>
>>>> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <
>>>> jaquilina@eagleeyet.net> wrote:
>>>>
>>>>>  Krish, EMR won't cost you much. With all the testing and data we ran
>>>>> through the test systems, as well as the large amount of data when everything
>>>>> was read, we paid about 15.00 USD. I honestly do not think that the specs
>>>>> there would be enough, as Java can be pretty RAM hungry.
>>>>>
>>>>>
>>>>>
>>>>> ---
>>>>> Regards,
>>>>> Jonathan Aquilina
>>>>> Founder Eagle Eye T
>>>>>
>>>>>   On 2015-03-06 00:41, Krish Donald wrote:
>>>>>
>>>>>  Hi,
>>>>>
>>>>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>>>>> manager for 6-7 nodes.
>>>>>
>>>>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>>>>> I would like to use free service as of now.
>>>>>
>>>>> Please advise.
>>>>>
>>>>> Thanks
>>>>> Krish
>>>>>
>>>>>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by max scalf <or...@gmail.com>.
@jonathan,

I totally agree that this is reinventing the wheel, but think about the
folks who want to do this setup from scratch to better understand Hadoop,
or those who are going to do admin-related work; hence the need to set it
up from scratch.

@alexander,

Yes, you are right that freedns is a one-time effort, but for me it was
easy enough because I gave each host a static hostname through the user
data script and also a static IP address. Once that was done, this is how
I pushed out /etc/hosts: on, let's say, the master node, I edited
/etc/hosts and put all my other nodes' info in there. Next, set up SSH
(we have to do this anyway for the Hadoop install). Once SSH is set up,
create a new file called hosts.txt, put all your hostnames in it, and run
a for loop like this:

# copy the master's /etc/hosts to every host listed in hosts.txt
for host in $(cat hosts.txt); do
    scp /etc/hosts "root@${host}:/etc/hosts"
done
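
A slightly more defensive variant of that loop could look like the sketch
below. It is only an illustration: nodes.txt (one "IP hostname" pair per
line), render_hosts, push_hosts and the DRY_RUN switch are hypothetical
names that do not appear in the setup described above. The idea is to
build the hosts file from one list and to print the scp commands first, so
they can be checked before anything is overwritten.

```shell
# Sketch only: nodes.txt, render_hosts, push_hosts and DRY_RUN are
# illustrative names, not part of the original setup.

# Turn "IP hostname" lines on stdin into /etc/hosts content.
# Blank lines and lines starting with '#' are skipped.
render_hosts() {
    printf '127.0.0.1 localhost\n'
    awk 'NF >= 2 && $1 !~ /^#/ { print $1, $2 }'
}

# Copy the given hosts file to every hostname read from stdin
# (same "IP hostname" format). With DRY_RUN=1, the default, the
# scp commands are only printed, nothing is copied.
push_hosts() {
    local hosts_file=$1 ip host rest
    while read -r ip host rest; do
        [ -z "${host:-}" ] && continue
        case $ip in \#*) continue ;; esac
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "scp $hosts_file root@$host:/etc/hosts"
        else
            # stdin is redirected so scp cannot eat the rest of the list
            scp "$hosts_file" "root@${host}:/etc/hosts" < /dev/null
        fi
    done
}
```

Assuming the functions are sourced into the shell, usage would be
something like "render_hosts < nodes.txt > hosts.new", then
"push_hosts hosts.new < nodes.txt" to review the commands, and finally
"DRY_RUN=0 push_hosts hosts.new < nodes.txt" to copy for real. Note that
this replaces /etc/hosts wholesale on each node, just as the original loop
does, so any node-specific entries would be lost.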

When I was first getting started on HDP I used the link below, which
helped me. It pushes out the /etc/hosts file and also does other stuff.
Check it out:

http://sacharya.com/deploying-multinode-hadoop-20-cluster-using-apache-ambari/




On Fri, Mar 6, 2015 at 12:43 AM, Jonathan Aquilina <ja...@eagleeyet.net>
wrote:

>  The only limitation I know is that of how many nodes you can have and
> how many instances of that particular size the host is on can support. you
> can load hive in EMR and then any other features of the cluster are managed
> at the master node level as you have SSH access there.
>
> What are the advantage of 2.6 over 2.4 for example.
>
> I just feel you guys are reinventing the wheel when amazon already caters
> for hadoop granted it might not be 2.6.
>
>
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
>  On 2015-03-06 07:31, Alexander Pivovarov wrote:
>
>    I think EMR has its own limitation
>
> e.g. I want to setup hadoop 2.6.0 with kerberos + hive-1.2.0 to test my
> hive patch.
>  How EMR can help me?  it supports hadoop up to 2.4.0  (not even 2.4.1)
>
> http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-hadoop-version.html
>
>
>
>
>
>
> On Thu, Mar 5, 2015 at 9:51 PM, Jonathan Aquilina <jaquilina@eagleeyet.net
> > wrote:
>
>>  Hi guys I know you guys want to keep costs down, but why go through all
>> the effort to setup ec2 instances when you deploy EMR it takes the time to
>> provision and setup the ec2 instances for you. All configuration then for
>> the entire cluster is done on the master node of the particular cluster or
>> setting up of additional software that is all done through the EMR console.
>> We were doing some geospatial calculations and we loaded a 3rd party jar
>> file called esri into the EMR cluster. I then had to pass a small bootstrap
>> action (script) to have it distribute esri to the entire cluster.
>>
>> Why are you guys reinventing the wheel?
>>
>>
>>
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>>
>>   On 2015-03-06 03:35, Alexander Pivovarov wrote:
>>
>>    I found the following solution to this problem
>>
>> I registered 2 subdomains  (public and local) for each computer on
>> https://freedns.afraid.org/subdomain/
>> e.g.
>> myhadoop-nn.crabdance.com
>> myhadoop-nn-local.crabdance.com
>>
>> then I added cron job which sends http requests to update public and
>> local ip on freedns server
>> hint: public ip is detected automatically
>> ip address for local name can be set using request parameter &address=10.x.x.x
>> (don't forget to escape &)
>>
>> as a result my nn computer has 2 DNS names with currently assigned ip
>> addresses , e.g.
>> myhadoop-nn.crabdance.com  54.203.181.177
>> myhadoop-nn-local.crabdance.com   10.220.149.103
>>
>> in hadoop configuration I can use local machine names
>> to access my cluster outside of AWS I can use public names
>>
>> Just curious if AWS provides easier way to name EC2 computers?
>>
>> On Thu, Mar 5, 2015 at 5:19 PM, Jonathan Aquilina <
>> jaquilina@eagleeyet.net> wrote:
>>
>>>  I dont know how you would do that to be honest. With EMR you have
>>> destinctions master core and task nodes. If you need to change
>>> configuration you just ssh into the EMR master node.
>>>
>>>
>>>
>>> ---
>>> Regards,
>>> Jonathan Aquilina
>>> Founder Eagle Eye T
>>>
>>>   On 2015-03-06 02:11, Alexander Pivovarov wrote:
>>>
>>> What is the easiest way to assign names to aws ec2 computers?
>>> I guess computer need static hostname and dns name before it can be used
>>> in hadoop cluster.
>>> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net>
>>> wrote:
>>>
>>>>  When I started with EMR it was alot of testing and trial and error.
>>>> HUE is already supported as something that can be installed from the AWS
>>>> console. What I need to know is if you need this cluster on all the time or
>>>> this is goign ot be what amazon call a transient cluster. Meaning you fire
>>>> it up run the job and tear it back down.
>>>>
>>>>
>>>>
>>>> ---
>>>> Regards,
>>>> Jonathan Aquilina
>>>> Founder Eagle Eye T
>>>>
>>>>  On 2015-03-06 01:10, Krish Donald wrote:
>>>>
>>>>  Thanks Jonathan,
>>>>
>>>> I will try to explore EMR option also.
>>>> Can you please let me know the configuration which you have used it?
>>>> Can you please recommend for me also?
>>>> I would like to setup Hadoop cluster using cloudera manager and then
>>>> would like to do below things:
>>>>
>>>> setup kerberos
>>>> setup federation
>>>> setup monitoring
>>>> setup hadr
>>>> backup and recovery
>>>> authorization using sentry
>>>> backup and recovery of individual componenets
>>>> performamce tuning
>>>> upgrade of cdh
>>>> upgrade of CM
>>>> Hue User Administration
>>>> Spark
>>>> Solr
>>>>
>>>>
>>>> Thanks
>>>> Krish
>>>>
>>>>
>>>> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <
>>>> jaquilina@eagleeyet.net> wrote:
>>>>
>>>>>  krish EMR wont cost you much with all the testing and data we ran
>>>>> through the test systems as well as the large amont of data when everythign
>>>>> was read we paid about 15.00 USD. I honestly do not think that the specs
>>>>> there would be enough as java can be pretty ram hungry.
>>>>>
>>>>>
>>>>>
>>>>> ---
>>>>> Regards,
>>>>> Jonathan Aquilina
>>>>> Founder Eagle Eye T
>>>>>
>>>>>   On 2015-03-06 00:41, Krish Donald wrote:
>>>>>
>>>>>  Hi,
>>>>>
>>>>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>>>>> manager for 6-7 nodes.
>>>>>
>>>>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>>>>> I would like to use free service as of now.
>>>>>
>>>>> Please advise.
>>>>>
>>>>> Thanks
>>>>> Krish
>>>>>
>>>>>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Jonathan Aquilina <ja...@eagleeyet.net>.
 

The only limitation I know of is how many nodes you can have and how
many instances of a particular size the underlying host can support.
You can load Hive in EMR, and any other features of the cluster are
managed at the master node level, since you have SSH access there.

What are the advantages of 2.6 over 2.4, for example?

I just feel you guys are reinventing the wheel when Amazon already
caters for Hadoop, granted it might not be 2.6.

---
Regards,
Jonathan Aquilina
Founder Eagle Eye T

On 2015-03-06 07:31, Alexander Pivovarov wrote: 

> I think EMR has its own limitation
> 
> e.g. I want to setup hadoop 2.6.0 with kerberos + hive-1.2.0 to test my hive patch. How EMR can help me? it supports hadoop up to 2.4.0 (not even 2.4.1)
> http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-hadoop-version.html [1]
> 
> On Thu, Mar 5, 2015 at 9:51 PM, Jonathan Aquilina <ja...@eagleeyet.net> wrote:
> 
> Hi guys I know you guys want to keep costs down, but why go through all the effort to setup ec2 instances when you deploy EMR it takes the time to provision and setup the ec2 instances for you. All configuration then for the entire cluster is done on the master node of the particular cluster or setting up of additional software that is all done through the EMR console. We were doing some geospatial calculations and we loaded a 3rd party jar file called esri into the EMR cluster. I then had to pass a small bootstrap action (script) to have it distribute esri to the entire cluster. 
> 
> Why are you guys reinventing the wheel? 
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 03:35, Alexander Pivovarov wrote: 
> 
> I found the following solution to this problem
> 
> I registered 2 subdomains (public and local) for each computer on https://freedns.afraid.org/subdomain/ [2] 
> e.g. 
> myhadoop-nn.crabdance.com [3]
> myhadoop-nn-local.crabdance.com [4] 
> then I added cron job which sends http requests to update public and local ip on freedns server
> hint: public ip is detected automatically
> ip address for local name can be set using request parameter &address=10.x.x.x (don't forget to escape &)
> 
> as a result my nn computer has 2 DNS names with currently assigned ip addresses , e.g.
> myhadoop-nn.crabdance.com [3] 54.203.181.177
> myhadoop-nn-local.crabdance.com [4] 10.220.149.103
> 
> in hadoop configuration I can use local machine names
> to access my cluster outside of AWS I can use public names
> 
> Just curious if AWS provides easier way to name EC2 computers?
> 
> On Thu, Mar 5, 2015 at 5:19 PM, Jonathan Aquilina <ja...@eagleeyet.net> wrote:
> 
> I dont know how you would do that to be honest. With EMR you have destinctions master core and task nodes. If you need to change configuration you just ssh into the EMR master node. 
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 02:11, Alexander Pivovarov wrote: 
> 
> What is the easiest way to assign names to aws ec2 computers?
> I guess computer need static hostname and dns name before it can be used in hadoop cluster. 
> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net> wrote:
> 
> When I started with EMR it was alot of testing and trial and error. HUE is already supported as something that can be installed from the AWS console. What I need to know is if you need this cluster on all the time or this is goign ot be what amazon call a transient cluster. Meaning you fire it up run the job and tear it back down. 
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 01:10, Krish Donald wrote: 
> 
> Thanks Jonathan, 
> 
> I will try to explore EMR option also. 
> Can you please let me know the configuration which you have used it? 
> Can you please recommend for me also? 
> I would like to setup Hadoop cluster using cloudera manager and then would like to do below things: 
> 
> setup kerberos
> setup federation
> setup monitoring
> setup hadr
> backup and recovery
> authorization using sentry
> backup and recovery of individual componenets
> performamce tuning
> upgrade of cdh 
> upgrade of CM
> Hue User Administration 
> Spark 
> Solr 
> 
> Thanks 
> Krish 
> 
> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <ja...@eagleeyet.net> wrote:
> 
> krish EMR wont cost you much with all the testing and data we ran through the test systems as well as the large amont of data when everythign was read we paid about 15.00 USD. I honestly do not think that the specs there would be enough as java can be pretty ram hungry. 
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 00:41, Krish Donald wrote: 
> 
> Hi, 
> 
> I am new to AWS and would like to setup Hadoop cluster using cloudera manager for 6-7 nodes. 
> 
> t2.micro on AWS; Is it enough for setting up Hadoop cluster ? 
> I would like to use free service as of now. 
> 
> Please advise. 
> 
> Thanks 
> Krish
 

Links:
------
[1] http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-hadoop-version.html
[2] https://freedns.afraid.org/subdomain/
[3] http://myhadoop-nn.crabdance.com
[4] http://myhadoop-nn-local.crabdance.com

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Alexander Pivovarov <ap...@gmail.com>.
I think EMR has its own limitations.

For example, I want to set up Hadoop 2.6.0 with Kerberos + hive-1.2.0 to
test my Hive patch.

How can EMR help me? It supports Hadoop only up to 2.4.0 (not even 2.4.1):
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-hadoop-version.html







On Thu, Mar 5, 2015 at 9:51 PM, Jonathan Aquilina <ja...@eagleeyet.net>
wrote:

>  Hi guys I know you guys want to keep costs down, but why go through all
> the effort to setup ec2 instances when you deploy EMR it takes the time to
> provision and setup the ec2 instances for you. All configuration then for
> the entire cluster is done on the master node of the particular cluster or
> setting up of additional software that is all done through the EMR console.
> We were doing some geospatial calculations and we loaded a 3rd party jar
> file called esri into the EMR cluster. I then had to pass a small bootstrap
> action (script) to have it distribute esri to the entire cluster.
>
> Why are you guys reinventing the wheel?
>
>
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
>  On 2015-03-06 03:35, Alexander Pivovarov wrote:
>
>    I found the following solution to this problem
>
> I registered 2 subdomains  (public and local) for each computer on
> https://freedns.afraid.org/subdomain/
> e.g.
> myhadoop-nn.crabdance.com
> myhadoop-nn-local.crabdance.com
>
> then I added cron job which sends http requests to update public and local
> ip on freedns server
> hint: public ip is detected automatically
> ip address for local name can be set using request parameter &address=10.x.x.x
> (don't forget to escape &)
>
> as a result my nn computer has 2 DNS names with currently assigned ip
> addresses , e.g.
> myhadoop-nn.crabdance.com  54.203.181.177
> myhadoop-nn-local.crabdance.com   10.220.149.103
>
> in hadoop configuration I can use local machine names
> to access my cluster outside of AWS I can use public names
>
> Just curious if AWS provides easier way to name EC2 computers?
>
> On Thu, Mar 5, 2015 at 5:19 PM, Jonathan Aquilina <jaquilina@eagleeyet.net
> > wrote:
>
>>  I dont know how you would do that to be honest. With EMR you have
>> destinctions master core and task nodes. If you need to change
>> configuration you just ssh into the EMR master node.
>>
>>
>>
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>>
>>   On 2015-03-06 02:11, Alexander Pivovarov wrote:
>>
>> What is the easiest way to assign names to aws ec2 computers?
>> I guess computer need static hostname and dns name before it can be used
>> in hadoop cluster.
>> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net>
>> wrote:
>>
>>>  When I started with EMR it was alot of testing and trial and error.
>>> HUE is already supported as something that can be installed from the AWS
>>> console. What I need to know is if you need this cluster on all the time or
>>> this is goign ot be what amazon call a transient cluster. Meaning you fire
>>> it up run the job and tear it back down.
>>>
>>>
>>>
>>> ---
>>> Regards,
>>> Jonathan Aquilina
>>> Founder Eagle Eye T
>>>
>>>  On 2015-03-06 01:10, Krish Donald wrote:
>>>
>>>  Thanks Jonathan,
>>>
>>> I will try to explore EMR option also.
>>> Can you please let me know the configuration which you have used it?
>>> Can you please recommend for me also?
>>> I would like to setup Hadoop cluster using cloudera manager and then
>>> would like to do below things:
>>>
>>> setup kerberos
>>> setup federation
>>> setup monitoring
>>> setup hadr
>>> backup and recovery
>>> authorization using sentry
>>> backup and recovery of individual componenets
>>> performamce tuning
>>> upgrade of cdh
>>> upgrade of CM
>>> Hue User Administration
>>> Spark
>>> Solr
>>>
>>>
>>> Thanks
>>> Krish
>>>
>>>
>>> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <
>>> jaquilina@eagleeyet.net> wrote:
>>>
>>>>  krish EMR wont cost you much with all the testing and data we ran
>>>> through the test systems as well as the large amont of data when everythign
>>>> was read we paid about 15.00 USD. I honestly do not think that the specs
>>>> there would be enough as java can be pretty ram hungry.
>>>>
>>>>
>>>>
>>>> ---
>>>> Regards,
>>>> Jonathan Aquilina
>>>> Founder Eagle Eye T
>>>>
>>>>   On 2015-03-06 00:41, Krish Donald wrote:
>>>>
>>>>  Hi,
>>>>
>>>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>>>> manager for 6-7 nodes.
>>>>
>>>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>>>> I would like to use free service as of now.
>>>>
>>>> Please advise.
>>>>
>>>> Thanks
>>>> Krish
>>>>
>>>>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Alexander Pivovarov <ap...@gmail.com>.
I think EMR has its own limitations.

For example, I want to set up Hadoop 2.6.0 with Kerberos + hive-1.2.0 to
test my Hive patch.

How can EMR help me? It supports Hadoop only up to 2.4.0 (not even 2.4.1):
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-hadoop-version.html







On Thu, Mar 5, 2015 at 9:51 PM, Jonathan Aquilina <ja...@eagleeyet.net>
wrote:

>  Hi guys I know you guys want to keep costs down, but why go through all
> the effort to setup ec2 instances when you deploy EMR it takes the time to
> provision and setup the ec2 instances for you. All configuration then for
> the entire cluster is done on the master node of the particular cluster or
> setting up of additional software that is all done through the EMR console.
> We were doing some geospatial calculations and we loaded a 3rd party jar
> file called esri into the EMR cluster. I then had to pass a small bootstrap
> action (script) to have it distribute esri to the entire cluster.
>
> Why are you guys reinventing the wheel?
>
>
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
>  On 2015-03-06 03:35, Alexander Pivovarov wrote:
>
>    I found the following solution to this problem
>
> I registered 2 subdomains  (public and local) for each computer on
> https://freedns.afraid.org/subdomain/
> e.g.
> myhadoop-nn.crabdance.com
> myhadoop-nn-local.crabdance.com
>
> then I added cron job which sends http requests to update public and local
> ip on freedns server
> hint: public ip is detected automatically
> ip address for local name can be set using request parameter &address=10.x.x.x
> (don't forget to escape &)
>
> as a result my nn computer has 2 DNS names with currently assigned ip
> addresses , e.g.
> myhadoop-nn.crabdance.com  54.203.181.177
> myhadoop-nn-local.crabdance.com   10.220.149.103
>
> in hadoop configuration I can use local machine names
> to access my cluster outside of AWS I can use public names
>
> Just curious if AWS provides easier way to name EC2 computers?
>
> On Thu, Mar 5, 2015 at 5:19 PM, Jonathan Aquilina <jaquilina@eagleeyet.net
> > wrote:
>
>>  I don't know how you would do that, to be honest. With EMR you have
>> distinctions between master, core, and task nodes. If you need to change
>> configuration you just ssh into the EMR master node.
>>
>>
>>
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>>
>>   On 2015-03-06 02:11, Alexander Pivovarov wrote:
>>
>> What is the easiest way to assign names to aws ec2 computers?
>> I guess computer need static hostname and dns name before it can be used
>> in hadoop cluster.
>> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net>
>> wrote:
>>
>>>  When I started with EMR it was a lot of testing and trial and error.
>>> HUE is already supported as something that can be installed from the AWS
>>> console. What I need to know is if you need this cluster on all the time or
>>> this is going to be what Amazon calls a transient cluster, meaning you fire
>>> it up, run the job, and tear it back down.
>>>
>>>
>>>
>>> ---
>>> Regards,
>>> Jonathan Aquilina
>>> Founder Eagle Eye T
>>>
>>>  On 2015-03-06 01:10, Krish Donald wrote:
>>>
>>>  Thanks Jonathan,
>>>
>>> I will try to explore EMR option also.
>>> Can you please let me know the configuration which you have used it?
>>> Can you please recommend for me also?
>>> I would like to setup Hadoop cluster using cloudera manager and then
>>> would like to do below things:
>>>
>>> setup kerberos
>>> setup federation
>>> setup monitoring
>>> setup hadr
>>> backup and recovery
>>> authorization using sentry
>>> backup and recovery of individual components
>>> performance tuning
>>> upgrade of cdh
>>> upgrade of CM
>>> Hue User Administration
>>> Spark
>>> Solr
>>>
>>>
>>> Thanks
>>> Krish
>>>
>>>
>>> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <
>>> jaquilina@eagleeyet.net> wrote:
>>>
>>>>  Krish, EMR won't cost you much. With all the testing and data we ran
>>>> through the test systems, as well as the large amount of data when everything
>>>> was read, we paid about 15.00 USD. I honestly do not think that the specs
>>>> there would be enough, as Java can be pretty RAM hungry.
>>>>
>>>>
>>>>
>>>> ---
>>>> Regards,
>>>> Jonathan Aquilina
>>>> Founder Eagle Eye T
>>>>
>>>>   On 2015-03-06 00:41, Krish Donald wrote:
>>>>
>>>>  Hi,
>>>>
>>>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>>>> manager for 6-7 nodes.
>>>>
>>>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>>>> I would like to use free service as of now.
>>>>
>>>> Please advise.
>>>>
>>>> Thanks
>>>> Krish
>>>>
>>>>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Jonathan Aquilina <ja...@eagleeyet.net>.
 

Hi guys, I know you want to keep costs down, but why go through all
the effort of setting up EC2 instances yourselves? When you deploy EMR, it
takes care of provisioning and setting up the EC2 instances for you. All
configuration for the entire cluster is then done on the master node of
that cluster, and additional software is set up through the EMR console.
We were doing some geospatial calculations and loaded a third-party jar
file called esri into the EMR cluster. I then had to pass a small
bootstrap action (script) to have it distribute esri to the entire
cluster.

Why are you guys reinventing the wheel? 
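
For anyone wondering what such a bootstrap action might look like: a
bootstrap action is a script that EMR runs on every node while the cluster
starts, so the script only has to fetch the jar onto the node it is running
on. The following is a rough sketch, not the script we actually used. It
assumes the AWS CLI is available on the nodes; the bucket, jar name, and
target directory are placeholders.

```python
import subprocess

# Hypothetical values: bucket, jar name, and target directory are
# placeholders for illustration, not taken from a real cluster.
JAR_S3_URI = "s3://my-bucket/jars/esri.jar"
LOCAL_LIB_DIR = "/home/hadoop/lib/"


def build_copy_command(s3_uri, dest_dir):
    """Return the argv that copies one S3 object into a local directory."""
    if not dest_dir.endswith("/"):
        dest_dir += "/"
    # Assumes the AWS CLI ("aws s3 cp <src> <dst>") is on the node's PATH.
    return ["aws", "s3", "cp", s3_uri, dest_dir]


def fetch_jar(s3_uri=JAR_S3_URI, dest_dir=LOCAL_LIB_DIR):
    """Run the copy; raises CalledProcessError if the CLI reports failure."""
    subprocess.check_call(build_copy_command(s3_uri, dest_dir))
```

Because the same script runs on each node, calling fetch_jar() once per
node is enough to get the jar onto the whole cluster.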

---
Regards,
Jonathan Aquilina
Founder Eagle Eye T

On 2015-03-06 03:35, Alexander Pivovarov wrote: 

> I found the following solution to this problem
> 
> I registered 2 subdomains (public and local) for each computer on https://freedns.afraid.org/subdomain/ [1] 
> e.g. 
> myhadoop-nn.crabdance.com [2]
> myhadoop-nn-local.crabdance.com [3] 
> then I added cron job which sends http requests to update public and local ip on freedns server
> hint: public ip is detected automatically
> ip address for local name can be set using request parameter &address=10.x.x.x (don't forget to escape &)
> 
> as a result my nn computer has 2 DNS names with currently assigned ip addresses , e.g.
> myhadoop-nn.crabdance.com [2] 54.203.181.177
> myhadoop-nn-local.crabdance.com [3] 10.220.149.103
> 
> in hadoop configuration I can use local machine names
> to access my cluster outside of AWS I can use public names
> 
> Just curious if AWS provides easier way to name EC2 computers?
> 
> On Thu, Mar 5, 2015 at 5:19 PM, Jonathan Aquilina <ja...@eagleeyet.net> wrote:
> 
> I don't know how you would do that, to be honest. With EMR you have distinctions between master, core, and task nodes. If you need to change configuration you just ssh into the EMR master node.
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 02:11, Alexander Pivovarov wrote: 
> 
> What is the easiest way to assign names to aws ec2 computers?
> I guess computer need static hostname and dns name before it can be used in hadoop cluster. 
> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net> wrote:
> 
> When I started with EMR it was a lot of testing and trial and error. HUE is already supported as something that can be installed from the AWS console. What I need to know is if you need this cluster on all the time or this is going to be what Amazon calls a transient cluster, meaning you fire it up, run the job, and tear it back down.
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 01:10, Krish Donald wrote: 
> 
> Thanks Jonathan, 
> 
> I will try to explore EMR option also. 
> Can you please let me know the configuration which you have used it? 
> Can you please recommend for me also? 
> I would like to setup Hadoop cluster using cloudera manager and then would like to do below things: 
> 
> setup kerberos
> setup federation
> setup monitoring
> setup hadr
> backup and recovery
> authorization using sentry
> backup and recovery of individual components
> performance tuning
> upgrade of cdh 
> upgrade of CM
> Hue User Administration 
> Spark 
> Solr 
> 
> Thanks 
> Krish 
> 
> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <ja...@eagleeyet.net> wrote:
> 
> Krish, EMR won't cost you much. With all the testing and data we ran through the test systems, as well as the large amount of data when everything was read, we paid about 15.00 USD. I honestly do not think that the specs there would be enough, as Java can be pretty RAM hungry.
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 00:41, Krish Donald wrote: 
> 
> Hi, 
> 
> I am new to AWS and would like to setup Hadoop cluster using cloudera manager for 6-7 nodes. 
> 
> t2.micro on AWS; Is it enough for setting up Hadoop cluster ? 
> I would like to use free service as of now. 
> 
> Please advise. 
> 
> Thanks 
> Krish
 

Links:
------
[1] https://freedns.afraid.org/subdomain/
[2] http://myhadoop-nn.crabdance.com
[3] http://myhadoop-nn-local.crabdance.com
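
The cron-driven freedns update quoted above can be sketched roughly as
follows. This is only an illustration under stated assumptions: the update
endpoint and the per-host token are assumed to follow the "direct URL" form
that freedns shows for each subdomain, and the token values are
placeholders. The &address= parameter is the one mentioned in the quoted
post. Since no shell is involved here, the & does not need escaping.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the dynamic-update endpoint has this form; check the direct
# URL that freedns displays for your own subdomain before relying on it.
UPDATE_ENDPOINT = "https://freedns.afraid.org/dynamic/update.php"


def build_update_url(token, address=None):
    """Build the update URL for one subdomain.

    token   -- hypothetical per-subdomain update key
    address -- explicit IP to publish (e.g. the private 10.x.x.x address);
               when omitted, the caller's public IP is detected by the server
    """
    url = UPDATE_ENDPOINT + "?" + token
    if address is not None:
        url += "&" + urlencode({"address": address})
    return url


def send_update(token, address=None, timeout=10):
    """Send the update request and return the server's text response."""
    with urlopen(build_update_url(token, address), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

A cron job would call send_update() once for the public name (no address)
and once for the -local name, passing the instance's private IP.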

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by daemeon reiydelle <da...@gmail.com>.
Do a reverse lookup and use the name you find. There are a few areas
of Hadoop that require reverse name lookup, but in general just
create the relevant entries in /etc/hosts (shared across the cluster,
e.g. via Ansible if you have more than just a few nodes).

Not hard.
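
As a minimal sketch of the above, assuming you already know each node's
private IP: the snippet renders one /etc/hosts body that can be pushed to
every node, and checks what name a reverse lookup returns for an address.
The node names and IPs are made-up examples.

```python
import socket


def render_hosts(nodes):
    """Render /etc/hosts content from a {hostname: ip} mapping.

    The same text is meant to be copied to every node in the cluster.
    """
    lines = ["127.0.0.1 localhost"]
    for name, ip in sorted(nodes.items()):
        lines.append("{0} {1}".format(ip, name))
    return "\n".join(lines) + "\n"


def reverse_name(ip):
    """Return the primary name a reverse lookup gives for ip, or None."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None


# Hypothetical cluster: one namenode and two datanodes.
EXAMPLE_NODES = {"nn": "10.0.0.1", "dn1": "10.0.0.2", "dn2": "10.0.0.3"}
```

Calling render_hosts(EXAMPLE_NODES) gives the text to distribute, and
reverse_name(ip) on each node shows the name Hadoop would see for that IP.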


On Thu, Mar 5, 2015 at 6:35 PM, Alexander Pivovarov
<ap...@gmail.com> wrote:
> I found the following solution to this problem
>
> I registered 2 subdomains  (public and local) for each computer on
> https://freedns.afraid.org/subdomain/
> e.g.
> myhadoop-nn.crabdance.com
> myhadoop-nn-local.crabdance.com
>
> then I added cron job which sends http requests to update public and local
> ip on freedns server
> hint: public ip is detected automatically
> ip address for local name can be set using request parameter
> &address=10.x.x.x   (don't forget to escape &)
>
> as a result my nn computer has 2 DNS names with currently assigned ip
> addresses , e.g.
> myhadoop-nn.crabdance.com  54.203.181.177
> myhadoop-nn-local.crabdance.com   10.220.149.103
>
> in hadoop configuration I can use local machine names
> to access my cluster outside of AWS I can use public names
>
> Just curious if AWS provides easier way to name EC2 computers?
>
> On Thu, Mar 5, 2015 at 5:19 PM, Jonathan Aquilina <ja...@eagleeyet.net>
> wrote:
>>
>> I don't know how you would do that, to be honest. With EMR you have
>> distinctions between master, core, and task nodes. If you need to change configuration
>> you just ssh into the EMR master node.
>>
>>
>>
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>>
>> On 2015-03-06 02:11, Alexander Pivovarov wrote:
>>
>> What is the easiest way to assign names to aws ec2 computers?
>> I guess computer need static hostname and dns name before it can be used
>> in hadoop cluster.
>>
>> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net>
>> wrote:
>>>
>>> When I started with EMR it was a lot of testing and trial and error. HUE
>>> is already supported as something that can be installed from the AWS
>>> console. What I need to know is if you need this cluster on all the time or
>>> this is going to be what Amazon calls a transient cluster, meaning you fire
>>> it up, run the job, and tear it back down.
>>>
>>>
>>>
>>> ---
>>> Regards,
>>> Jonathan Aquilina
>>> Founder Eagle Eye T
>>>
>>> On 2015-03-06 01:10, Krish Donald wrote:
>>>
>>> Thanks Jonathan,
>>>
>>> I will try to explore EMR option also.
>>> Can you please let me know the configuration which you have used it?
>>> Can you please recommend for me also?
>>> I would like to setup Hadoop cluster using cloudera manager and then
>>> would like to do below things:
>>>
>>> setup kerberos
>>> setup federation
>>> setup monitoring
>>> setup hadr
>>> backup and recovery
>>> authorization using sentry
>>> backup and recovery of individual components
>>> performance tuning
>>> upgrade of cdh
>>> upgrade of CM
>>> Hue User Administration
>>> Spark
>>> Solr
>>>
>>>
>>> Thanks
>>> Krish
>>>
>>>
>>> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina
>>> <ja...@eagleeyet.net> wrote:
>>>>
>>>> Krish, EMR won't cost you much. With all the testing and data we ran
>>>> through the test systems, as well as the large amount of data when everything
>>>> was read, we paid about 15.00 USD. I honestly do not think that the specs
>>>> there would be enough, as Java can be pretty RAM hungry.
>>>>
>>>>
>>>>
>>>> ---
>>>> Regards,
>>>> Jonathan Aquilina
>>>> Founder Eagle Eye T
>>>>
>>>> On 2015-03-06 00:41, Krish Donald wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>>>> manager for 6-7 nodes.
>>>>
>>>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>>>> I would like to use free service as of now.
>>>>
>>>> Please advise.
>>>>
>>>> Thanks
>>>> Krish
>
>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Jonathan Aquilina <ja...@eagleeyet.net>.
 

Hi guys I know you guys want to keep costs down, but why go through all
the effort to setup ec2 instances when you deploy EMR it takes the time
to provision and setup the ec2 instances for you. All configuration then
for the entire cluster is done on the master node of the particular
cluster or setting up of additional software that is all done through
the EMR console. We were doing some geospatial calculations and we
loaded a 3rd party jar file called esri into the EMR cluster. I then had
to pass a small bootstrap action (script) to have it distribute esri to
the entire cluster. 

Why are you guys reinventing the wheel? 

---
Regards,
Jonathan Aquilina
Founder Eagle Eye T

On 2015-03-06 03:35, Alexander Pivovarov wrote: 

> I found the following solution to this problem
> 
> I registered 2 subdomains (public and local) for each computer on https://freedns.afraid.org/subdomain/ [1] 
> e.g. 
> myhadoop-nn.crabdance.com [2]
> myhadoop-nn-local.crabdance.com [3] 
> then I added cron job which sends http requests to update public and local ip on freedns server hint: public ip is detected automatically ip address for local name can be set using request parameter &address=10.x.x.x (don't forget to escape &)
> 
> as a result my nn computer has 2 DNS names with currently assigned ip addresses , e.g.
> myhadoop-nn.crabdance.com [2] 54.203.181.177
> myhadoop-nn-local.crabdance.com [3] 10.220.149.103
> 
> in hadoop configuration I can use local machine names to access my cluster outside of AWS I can use public names
> 
> Just curious if AWS provides easier way to name EC2 computers?
> 
> On Thu, Mar 5, 2015 at 5:19 PM, Jonathan Aquilina <ja...@eagleeyet.net> wrote:
> 
> I dont know how you would do that to be honest. With EMR you have destinctions master core and task nodes. If you need to change configuration you just ssh into the EMR master node. 
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 02:11, Alexander Pivovarov wrote: 
> 
> What is the easiest way to assign names to AWS EC2 computers?
> I guess a computer needs a static hostname and DNS name before it can be used in a Hadoop cluster.
> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net> wrote:
> 
> When I started with EMR it was a lot of testing and trial and error. HUE is already supported as something that can be installed from the AWS console. What I need to know is whether you need this cluster on all the time or whether this is going to be what Amazon calls a transient cluster, meaning you fire it up, run the job and tear it back down.
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 01:10, Krish Donald wrote: 
> 
> Thanks Jonathan, 
> 
> I will try to explore EMR option also. 
> Can you please let me know the configuration which you used?
> Can you also recommend one for me?
> I would like to set up a Hadoop cluster using Cloudera Manager and then would like to do the things below:
> 
> setup kerberos
> setup federation
> setup monitoring
> setup hadr
> backup and recovery
> authorization using sentry
> backup and recovery of individual components
> performance tuning
> upgrade of cdh 
> upgrade of CM
> Hue User Administration 
> Spark 
> Solr 
> 
> Thanks 
> Krish 
> 
> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <ja...@eagleeyet.net> wrote:
> 
> Krish, EMR won't cost you much. With all the testing and data we ran through the test systems, as well as the large amount of data when everything was read, we paid about 15.00 USD. I honestly do not think the t2.micro specs would be enough, as Java can be pretty RAM hungry.
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 00:41, Krish Donald wrote: 
> 
> Hi, 
> 
> I am new to AWS and would like to setup Hadoop cluster using cloudera manager for 6-7 nodes. 
> 
> t2.micro on AWS; Is it enough for setting up Hadoop cluster ? 
> I would like to use free service as of now. 
> 
> Please advise. 
> 
> Thanks 
> Krish
 

Links:
------
[1] https://freedns.afraid.org/subdomain/
[2] http://myhadoop-nn.crabdance.com
[3] http://myhadoop-nn-local.crabdance.com

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Alexander Pivovarov <ap...@gmail.com>.
I found the following solution to this problem.

I registered 2 subdomains (public and local) for each computer on
https://freedns.afraid.org/subdomain/
e.g.
myhadoop-nn.crabdance.com
myhadoop-nn-local.crabdance.com

Then I added a cron job which sends HTTP requests to update the public and
local IPs on the freedns server.
Hint: the public IP is detected automatically.
The IP address for the local name can be set using the request parameter
&address=10.x.x.x
(don't forget to escape the &).
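
A rough sketch of what the cron job calls is below. It is an illustration rather than my exact script: the update.php URL form is an assumption written from memory, so check the update URL shown for your subdomain in your freedns account, and the token argument is a placeholder for the per-subdomain key.

```bash
#!/bin/bash
# Sketch of a freedns update helper. The endpoint below is an assumption;
# confirm it against the update URL listed in your freedns account.

# Build the update URL for one subdomain.
# With no address, the server uses the caller's public IP.
# With an address, it is passed through the &address= parameter.
build_update_url() {
  local token="$1" address="${2:-}"
  local url="https://freedns.afraid.org/dynamic/update.php?${token}"
  if [ -n "${address}" ]; then
    url="${url}&address=${address}"
  fi
  printf '%s\n' "${url}"
}

# Send one update. Quoting the URL stops the shell from treating the &
# as "run in the background", which is the escaping mentioned above.
send_update() {
  curl -fsS "$(build_update_url "$@")"
}
```

The cron entry then runs a small wrapper (for example a hypothetical /home/ec2-user/freedns-update.sh) that calls send_update once with the public name's token, and once with the local name's token plus the private IP read from http://169.254.169.254/latest/meta-data/local-ipv4.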

As a result my nn computer has 2 DNS names with its currently assigned IP
addresses, e.g.
myhadoop-nn.crabdance.com  54.203.181.177
myhadoop-nn-local.crabdance.com   10.220.149.103

In the Hadoop configuration I can use the local machine names.
To access my cluster from outside AWS I can use the public names.

Just curious: does AWS provide an easier way to name EC2 computers?

On Thu, Mar 5, 2015 at 5:19 PM, Jonathan Aquilina <ja...@eagleeyet.net>
wrote:

>  I don't know how you would do that, to be honest. With EMR you have the
> distinction between master, core and task nodes. If you need to change
> configuration you just ssh into the EMR master node.
>
>
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
>  On 2015-03-06 02:11, Alexander Pivovarov wrote:
>
> What is the easiest way to assign names to AWS EC2 computers?
> I guess a computer needs a static hostname and DNS name before it can be
> used in a Hadoop cluster.
> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net>
> wrote:
>
>>  When I started with EMR it was a lot of testing and trial and error. HUE
>> is already supported as something that can be installed from the AWS
>> console. What I need to know is whether you need this cluster on all the
>> time or whether this is going to be what Amazon calls a transient cluster,
>> meaning you fire it up, run the job and tear it back down.
>>
>>
>>
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>>
>>  On 2015-03-06 01:10, Krish Donald wrote:
>>
>>  Thanks Jonathan,
>>
>> I will try to explore EMR option also.
>> Can you please let me know the configuration which you used?
>> Can you also recommend one for me?
>> I would like to set up a Hadoop cluster using Cloudera Manager and then
>> would like to do the things below:
>>
>> setup kerberos
>> setup federation
>> setup monitoring
>> setup hadr
>> backup and recovery
>> authorization using sentry
>> backup and recovery of individual components
>> performance tuning
>> upgrade of cdh
>> upgrade of CM
>> Hue User Administration
>> Spark
>> Solr
>>
>>
>> Thanks
>> Krish
>>
>>
>> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <
>> jaquilina@eagleeyet.net> wrote:
>>
>>>  Krish, EMR won't cost you much. With all the testing and data we ran
>>> through the test systems, as well as the large amount of data when
>>> everything was read, we paid about 15.00 USD. I honestly do not think the
>>> t2.micro specs would be enough, as Java can be pretty RAM hungry.
>>>
>>>
>>>
>>> ---
>>> Regards,
>>> Jonathan Aquilina
>>> Founder Eagle Eye T
>>>
>>>   On 2015-03-06 00:41, Krish Donald wrote:
>>>
>>>  Hi,
>>>
>>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>>> manager for 6-7 nodes.
>>>
>>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>>> I would like to use free service as of now.
>>>
>>> Please advise.
>>>
>>> Thanks
>>> Krish
>>>
>>>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by max scalf <or...@gmail.com>.
Here is an easy way to assign a static name to your EC2 instance.
When you launch an EC2 instance from the AWS console and get to the step
where you select the VPC and IP address, there is a field labelled
"USER DATA". Put the script below in it with the appropriate host name (change
CHANGE_HOST_NAME_HERE to whatever you want) and that should get you a
static name.

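#!/bin/bash
# EC2 user-data script: gives the instance a fixed hostname at first boot.
# Assumes a RHEL/CentOS-style image that reads /etc/sysconfig/network.

HOSTNAME_TAG=CHANGE_HOST_NAME_HERE

# Persist the hostname across reboots.
cat > /etc/sysconfig/network << EOF
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=${HOSTNAME_TAG}
EOF

# Map the instance's private IP (from the EC2 metadata service) to the name.
IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
echo "${IP} ${HOSTNAME_TAG}.localhost ${HOSTNAME_TAG}" >> /etc/hosts

# Apply the hostname to the running kernel, then restart networking.
echo "${HOSTNAME_TAG}" > /proc/sys/kernel/hostname
service network restart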
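
One caveat with the script above: it appends to /etc/hosts, so running it again (or adding the same node twice) leaves duplicate lines. The helper below is only a sketch of one way around that, not something I have run on the cluster; add_host_entry is a hypothetical name, and it assumes the hosts file already exists.

```bash
#!/bin/bash
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE after dropping any existing line that already
# lists NAME as a hostname or alias, so reruns do not pile up duplicates.
add_host_entry() {
  local file="$1" ip="$2" name="$3" tmp
  tmp="$(mktemp)"
  # Keep every line whose hostname fields (2..NF) do not include NAME.
  awk -v n="${name}" '{
    keep = 1
    for (i = 2; i <= NF; i++) if ($i == n) keep = 0
    if (keep) print
  }' "${file}" > "${tmp}"
  printf '%s %s\n' "${ip}" "${name}" >> "${tmp}"
  # Overwrite in place so the file keeps its owner and permissions.
  cat "${tmp}" > "${file}"
  rm -f "${tmp}"
}
```

In the user-data script this would be called as add_host_entry /etc/hosts "${IP}" "${HOSTNAME_TAG}" in place of the echo line, and the same call works for the entries of the other nodes.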


Also note that I was able to do this on a couple of spot instances for a
cheap price. The only catch is that once you shut one down or someone outbids
you, you lose that instance, but it's easy and cheap to play around with. I
used a couple of m3.medium instances for my NN/SNN and a couple more for
data nodes.

On Thu, Mar 5, 2015 at 7:19 PM, Jonathan Aquilina <ja...@eagleeyet.net>
wrote:

>  I don't know how you would do that, to be honest. With EMR you have the
> distinction between master, core and task nodes. If you need to change
> configuration you just ssh into the EMR master node.
>
>
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
>  On 2015-03-06 02:11, Alexander Pivovarov wrote:
>
> What is the easiest way to assign names to AWS EC2 computers?
> I guess a computer needs a static hostname and DNS name before it can be
> used in a Hadoop cluster.
> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net>
> wrote:
>
>>  When I started with EMR it was a lot of testing and trial and error. HUE
>> is already supported as something that can be installed from the AWS
>> console. What I need to know is whether you need this cluster on all the
>> time or whether this is going to be what Amazon calls a transient cluster,
>> meaning you fire it up, run the job and tear it back down.
>>
>>
>>
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>>
>>  On 2015-03-06 01:10, Krish Donald wrote:
>>
>>  Thanks Jonathan,
>>
>> I will try to explore EMR option also.
>> Can you please let me know the configuration which you used?
>> Can you also recommend one for me?
>> I would like to set up a Hadoop cluster using Cloudera Manager and then
>> would like to do the things below:
>>
>> setup kerberos
>> setup federation
>> setup monitoring
>> setup hadr
>> backup and recovery
>> authorization using sentry
>> backup and recovery of individual components
>> performance tuning
>> upgrade of cdh
>> upgrade of CM
>> Hue User Administration
>> Spark
>> Solr
>>
>>
>> Thanks
>> Krish
>>
>>
>> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <
>> jaquilina@eagleeyet.net> wrote:
>>
>>>  Krish, EMR won't cost you much. With all the testing and data we ran
>>> through the test systems, as well as the large amount of data when
>>> everything was read, we paid about 15.00 USD. I honestly do not think the
>>> t2.micro specs would be enough, as Java can be pretty RAM hungry.
>>>
>>>
>>>
>>> ---
>>> Regards,
>>> Jonathan Aquilina
>>> Founder Eagle Eye T
>>>
>>>   On 2015-03-06 00:41, Krish Donald wrote:
>>>
>>>  Hi,
>>>
>>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>>> manager for 6-7 nodes.
>>>
>>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>>> I would like to use free service as of now.
>>>
>>> Please advise.
>>>
>>> Thanks
>>> Krish
>>>
>>>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Alexander Pivovarov <ap...@gmail.com>.
I found the following solution to this problem

I registered 2 subdomains  (public and local) for each computer on
https://freedns.afraid.org/subdomain/
e.g.
myhadoop-nn.crabdance.com
myhadoop-nn-local.crabdance.com

Then I added a cron job which sends HTTP requests to update the public and
local IPs on the freedns server.
Hint: the public IP is detected automatically.
The IP address for the local name can be set using the request parameter
&address=10.x.x.x
(don't forget to escape the &)

As a result, my nn computer has 2 DNS names with the currently assigned IP
addresses, e.g.
myhadoop-nn.crabdance.com  54.203.181.177
myhadoop-nn-local.crabdance.com   10.220.149.103

In the Hadoop configuration I can use the local machine names.
To access my cluster from outside of AWS I can use the public names.
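
A minimal sketch of the update request such a cron job could send, assuming the freedns "direct URL" form https://freedns.afraid.org/dynamic/update.php?TOKEN, where TOKEN is the per-subdomain key shown on the freedns Dynamic DNS page. The endpoint constant, the function names, and the local-IP lookup are illustrative assumptions, not taken from the setup described above:

```python
import socket
from urllib.parse import quote
from urllib.request import urlopen

# Assumed endpoint: the freedns "direct URL" update form.
UPDATE_ENDPOINT = "https://freedns.afraid.org/dynamic/update.php"


def build_update_url(token, address=None):
    """Build the update URL for one subdomain.

    With no address, freedns detects the caller's public IP itself.
    With an address, the &address= parameter sets it explicitly
    (used here for the private 10.x.x.x name).
    """
    url = f"{UPDATE_ENDPOINT}?{token}"
    if address is not None:
        url += "&address=" + quote(address)
    return url


def local_ip():
    """Return the private IP of the default outgoing interface."""
    # connect() on a UDP socket sends no packets; it only makes the
    # kernel pick the interface it would route through.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect(("10.255.255.255", 1))
        return s.getsockname()[0]


def update(token, address=None):
    """Send the update request and return the server's text reply."""
    with urlopen(build_update_url(token, address), timeout=10) as resp:
        return resp.read().decode("utf-8", "replace").strip()
```

From cron, one call to update() with the public name's token and no address, plus one call with the local name's token and local_ip(), would keep both names current. Building the URL in Python also avoids having to escape the & for the shell.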

Just curious: does AWS provide an easier way to name EC2 computers?

On Thu, Mar 5, 2015 at 5:19 PM, Jonathan Aquilina <ja...@eagleeyet.net>
wrote:

>  I don't know how you would do that, to be honest. With EMR you have the
> distinction between master, core, and task nodes. If you need to change the
> configuration, you just SSH into the EMR master node.
>
>
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
>  On 2015-03-06 02:11, Alexander Pivovarov wrote:
>
> What is the easiest way to assign names to AWS EC2 computers?
> I guess a computer needs a static hostname and DNS name before it can be used
> in a Hadoop cluster.
> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net>
> wrote:
>
>>  When I started with EMR it was a lot of testing and trial and error. Hue
>> is already supported as something that can be installed from the AWS
>> console. What I need to know is whether you need this cluster on all the time
>> or whether this is going to be what Amazon calls a transient cluster, meaning
>> you fire it up, run the job, and tear it back down.
>>
>>
>>
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>>
>>  On 2015-03-06 01:10, Krish Donald wrote:
>>
>>  Thanks Jonathan,
>>
>> I will try to explore the EMR option also.
>> Can you please let me know the configuration which you used?
>> Can you please recommend one for me also?
>> I would like to set up a Hadoop cluster using Cloudera Manager and then
>> would like to do the things below:
>>
>> setup Kerberos
>> setup federation
>> setup monitoring
>> setup HA/DR
>> backup and recovery
>> authorization using Sentry
>> backup and recovery of individual components
>> performance tuning
>> upgrade of CDH
>> upgrade of CM
>> Hue user administration
>> Spark
>> Solr
>>
>>
>> Thanks
>> Krish
>>
>>
>> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <
>> jaquilina@eagleeyet.net> wrote:
>>
>>>  Krish, EMR won't cost you much. With all the testing and data we ran
>>> through the test systems, as well as the large amount of data when everything
>>> was read, we paid about 15.00 USD. I honestly do not think that the specs
>>> there would be enough, as Java can be pretty RAM hungry.
>>>
>>>
>>>
>>> ---
>>> Regards,
>>> Jonathan Aquilina
>>> Founder Eagle Eye T
>>>
>>>   On 2015-03-06 00:41, Krish Donald wrote:
>>>
>>>  Hi,
>>>
>>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>>> manager for 6-7 nodes.
>>>
>>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>>> I would like to use free service as of now.
>>>
>>> Please advise.
>>>
>>> Thanks
>>> Krish
>>>
>>>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Jonathan Aquilina <ja...@eagleeyet.net>.
 

I don't know how you would do that, to be honest. With EMR you have the
distinction between master, core, and task nodes. If you need to change the
configuration, you just SSH into the EMR master node.

---
Regards,
Jonathan Aquilina
Founder Eagle Eye T

On 2015-03-06 02:11, Alexander Pivovarov wrote: 

> What is the easiest way to assign names to AWS EC2 computers?
> I guess a computer needs a static hostname and DNS name before it can be used in a Hadoop cluster.
> On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net> wrote:
> 
> When I started with EMR it was a lot of testing and trial and error. Hue is already supported as something that can be installed from the AWS console. What I need to know is whether you need this cluster on all the time or whether this is going to be what Amazon calls a transient cluster, meaning you fire it up, run the job, and tear it back down.
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 01:10, Krish Donald wrote: 
> 
> Thanks Jonathan, 
> 
> I will try to explore the EMR option also.
> Can you please let me know the configuration which you used?
> Can you please recommend one for me also?
> I would like to set up a Hadoop cluster using Cloudera Manager and then would like to do the things below:
>
> setup Kerberos
> setup federation
> setup monitoring
> setup HA/DR
> backup and recovery
> authorization using Sentry
> backup and recovery of individual components
> performance tuning
> upgrade of CDH
> upgrade of CM
> Hue user administration
> Spark
> Solr
> 
> Thanks 
> Krish 
> 
> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <ja...@eagleeyet.net> wrote:
> 
> Krish, EMR won't cost you much. With all the testing and data we ran through the test systems, as well as the large amount of data when everything was read, we paid about 15.00 USD. I honestly do not think that the specs there would be enough, as Java can be pretty RAM hungry.
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 00:41, Krish Donald wrote: 
> 
> Hi, 
> 
> I am new to AWS and would like to setup Hadoop cluster using cloudera manager for 6-7 nodes. 
> 
> t2.micro on AWS; Is it enough for setting up Hadoop cluster ? 
> I would like to use free service as of now. 
> 
> Please advise. 
> 
> Thanks 
> Krish
 

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Alexander Pivovarov <ap...@gmail.com>.
What is the easiest way to assign names to AWS EC2 computers?
I guess a computer needs a static hostname and DNS name before it can be used in
a Hadoop cluster.
On Mar 5, 2015 4:36 PM, "Jonathan Aquilina" <ja...@eagleeyet.net> wrote:

>  When I started with EMR it was a lot of testing and trial and error. Hue
> is already supported as something that can be installed from the AWS
> console. What I need to know is whether you need this cluster on all the time
> or whether this is going to be what Amazon calls a transient cluster, meaning
> you fire it up, run the job, and tear it back down.
>
>
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
>  On 2015-03-06 01:10, Krish Donald wrote:
>
>  Thanks Jonathan,
>
> I will try to explore the EMR option also.
> Can you please let me know the configuration which you used?
> Can you please recommend one for me also?
> I would like to set up a Hadoop cluster using Cloudera Manager and then would
> like to do the things below:
>
> setup Kerberos
> setup federation
> setup monitoring
> setup HA/DR
> backup and recovery
> authorization using Sentry
> backup and recovery of individual components
> performance tuning
> upgrade of CDH
> upgrade of CM
> Hue user administration
> Spark
> Solr
>
>
> Thanks
> Krish
>
>
> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <jaquilina@eagleeyet.net
> > wrote:
>
>>  Krish, EMR won't cost you much. With all the testing and data we ran
>> through the test systems, as well as the large amount of data when everything
>> was read, we paid about 15.00 USD. I honestly do not think that the specs
>> there would be enough, as Java can be pretty RAM hungry.
>>
>>
>>
>> ---
>> Regards,
>> Jonathan Aquilina
>> Founder Eagle Eye T
>>
>>   On 2015-03-06 00:41, Krish Donald wrote:
>>
>>  Hi,
>>
>> I am new to AWS and would like to setup Hadoop cluster using cloudera
>> manager for 6-7 nodes.
>>
>> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
>> I would like to use free service as of now.
>>
>> Please advise.
>>
>> Thanks
>> Krish
>>
>>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Jonathan Aquilina <ja...@eagleeyet.net>.
 

When I started with EMR it was a lot of testing and trial and error. Hue
is already supported as something that can be installed from the AWS
console. What I need to know is whether you need this cluster on all the
time or whether this is going to be what Amazon calls a transient cluster,
meaning you fire it up, run the job, and tear it back down.
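
To make the distinction concrete, here is a rough sketch of how the two kinds of cluster could differ when launched through boto3's EMR client, whose run_job_flow call takes a KeepJobFlowAliveWhenNoSteps flag under Instances. The helper name, release label, instance types, and node count are placeholder assumptions, not a recommendation:

```python
def emr_cluster_request(name, transient, steps=()):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    transient=True  -> the cluster shuts down once its steps finish.
    transient=False -> the cluster stays up until terminated by hand.
    """
    return {
        "Name": name,
        "ReleaseLabel": "emr-5.36.0",          # placeholder release
        "Instances": {
            "MasterInstanceType": "m4.large",  # placeholder sizes
            "SlaveInstanceType": "m4.large",
            "InstanceCount": 3,
            # The one setting that separates the two modes.
            "KeepJobFlowAliveWhenNoSteps": not transient,
        },
        "Steps": list(steps),
        # Default role names created by the AWS console/CLI.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }
```

The result would be passed on as boto3.client("emr").run_job_flow(**emr_cluster_request("test", transient=True, steps=my_steps)), where my_steps is whatever job definition you want to run.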

---
Regards,
Jonathan Aquilina
Founder Eagle Eye T

On 2015-03-06 01:10, Krish Donald wrote: 

> Thanks Jonathan, 
> 
> I will try to explore the EMR option also.
> Can you please let me know the configuration which you used?
> Can you please recommend one for me also?
> I would like to set up a Hadoop cluster using Cloudera Manager and then would like to do the things below:
>
> setup Kerberos
> setup federation
> setup monitoring
> setup HA/DR
> backup and recovery
> authorization using Sentry
> backup and recovery of individual components
> performance tuning
> upgrade of CDH
> upgrade of CM
> Hue user administration
> Spark
> Solr
> 
> Thanks 
> Krish 
> 
> On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <ja...@eagleeyet.net> wrote:
> 
> Krish, EMR won't cost you much. With all the testing and data we ran through the test systems, as well as the large amount of data when everything was read, we paid about 15.00 USD. I honestly do not think that the specs there would be enough, as Java can be pretty RAM hungry.
> 
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> 
> On 2015-03-06 00:41, Krish Donald wrote: 
> 
> Hi, 
> 
> I am new to AWS and would like to setup Hadoop cluster using cloudera manager for 6-7 nodes. 
> 
> t2.micro on AWS; Is it enough for setting up Hadoop cluster ? 
> I would like to use free service as of now. 
> 
> Please advise. 
> 
> Thanks 
> Krish
 

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Krish Donald <go...@gmail.com>.
Thanks Jonathan,

I will try to explore the EMR option also.
Can you please let me know the configuration which you used?
Can you please recommend one for me also?
I would like to set up a Hadoop cluster using Cloudera Manager and then would
like to do the things below:

setup Kerberos
setup federation
setup monitoring
setup HA/DR
backup and recovery
authorization using Sentry
backup and recovery of individual components
performance tuning
upgrade of CDH
upgrade of CM
Hue user administration
Spark
Solr


Thanks
Krish


On Thu, Mar 5, 2015 at 3:57 PM, Jonathan Aquilina <ja...@eagleeyet.net>
wrote:

>  Krish, EMR won't cost you much. With all the testing and data we ran
> through the test systems, as well as the large amount of data when everything
> was read, we paid about 15.00 USD. I honestly do not think that the specs
> there would be enough, as Java can be pretty RAM hungry.
>
>
>
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
>
>  On 2015-03-06 00:41, Krish Donald wrote:
>
>  Hi,
>
> I am new to AWS and would like to setup Hadoop cluster using cloudera
> manager for 6-7 nodes.
>
> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
> I would like to use free service as of now.
>
> Please advise.
>
> Thanks
> Krish
>
>

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by Jonathan Aquilina <ja...@eagleeyet.net>.
 

Krish, EMR won't cost you much. With all the testing and data we ran
through the test systems, as well as the large amount of data, when
everything was read we paid about 15.00 USD. I honestly do not think
that the specs there would be enough, as Java can be pretty RAM hungry.
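
To make the RAM concern concrete, here is a rough sketch of a per-node
memory budget. A t2.micro has 1 GiB of RAM; the OS reserve and the JVM heap
sizes below are illustrative assumptions only (not measured values and not
Cloudera recommendations), and the function name is hypothetical.

```python
# Rough memory budget for one node. All heap sizes and the OS reserve are
# illustrative assumptions, not measured or recommended values.
T2_MICRO_RAM_MIB = 1024  # a t2.micro instance has 1 GiB of RAM


def remaining_ram_mib(total_mib, os_reserve_mib, daemon_heaps_mib):
    """Return RAM left after subtracting the OS reserve and all JVM heaps."""
    return total_mib - os_reserve_mib - sum(daemon_heaps_mib.values())


# Hypothetical worker node running a DataNode and a NodeManager,
# each assumed to get a 512 MiB heap.
worker_heaps = {"datanode": 512, "nodemanager": 512}

left = remaining_ram_mib(
    T2_MICRO_RAM_MIB, os_reserve_mib=256, daemon_heaps_mib=worker_heaps
)
# A negative result means the node is oversubscribed before any job runs.
print(left)
```

Under these assumed numbers the result is negative, so the node would have
to swap or run smaller heaps. Swapping in your own heap settings gives a
quick first check before launching instances.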

---
Regards,
Jonathan Aquilina
Founder Eagle Eye T

On 2015-03-06 00:41, Krish Donald wrote: 

> Hi, 
> 
> I am new to AWS and would like to setup Hadoop cluster using cloudera manager for 6-7 nodes. 
> 
> t2.micro on AWS; Is it enough for setting up Hadoop cluster ? 
> I would like to use free service as of now. 
> 
> Please advise. 
> 
> Thanks 
> Krish
 

Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?

Posted by daemeon reiydelle <da...@gmail.com>.
For testing, sure. While "t" is tiny, there is still a good bit of power
there. Make sure your storage is persistent, though. I don't recall how that
works on the free nodes.


On Thu, Mar 5, 2015 at 3:41 PM, Krish Donald <go...@gmail.com> wrote:

> Hi,
>
> I am new to AWS and would like to setup Hadoop cluster using cloudera
> manager for 6-7 nodes.
>
> t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
> I would like to use free service as of now.
>
> Please advise.
>
> Thanks
> Krish
>
