Posted to user@whirr.apache.org by Andrei Savu <sa...@gmail.com> on 2011/12/08 00:14:25 UTC

Re: contribution to Whirr...

You are more than welcome! Thanks for adding WHIRR-445.

I think it's best if you start by contributing fixes around your pain
points (e.g. WHIRR-445, Hive as a service, etc.). It makes a lot of sense
to work on issues that directly affect your research.

Can you elaborate on how you are planning to use Whirr and for what kinds
of applications?

I am available to assist you as much as possible via the mailing list or
on IRC in #whirr.

Cheers,

-- Andrei Savu

On Thu, Dec 8, 2011 at 1:02 AM, Periya.Data <pe...@gmail.com> wrote:

> Dear Andrei,
>    Greetings. As you suggested, I created a Jira bug report on the
> JAVA_HOME stuff : https://issues.apache.org/jira/browse/WHIRR-445
>
> I would like to contribute to Whirr (even though I am facing some initial
> problems). Maybe I can start with some documentation and fixing minor bugs.
> I may need your assistance.
>
> Please let me know your thoughts.
>
> -Srini. (aka PD).
>
>
> On Wed, Dec 7, 2011 at 7:15 AM, Andrei Savu <sa...@gmail.com> wrote:
>
>> See inline.
>>
>> On Wed, Dec 7, 2011 at 7:14 AM, Periya.Data <pe...@gmail.com>wrote:
>>
>>> Thanks ! A few observations:
>>>
>>>    - After I export the conf dir and execute "hadoop fs -ls /", I see a
>>>    different dir structure from what I see when I SSH into the machine and
>>>    execute it as root. See outputs below.
>>>
>>> sri@PeriyaData:~$
>>> sri@PeriyaData:~$ export HADOOP_CONF_DIR=/\$HOME/.whirr/HadoopCluster/
>>> sri@PeriyaData:~$
>>> sri@PeriyaData:~$ hadoop fs -ls /
>>> Found 25 items
>>> -rw-------   1 root root    4767328 2011-11-02 12:55 /vmlinuz
>>> drwxr-xr-x   - root root      12288 2011-12-03 10:49 /etc
>>> dr-xr-xr-x   - root root          0 2011-12-02 03:28 /proc
>>> drwxrwxrwt   - root root       4096 2011-12-05 18:07 /tmp
>>> drwxr-xr-x   - root root       4096 2011-04-25 15:50 /srv
>>> -rw-r--r--   1 root root   13631900 2011-11-01 22:46 /initrd.img.old
>>> drwx------   - root root       4096 2011-11-23 22:27 /root
>>> drwxr-xr-x   - root root       4096 2011-04-21 09:50 /mnt
>>> drwxr-xr-x   - root root       4096 2011-12-02 09:01 /var
>>> drwxr-xr-x   - root root       4096 2011-10-01 19:14 /cdrom
>>> -rw-------   1 root root    4766528 2011-10-07 14:03 /vmlinuz.old
>>> drwxr-xr-x   - root root        780 2011-12-02 16:28 /run
>>> drwxr-xr-x   - root root       4096 2011-10-23 18:27 /usr
>>> drwx------   - root root      16384 2011-10-01 19:05 /lost+found
>>> drwxr-xr-x   - root root       4096 2011-11-22 22:26 /bin
>>> drwxr-xr-x   - root root       4096 2011-04-25 15:50 /opt
>>> drwxr-xr-x   - root root       4096 2011-10-01 19:21 /home
>>> drwxr-xr-x   - root root       4320 2011-12-02 11:29 /dev
>>> drwxr-xr-x   - root root       4096 2011-03-21 01:26 /selinux
>>> drwxr-xr-x   - root root       4096 2011-11-22 22:31 /boot
>>> drwxr-xr-x   - root root          0 2011-12-02 03:28 /sys
>>> -rw-r--r--   1 root root   13645361 2011-11-22 22:31 /initrd.img
>>> drwxr-xr-x   - root root       4096 2011-11-22 22:28 /lib
>>> drwxr-xr-x   - root root       4096 2011-12-03 10:49 /media
>>> drwxr-xr-x   - root root      12288 2011-11-22 22:29 /sbin
>>> sri@PeriyaData:~$
>>> sri@PeriyaData:~$
>>
>>
>> This is no different from the output you get when running "ls -l /", and
>> it is happening because Hadoop cannot find the config files: the escaped
>> \$HOME in your export produces a literal, non-existent path, so Hadoop
>> falls back to the local filesystem. Try:
>>
>> $ export HADOOP_CONF_DIR=~/.whirr/HadoopCluster/
>>
>> When running "hadoop fs -ls /" you should get the same output as below.
>>
>> Note: make sure the SOCKS proxy is running.
>>
>> % . ~/.whirr/HadoopCluster/hadoop-proxy.sh
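>>
>> Putting it together, the full client-side sequence would be (a sketch,
>> assuming the cluster is named HadoopCluster as in your output):
>>
>> $ export HADOOP_CONF_DIR=~/.whirr/HadoopCluster/
>> $ . ~/.whirr/HadoopCluster/hadoop-proxy.sh   # keep this running (e.g. in a second terminal)
>> $ hadoop fs -ls /                            # should now list HDFS, not the local filesystem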
>>
>>
>> *After SSH-ing into the master node:*
>>>
>>> sri@ip-10-90-131-240:~$ sudo su
>>> root@ip-10-90-131-240:/home/users/sri#
>>>
>>> root@ip-10-90-131-240::/home/users/jtv# jps
>>> 2860 Jps
>>> 2667 JobTracker
>>> 2088 NameNode
>>> root@ip-10-90-131-240::/home/users/jtv# hadoop fs -ls /
>>> Error: JAVA_HOME is not set.
>>> root@ip-10-90-131-240::/home/users/jtv#
>>>
>>> *After setting JAVA_HOME in the .bashrc file and sourcing it, I get the
>>> expected dir structure:*
>>>
>>> root@ip-10-90-131-240:/home/users/sri# hadoop fs -ls /
>>> Found 3 items
>>> drwxr-xr-x   - hadoop supergroup          0 2011-12-05 23:09 /hadoop
>>> drwxrwxrwx   - hadoop supergroup          0 2011-12-05 23:08 /tmp
>>> drwxrwxrwx   - hadoop supergroup          0 2011-12-06 01:16 /user
>>> root@ip-10-90-131-240:/home/users/sri#
>>> root@ip-10-90-131-240:/home/users/sri#
>>>
>>> Is the above normal behavior?
>>>
>>
>> It looks normal to me. I think you should be able to load data & run MR
>> jobs as expected. Can you open an issue
>> so that we can make sure that JAVA_HOME is exported as expected by the
>> install script?
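>>
>> In the meantime, a workaround along the lines of what you did would be
>> (a sketch; the JDK path below is a guess, derive the real one from
>> "readlink -f $(which java)"):
>>
>> # as root on the master node
>> $ echo 'export JAVA_HOME=/usr/lib/jvm/java-6-openjdk' >> ~/.bashrc
>> $ . ~/.bashrc
>> $ hadoop fs -ls /   # should no longer fail with "JAVA_HOME is not set."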
>>
>>
>>>
>>> Thanks,
>>> PD/
>>>
>>>
>>>
>>>  *Questions:*
>>>>>
>>>>>    1. Assuming everything is fine, where does Hadoop get installed
>>>>>    on the EC2 instance? What is the path?
>>>>>
>>>>>
>>>> Run jps as root and you should see the daemons running.
>>>>
>>>>>
>>>>>    1. Even if Hadoop is successfully installed on the EC2 instance,
>>>>>    are the env variables properly changed on that instance? E.g., the PATH
>>>>>    must be updated in either its .bashrc or .bash_profile, right?
>>>>>
>>>>>
>>>> Try to run "hadoop fs -ls /" as root.
>>>>
>>>>
>>>
>>
>

Re: contribution to Whirr...

Posted by Andrei Savu <sa...@gmail.com>.
>
>
> Thanks. Is there a "jumpstart" guide that explains:
> - How/where to get the latest SVN code base
> - The recommended way to build (Ant/Maven, etc.)
> - Basically, how to set up a local environment to run/test, etc. I have
> never done this before. I will also google around and try to find out.
> - After making a patch, what is the procedure to submit?
>
>
See How to Contribute on the wiki:
https://cwiki.apache.org/confluence/display/WHIRR/How+To+Contribute

This page may also be useful to you:
http://www.apache.org/dev/contributors.html
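
The usual Apache flow looks roughly like this (a sketch; treat the wiki
page above as authoritative, and note the repository URL below assumes the
standard ASF layout):

$ svn checkout https://svn.apache.org/repos/asf/whirr/trunk whirr
$ cd whirr
$ mvn clean install              # build and run the unit tests
$ # ... make your changes ...
$ svn diff > WHIRR-445.patch     # attach the patch to the JIRA issue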

Re: contribution to Whirr...

Posted by "Periya.Data" <pe...@gmail.com>.
inline...

On Wed, Dec 7, 2011 at 3:14 PM, Andrei Savu <sa...@gmail.com> wrote:

> You are more than welcome! Thanks for adding WHIRR-445.
>
> I think it's best if you start by contributing fixes around your pain
> points (e.g. 445, hive
> as a service etc.) It makes a lot of sense to work on issues that directly
> affect your research.
>

Will work on it :-)


>
> Can you elaborate on how are you planning to use Whirr and for what kind
> of applications?
>

In the past, I have been involved in setting up Hadoop clusters on raw
machines, locally. Setting up clusters on EC2 is new to me. I am planning
to use Whirr primarily to create Hadoop clusters, and I plan to use Hive,
Flume, and Sqoop along with it. The application is analytics on
subscriber/ISP data. I will be using Mahout / R sooner or later.
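
For reference, a minimal Whirr recipe for such a cluster would look
something like this (a sketch; the cluster name and role counts are
placeholders, and credentials are picked up from the environment):

$ cat > hadoop.properties <<'EOF'
whirr.cluster-name=HadoopCluster
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,2 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
EOF
$ whirr launch-cluster --config hadoop.properties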


>
> I am available to assist you as much as possible via the email list or IRC
> on #whirr
>

Thanks. Is there a "jumpstart" guide that explains:
- How/where to get the latest SVN code base
- The recommended way to build (Ant/Maven, etc.)
- Basically, how to set up a local environment to run/test, etc. I have
never done this before. I will also google around and try to find out.
- After making a patch, what is the procedure to submit?


Thanks,
Srini.


