Posted to user@whirr.apache.org by Edmar Ferreira <ed...@gmail.com> on 2012/03/30 19:07:27 UTC

Re: Bad connection to FS. command aborted.

Hi Guys,

I just upgraded to whirr 0.7.1 but now I'm seeing the same error again.

*The error:*

12/03/30 13:25:22 WARN conf.Configuration: DEPRECATED: hadoop-site.xml
found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use
core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of
core-default.xml, mapred-default.xml and hdfs-default.xml respectively
12/03/30 13:25:24 INFO ipc.Client: Retrying connect to server: /107.21.79.75:8020. Already tried 0 time(s).
12/03/30 13:25:26 INFO ipc.Client: Retrying connect to server: /107.21.79.75:8020. Already tried 1 time(s).
12/03/30 13:25:27 INFO ipc.Client: Retrying connect to server: /107.21.79.75:8020. Already tried 2 time(s).
12/03/30 13:25:28 INFO ipc.Client: Retrying connect to server: /107.21.79.75:8020. Already tried 3 time(s).
12/03/30 13:25:30 INFO ipc.Client: Retrying connect to server: /107.21.79.75:8020. Already tried 4 time(s).
12/03/30 13:25:31 INFO ipc.Client: Retrying connect to server: /107.21.79.75:8020. Already tried 5 time(s).
12/03/30 13:25:33 INFO ipc.Client: Retrying connect to server: /107.21.79.75:8020. Already tried 6 time(s).
12/03/30 13:25:34 INFO ipc.Client: Retrying connect to server: /107.21.79.75:8020. Already tried 7 time(s).
12/03/30 13:25:35 INFO ipc.Client: Retrying connect to server: /107.21.79.75:8020. Already tried 8 time(s).
12/03/30 13:25:37 INFO ipc.Client: Retrying connect to server: /107.21.79.75:8020. Already tried 9 time(s).
Bad connection to FS. command aborted.


*Background Information:*

I already exported HADOOP_CONF_DIR.
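For reference, a quick way to double-check that the client is actually picking up the Whirr-generated config. The ~/.whirr/hadoop path is an assumption inferred from whirr.cluster-name=hadoop in the properties file below; adjust to your cluster name.

```shell
# Assumed path: Whirr writes client config under ~/.whirr/<cluster-name>.
export HADOOP_CONF_DIR=~/.whirr/hadoop

# The directory should contain the generated hadoop-site.xml;
# if it is missing or empty, the client falls back to local defaults.
ls "$HADOOP_CONF_DIR" 2>/dev/null || echo "config dir not found"
echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
```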
*Hadoop Version*

[Cluster]

Hadoop 0.20.2

Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707

Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010

[Local]

Hadoop 0.20.2

Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707

Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010

*Last lines of whirr.log:*

No directory, logging in with HOME=/

, error=, exitCode=0]

2012-03-30 13:15:30,387 INFO
[org.apache.whirr.actions.ScriptBasedClusterAction] (main) Successfully
executed configure script: [output=This function does nothing. It just
needs to exist so Statements.call("retry_helpers") doesn't call something
which doesn't exist

starting datanode, logging to
/var/log/hadoop/logs/hadoop-hadoop-datanode-ip-10-35-6-39.out

No directory, logging in with HOME=/

starting tasktracker, logging to
/var/log/hadoop/logs/hadoop-hadoop-tasktracker-ip-10-35-6-39.out

No directory, logging in with HOME=/

, error=, exitCode=0]

2012-03-30 13:15:30,387 INFO
[org.apache.whirr.actions.ScriptBasedClusterAction] (main) Successfully
executed configure script: [output=This function does nothing. It just
needs to exist so Statements.call("retry_helpers") doesn't call something
which doesn't exist

starting datanode, logging to
/var/log/hadoop/logs/hadoop-hadoop-datanode-ip-10-115-130-203.out

No directory, logging in with HOME=/

starting tasktracker, logging to
/var/log/hadoop/logs/hadoop-hadoop-tasktracker-ip-10-115-130-203.out

No directory, logging in with HOME=/

, error=, exitCode=0]

2012-03-30 13:15:30,387 INFO
[org.apache.whirr.actions.ScriptBasedClusterAction] (main) Finished running
configure phase scripts on all cluster instances

2012-03-30 13:15:30,387 INFO
[org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main)
Completed configuration of hadoop role hadoop-namenode

2012-03-30 13:15:30,388 INFO
[org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main)
Namenode web UI available at http://107.21.79.75:50070

2012-03-30 13:15:30,391 INFO
[org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main)
Wrote Hadoop site file
/Users/edmaroliveiraferreira/.whirr/hadoop/hadoop-site.xml

2012-03-30 13:15:30,393 INFO
[org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main)
Wrote Hadoop proxy script
/Users/edmaroliveiraferreira/.whirr/hadoop/hadoop-proxy.sh

2012-03-30 13:15:30,394 INFO
[org.apache.whirr.service.hadoop.HadoopJobTrackerClusterActionHandler]
(main) Completed configuration of hadoop role hadoop-jobtracker

2012-03-30 13:15:30,394 INFO
[org.apache.whirr.service.hadoop.HadoopJobTrackerClusterActionHandler]
(main) Jobtracker web UI available at http://107.21.79.75:50030

2012-03-30 13:15:30,394 INFO
[org.apache.whirr.service.hadoop.HadoopDataNodeClusterActionHandler] (main)
Completed configuration of hadoop role hadoop-datanode

2012-03-30 13:15:30,394 INFO
[org.apache.whirr.service.hadoop.HadoopTaskTrackerClusterActionHandler]
(main) Completed configuration of hadoop role hadoop-tasktracker

2012-03-30 13:15:30,395 INFO
[org.apache.whirr.state.FileClusterStateStore] (main) Wrote instances file
/Users/edmaroliveiraferreira/.whirr/hadoop/instances

2012-03-30 13:15:30,405 DEBUG [org.apache.whirr.service.ComputeCache]
(Thread-3) closing ComputeServiceContext [id=aws-ec2,
endpoint=https://ec2.us-east-1.amazonaws.com, apiVersion=2010-06-15,
identity=08WMRG9HQYYGVQDT57R2, iso3166Codes=[US-VA, US-CA, IE, SG, JP-13]]
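Since the log above shows Whirr writing both hadoop-site.xml and hadoop-proxy.sh, one hedged guess at the failure: the generated client config routes HDFS traffic through a SOCKS proxy, so if hadoop-proxy.sh is not left running, fs commands time out with exactly the retry pattern shown earlier. A sketch, with paths taken from the log and not verified against this setup:

```shell
# Assumed paths, from the whirr.log lines above.
export HADOOP_CONF_DIR=~/.whirr/hadoop

# Keep the proxy running in the background (or a separate terminal)
# while issuing hadoop commands; it tunnels traffic to the cluster.
sh ~/.whirr/hadoop/hadoop-proxy.sh &
PROXY_PID=$!

hadoop fs -ls /

# Stop the proxy when done.
kill "$PROXY_PID"
```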

*My hadoop-ec2.properties file*


whirr.cluster-name=hadoop

whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,22 hadoop-datanode+hadoop-tasktracker
whirr.instance-templates-max-percent-failures=100 hadoop-namenode+hadoop-jobtracker,90 hadoop-datanode+hadoop-tasktracker

whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}

whirr.location-id=us-east-1


Thanks.

On Fri, Feb 24, 2012 at 2:23 PM, Edmar Ferreira <
edmaroliveiraferreira@gmail.com> wrote:

> Yes, it makes sense. Looking forward to seeing the 0.9.0 version.
> Thanks for your great work, guys.
>
>
> On Fri, Feb 24, 2012 at 2:18 PM, Andrei Savu <sa...@gmail.com> wrote:
>
>>
>> On Fri, Feb 24, 2012 at 4:11 PM, Edmar Ferreira <
>> edmaroliveiraferreira@gmail.com> wrote:
>>
>>> Are there any plans to expand this limit?
>>
>>
>> Yes. The basic idea is that we should be able to start a large
>> cluster by resizing smaller ones in multiple steps, rebalancing
>> things along the way as needed. Does that make sense to you?
>>
>> I expect to have something functional for this in 0.9.0 by the time we
>> add the ability to resize clusters.
>>
>> Also there is some work happening in jclouds on being able to start a
>> large number of servers at the same time:
>> http://www.jclouds.org/documentation/reference/pool-design
>>
>
>
>
> --
> Edmar Ferreira
> Co-Founder at Everwrite
>
>


-- 
Edmar Ferreira
Co-Founder at Everwrite

Re: Bad connection to FS. command aborted.

Posted by Sean Zhang <zs...@gmail.com>.
Hi Edmar,

I just joined the mailing list, so I don't know the previous discussion about
this problem.
Have you tried logging in to the NameNode to check that the service is
actually running, with something like 'netstat -nl | grep 8020'? That could
help you identify the problem.
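That check could be sketched like this, assuming SSH access to the namenode host; the namenode log path is a guess modeled on the datanode log paths in whirr.log:

```shell
# Run on the namenode host.
# 1. Is anything listening on the HDFS port?
netstat -nl | grep 8020 && echo "namenode is listening" \
  || echo "nothing on 8020 -- namenode likely failed to start"

# 2. If nothing is listening, the startup error is usually near the end
#    of the namenode log (path is an assumption):
tail -n 50 /var/log/hadoop/logs/hadoop-hadoop-namenode-*.log 2>/dev/null
```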

Regards,
Sean
