Posted to common-user@hadoop.apache.org by Brian Wolf <br...@gmail.com> on 2010/01/30 09:27:45 UTC
hadoop under cygwin issue
Hi,
I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
hadoop "quickstart" web page.
I know sshd is running and I can "ssh localhost" without a password.
This is from my hadoop-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/cygwin/tmp/hadoop-${user.name}</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapred.job.reuse.jvm.num.tasks</name>
    <value>-1</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>webinterface.private.actions</name>
    <value>true</value>
  </property>
</configuration>
These are errors from my log files:
2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
Initializing RPC Metrics with hostName=NameNode, port=9000
2010-01-30 00:03:33,121 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
localhost/127.0.0.1:9000
2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=NameNode, sessionId=null
2010-01-30 00:03:33,181 INFO
org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
Initializing NameNodeMeterics using context
object:org.apache.hadoop.metrics.spi.NullContext
2010-01-30 00:03:34,603 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
fsOwner=brian,None,Administrators,Users
2010-01-30 00:03:34,603 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2010-01-30 00:03:34,603 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
isPermissionEnabled=false
2010-01-30 00:03:34,653 INFO
org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
Initializing FSNamesystemMetrics using context
object:org.apache.hadoop.metrics.spi.NullContext
2010-01-30 00:03:34,653 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
FSNamesystemStatusMBean
2010-01-30 00:03:34,803 INFO
org.apache.hadoop.hdfs.server.common.Storage: Storage directory
C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
2010-01-30 00:03:34,813 ERROR
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
state: storage directory does not exist or is not accessible.
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
server on 9000
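A note on where the path in the exception comes from: the log reports C:\cygwin\tmp\hadoop-brian\dfs\name, which matches the hadoop.tmp.dir value from the configuration above with /dfs/name appended. The sketch below is only an illustration in Python (not Hadoop code) of that ${...} substitution. It assumes that dfs.name.dir is left at a default of ${hadoop.tmp.dir}/dfs/name and that user.name resolves to brian; expand and props are hypothetical names used for this example.

```python
import re

def expand(value, props, max_depth=20):
    """Substitute ${name} references from props, repeating until nothing changes."""
    pattern = re.compile(r"\$\{([^}]+)\}")
    for _ in range(max_depth):
        # Unknown names are left untouched so the loop still terminates.
        new_value = pattern.sub(lambda m: props.get(m.group(1), m.group(0)), value)
        if new_value == value:
            return value
        value = new_value
    return value

props = {
    "user.name": "brian",                                   # assumed from the log output
    "hadoop.tmp.dir": "/cygwin/tmp/hadoop-${user.name}",    # from hadoop-site.xml above
    "dfs.name.dir": "${hadoop.tmp.dir}/dfs/name",           # assumed default, not set above
}

name_dir = expand(props["dfs.name.dir"], props)
print(name_dir)  # /cygwin/tmp/hadoop-brian/dfs/name
```

If that directory was never created, the NameNode's startup check on its storage directory would be expected to fail in the way the log shows.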
=========================================================
2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
problem cleaning system directory: null
java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
connection exception: java.net.ConnectException: Connection refused: no
further information
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
    at org.apache.hadoop.ipc.Client.call(Client.java:700)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at $Proxy4.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
Thanks
Brian
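The second log excerpt shows a client being refused on localhost:9000, which is what one would expect if nothing is listening on that port, for instance after the NameNode has stopped as in the first excerpt. One way to check this independently of Hadoop is a plain TCP connect. The following is a minimal Python sketch under that assumption; port_open is a hypothetical helper name, and the host and port are taken from the fs.default.name value above.

```python
import socket

def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False

# Host and port from fs.default.name (hdfs://localhost:9000).
print("NameNode port reachable:", port_open("127.0.0.1", 9000))
```

A False result here would point at the NameNode process itself (not running, or exited during startup) and not at ssh or the client configuration.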
Re: hadoop under cygwin issue
Posted by Brian Wolf <br...@gmail.com>.
Thanks for the insight, Ed. That's actually a pretty big "gestalt" for
me; I have to process it a bit (I had read about it, of course).
Brian
Re: hadoop under cygwin issue
Posted by Ed Mazur <ma...@cs.umass.edu>.
Brian,
It looks like you're confusing your local file system with HDFS. HDFS
sits on top of your file system and is where data for (non-standalone)
Hadoop jobs comes from. You can inspect it with "fs -ls ...", so do
something like "hadoop fs -lsr /" to see everything in HDFS. This will
probably shed some light on why your first attempt failed.
/user/brian/input should be a directory with several xml files.
Ed
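To make the distinction concrete: with fs.default.name set to hdfs://localhost:9000, a path given to a job without a scheme is looked up in HDFS, not on the local disk. A relative path is taken relative to the user's HDFS home directory (/user/brian here), and an absolute path is taken from the HDFS root. The sketch below mimics that resolution in Python as an illustration only; qualify is a hypothetical helper, not a Hadoop API, and its defaults are taken from the configuration and error messages quoted in this thread.

```python
import posixpath

def qualify(path, default_fs="hdfs://localhost:9000", user="brian"):
    """Illustrate how a job input path is resolved against the default file system."""
    if "://" in path:
        # An explicit scheme (e.g. file:///...) is kept as given.
        return path
    if not path.startswith("/"):
        # Relative paths are resolved against the user's HDFS home directory.
        path = posixpath.join("/user", user, path)
    return default_fs.rstrip("/") + path

print(qualify("input"))
# hdfs://localhost:9000/user/brian/input
print(qualify("/usr/local/hadoop-0.19.2/input"))
# hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
```

This would account for both error messages quoted below: "input" became a path under /user/brian in HDFS, and /usr/local/hadoop-0.19.2/input was looked up in HDFS, where it does not exist, even though a directory of that name exists on the local file system.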
On Wed, Feb 3, 2010 at 5:17 PM, Brian Wolf <br...@gmail.com> wrote:
> Alex Kozlov wrote:
>>
>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> : 0
>>
>> Your datanode is dead. Look at the logs in the $HADOOP_HOME/logs directory
>> (or where your logs are) and check the errors.
>>
>> Alex K
>>
>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>>
>>
>
>
>
> Thanks for your help, Alex,
>
> I managed to get past that problem, but now I have another one.
>
> When I try to run this example as stated on the quickstart webpage:
>
> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>
> I get this error:
> =============================================================
> java.io.IOException: Not a file:
> hdfs://localhost:9000/user/brian/input/conf
> =========================================================
> So it seems to default to my home directory when looking for "input", and it
> apparently needs an absolute file path. However, when I run it that way:
>
> $ bin/hadoop jar hadoop-*-examples.jar grep /usr/local/hadoop-0.19.2/input
> output 'dfs[a-z.]+'
>
> ==============================================================
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
> ==============================================================
> It still isn't happy, although this path -> /usr/local/hadoop-0.19.2/input
> <- does exist.
>>>
>>> Aaron,
>>>
>>> Thanks for your help. I carefully went through the steps again a couple
>>> of times, and then ran
>>>
>>> bin/hadoop namenode -format
>>>
>>> (By the way, it asks if I want to reformat; I've tried it both ways.)
>>>
>>>
>>> then
>>>
>>>
>>> bin/start-dfs.sh
>>>
>>> and
>>>
>>> bin/start-all.sh
>>>
>>>
>>> and then
>>> bin/hadoop fs -put conf input
>>>
>>> Now the output of this seemed cryptic:
>>>
>>>
>>> put: Target input/conf is a directory
>>>
>>> (??)
>>>
>>> and when I tried
>>>
>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>
>>> It says something about 0 nodes
>>>
>>> (from log file)
>>>
>>> 2010-02-01 13:26:29,874 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>>>
>>> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>> dst=null perm=brian:supergroup:rw-r--r--
>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 3 on 9000, call
>>>
>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException:
>>> File
>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>> could
>>> only be replicated to 0 nodes, instead of 1
>>> java.io.IOException: File
>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>> could
>>> only be replicated to 0 nodes, instead of 1
>>> at
>>>
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>> at
>>>
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at
>>>
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> at
>>>
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>
>>>
>>>
>>>
>>> To maybe rule out something regarding ports or ssh, this is what I get when I run netstat:
>>>
>>> TCP 127.0.0.1:9000 0.0.0.0:0 LISTENING
>>> TCP 127.0.0.1:9001 0.0.0.0:0 LISTENING
>>>
>>>
>>> and when I browse to http://localhost:50070/
>>>
>>>
>>> Cluster Summary
>>>
>>> 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB /
>>> 992.31 MB (0%)
>>> Configured Capacity : 0 KB
>>> DFS Used : 0 KB
>>> Non DFS Used : 0 KB
>>> DFS Remaining : 0 KB
>>> DFS Used% : 100 %
>>> DFS Remaining% : 0 %
>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> : 0
>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes> : 0
>>>
>>>
>>> So I'm still a bit in the dark, I guess.
>>>
>>> Thanks
>>> Brian
>>>
>>>
>>>
>>>
>>> Aaron Kimball wrote:
>>>
>>>
>>>>
>>>> Brian, it looks like you missed a step in the instructions. You'll need to
>>>> format the HDFS filesystem instance before starting the NameNode server:
>>>>
>>>> You need to run:
>>>>
>>>> $ bin/hadoop namenode -format
>>>>
>>>> .. then you can do bin/start-dfs.sh
>>>> Hope this helps,
>>>> - Aaron
Re: hadoop under cygwin issue
Posted by Brian Wolf <br...@gmail.com>.
Alex Kozlov wrote:
> Hi Brian,
>
> Is your namenode running? Try 'hadoop fs -ls /'.
>
> Alex
>
>
> On Mar 12, 2010, at 5:20 PM, Brian Wolf <br...@gmail.com> wrote:
>
>> Hi Alex,
>>
>> I am back on this problem. It seems to work, but I have this issue
>> connecting to the server.
>> I can connect with 'ssh localhost' OK.
>>
>> Thanks
>> Brian
>>
>> $ bin/hadoop jar hadoop-*-examples.jar pi 2 2
>> Number of Maps = 2
>> Samples per Map = 2
>> 10/03/12 17:16:17 INFO ipc.Client: Retrying connect to server:
>> localhost/127.0.0.1:9000. Already tried 0 time(s).
>> 10/03/12 17:16:19 INFO ipc.Client: Retrying connect to server:
>> localhost/127.0.0.1:9000. Already tried 1 time(s).
>>
>>
>>
>> Alex Kozlov wrote:
>>> Can you try a simpler job (just to make sure your setup works):
>>>
>>> $ hadoop jar $HADOOP_INSTALL/hadoop-*-examples.jar pi 2 2
>>>
>>> Alex K
>>>
>>> On Wed, Feb 3, 2010 at 8:26 PM, Brian Wolf <br...@gmail.com> wrote:
>>>
>>>
>>>> Alex, thanks for the help. It seems to start now; however:
>>>>
>>>>
>>>> $ bin/hadoop jar hadoop-*-examples.jar grep -fs local input output
>>>> 'dfs[a-z.]+'
>>>> 10/02/03 20:02:41 WARN fs.FileSystem: "local" is a deprecated
>>>> filesystem
>>>> name. Use "file:///" instead.
>>>> 10/02/03 20:02:43 INFO mapred.FileInputFormat: Total input paths to
>>>> process
>>>> : 3
>>>> 10/02/03 20:02:44 INFO mapred.JobClient: Running job:
>>>> job_201002031354_0013
>>>> 10/02/03 20:02:45 INFO mapred.JobClient: map 0% reduce 0%
>>>>
>>>>
>>>>
>>>> It hangs here (is a pseudo-distributed cluster supposed to work?)
>>>>
>>>>
>>>> These are the bottoms of various log files.
>>>>
>>>> conf log file
>>>>
>>>>
>>>> <property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
>>>>
>>>>
>>>> <property><name>mapred.input.dir</name><value>file:/C:/OpenSSH/usr/local/hadoop-0.19.2/input</value></property>
>>>>
>>>> <property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030</value></property>
>>>> <property><name>io.file.buffer.size</name><value>4096</value></property>
>>>>
>>>>
>>>> <property><name>mapred.jobtracker.restart.recover</name><value>false</value></property>
>>>>
>>>>
>>>> <property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
>>>>
>>>>
>>>> <property><name>dfs.datanode.handler.count</name><value>3</value></property>
>>>>
>>>>
>>>> <property><name>mapred.reduce.copy.backoff</name><value>300</value></property>
>>>>
>>>> <property><name>mapred.task.profile</name><value>false</value></property>
>>>>
>>>>
>>>> <property><name>dfs.replication.considerLoad</name><value>true</value></property>
>>>>
>>>>
>>>> <property><name>jobclient.output.filter</name><value>FAILED</value></property>
>>>>
>>>>
>>>> <property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
>>>>
>>>>
>>>> <property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
>>>>
>>>> <property><name>fs.checkpoint.size</name><value>67108864</value></property>
>>>>
>>>>
>>>>
>>>> bottom
>>>> namenode log
>>>>
>>>> added to blk_6520091160827873550_1036 size 570
>>>> 2010-02-03 20:02:43,826 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>>>>
>>>> dst=null perm=brian:supergroup:rw-r--r--
>>>> 2010-02-03 20:02:43,866 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1
>>>> cmd=setPermission
>>>>
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>>>>
>>>> dst=null perm=brian:supergroup:rw-r--r--
>>>> 2010-02-03 20:02:44,026 INFO org.apache.hadoop.hdfs.StateChange:
>>>> BLOCK*
>>>> NameSystem.allocateBlock:
>>>> /cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml.
>>>> blk_517844159758473296_1037
>>>> 2010-02-03 20:02:44,076 INFO org.apache.hadoop.hdfs.StateChange:
>>>> BLOCK*
>>>> NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is
>>>> added to
>>>> blk_517844159758473296_1037 size 16238
>>>> 2010-02-03 20:02:44,257 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>>>>
>>>> dst=null perm=null
>>>> 2010-02-03 20:02:44,527 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.jar
>>>>
>>>> dst=null perm=null
>>>> 2010-02-03 20:02:45,258 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.split
>>>>
>>>> dst=null perm=null
>>>>
>>>>
>>>> bottom
>>>> datanode log
>>>>
>>>> 2010-02-03 20:02:44,046 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
>>>> blk_517844159758473296_1037 src: /127.0.0.1:4069 dest:
>>>> /127.0.0.1:50010
>>>> 2010-02-03 20:02:44,076 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:4069, dest: /127.0.0.1:50010, bytes: 16238, op: HDFS_WRITE,
>>>> cliID: DFSClient_-1424524646, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>>> blk_517844159758473296_1037
>>>> 2010-02-03 20:02:44,086 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0
>>>> for block
>>>> blk_517844159758473296_1037 terminating
>>>> 2010-02-03 20:02:44,457 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:50010, dest: /127.0.0.1:4075, bytes: 16366, op: HDFS_READ,
>>>> cliID: DFSClient_-548531246, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>>> blk_517844159758473296_1037
>>>> 2010-02-03 20:02:44,677 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:50010, dest: /127.0.0.1:4076, bytes: 135168, op: HDFS_READ,
>>>> cliID: DFSClient_-548531246, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>>> blk_-2806977820057440405_1035
>>>> 2010-02-03 20:02:45,278 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:50010, dest: /127.0.0.1:4077, bytes: 578, op: HDFS_READ,
>>>> cliID:
>>>> DFSClient_-548531246, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104,
>>>> blockid: blk_6520091160827873550_1036
>>>> 2010-02-03 20:04:10,451 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_3301977249866081256_1031
>>>> 2010-02-03 20:09:35,658 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_9116729021606317943_1025
>>>> 2010-02-03 20:09:44,671 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_8602436668984954947_1026
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> jobtracker log
>>>>
>>>> Input size for job job_201002031354_0012 = 53060
>>>> 2010-02-03 19:48:37,599 INFO
>>>> org.apache.hadoop.mapred.JobInProgress: Split
>>>> info for job:job_201002031354_0012
>>>> 2010-02-03 19:48:37,649 INFO org.apache.hadoop.net.NetworkTopology:
>>>> Adding
>>>> a new node: /default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000000 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000001 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000002 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000003 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO
>>>> org.apache.hadoop.mapred.JobInProgress: Input
>>>> size for job job_201002031354_0013 = 53060
>>>> 2010-02-03 20:02:45,278 INFO
>>>> org.apache.hadoop.mapred.JobInProgress: Split
>>>> info for job:job_201002031354_0013
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000000 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000001 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000002 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000003 has split on
>>>> node:/default-rack/localhost
>>>>
>>>>
>>>>
>>>> Thanks
>>>> Brian
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Alex Kozlov wrote:
>>>>
>>>>
>>>>> Try
>>>>>
>>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>>> file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
>>>>>
>>>>> file:/// is a magical prefix to force hadoop to look for the file
>>>>> in the
>>>>> local FS
>>>>>
>>>>> You can also force it to look into local FS by giving '-fs local'
>>>>> or '-fs
>>>>> file:///' option to the hadoop executable
>>>>>
>>>>> These options basically override the *fs.default.name* configuration
>>>>> setting, which should be in your core-site.xml file (hadoop-site.xml on
>>>>> 0.19.x)
>>>>>
>>>>> You can also copy the content of the input directory to HDFS by
>>>>> executing
>>>>>
>>>>> $ bin/hadoop fs -mkdir input
>>>>> $ bin/hadoop fs -copyFromLocal input/* input
>>>>>
>>>>> Hope this helps
>>>>>
>>>>> Alex K
>>>>>
>>>>> On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Alex Kozlov wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> :
>>>>>>> 0
>>>>>>>
>>>>>>> Your datanode is dead. Look at the logs in the $HADOOP_HOME/logs
>>>>>>> directory
>>>>>>> (or where your logs are) and check the errors.
>>>>>>>
>>>>>>> Alex K
>>>>>>>
>>>>>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Thanks for your help, Alex,
>>>>>>
>>>>>> I managed to get past that problem, now I have this problem:
>>>>>>
>>>>>> However, when I try to run this example as stated on the quickstart
>>>>>> webpage:
>>>>>>
>>>>>>
>>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>>>>
>>>>>> I get this error;
>>>>>> =============================================================
>>>>>> java.io.IOException: Not a file:
>>>>>> hdfs://localhost:9000/user/brian/input/conf
>>>>>> =========================================================
>>>>>> so it seems to default to my home directory when looking for "input". It
>>>>>> apparently needs an absolute filepath; however, when I run it that
>>>>>> way:
>>>>>>
>>>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>>>> /usr/local/hadoop-0.19.2/input
>>>>>> output 'dfs[a-z.]+'
>>>>>>
>>>>>> ==============================================================
>>>>>> org.apache.hadoop.mapred.InvalidInputException: Input path does not
>>>>>> exist:
>>>>>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>>>>>> ==============================================================
>>>>>> It still isn't happy although this part ->
>>>>>> /usr/local/hadoop-0.19.2/input
>>>>>> <- does exist
>>>>>>
>>>>>> Aaron,
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Thanks for your help. I carefully went through the steps again a
>>>>>>> couple
>>>>>>>
>>>>>>>> times , and ran
>>>>>>>>
>>>>>>>> after this
>>>>>>>> bin/hadoop namenode -format
>>>>>>>>
>>>>>>>> (by the way, it asks if I want to reformat, I've tried it both
>>>>>>>> ways)
>>>>>>>>
>>>>>>>>
>>>>>>>> then
>>>>>>>>
>>>>>>>>
>>>>>>>> bin/start-dfs.sh
>>>>>>>>
>>>>>>>> and
>>>>>>>>
>>>>>>>> bin/start-all.sh
>>>>>>>>
>>>>>>>>
>>>>>>>> and then
>>>>>>>> bin/hadoop fs -put conf input
>>>>>>>>
>>>>>>>> now the return for this seemed cryptic:
>>>>>>>>
>>>>>>>>
>>>>>>>> put: Target input/conf is a directory
>>>>>>>>
>>>>>>>> (??)
>>>>>>>>
>>>>>>>> and when I tried
>>>>>>>>
>>>>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output
>>>>>>>> 'dfs[a-z.]+'
>>>>>>>>
>>>>>>>> It says something about 0 nodes
>>>>>>>>
>>>>>>>> (from log file)
>>>>>>>>
>>>>>>>> 2010-02-01 13:26:29,874 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>>>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>>>>>>>>
>>>>>>>>
>>>>>>>> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>>>>
>>>>>>>> dst=null perm=brian:supergroup:rw-r--r--
>>>>>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC
>>>>>>>> Server
>>>>>>>> handler 3 on 9000, call
>>>>>>>>
>>>>>>>>
>>>>>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>>>>>>>
>>>>>>>> DFSClient_725490811) from 127.0.0.1:3003: error:
>>>>>>>> java.io.IOException:
>>>>>>>> File
>>>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>>>>
>>>>>>>> could
>>>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>>>> java.io.IOException: File
>>>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>>>>
>>>>>>>> could
>>>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>>>>>>
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>>>>>>
>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>>
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> To maybe rule out something regarding ports or ssh , when I run
>>>>>>>> netstat:
>>>>>>>>
>>>>>>>> TCP 127.0.0.1:9000 0.0.0.0:0 LISTENING
>>>>>>>> TCP 127.0.0.1:9001 0.0.0.0:0 LISTENING
>>>>>>>>
>>>>>>>>
>>>>>>>> and when I browse to http://localhost:50070/
>>>>>>>>
>>>>>>>>
>>>>>>>> Cluster Summary
>>>>>>>>
>>>>>>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size
>>>>>>>> is 8.01
>>>>>>>> MB
>>>>>>>> /
>>>>>>>> 992.31 MB (0%)
>>>>>>>> *
>>>>>>>> Configured Capacity : 0 KB
>>>>>>>> DFS Used : 0 KB
>>>>>>>> Non DFS Used : 0 KB
>>>>>>>> DFS Remaining : 0 KB
>>>>>>>> DFS Used% : 100 %
>>>>>>>> DFS Remaining% : 0 %
>>>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> :
>>>>>>>> 0
>>>>>>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes> :
>>>>>>>> 0
>>>>>>>>
>>>>>>>>
>>>>>>>> so I'm still a bit in the dark, I guess.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Brian
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Aaron Kimball wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Brian, it looks like you missed a step in the instructions.
>>>>>>>>> You'll
>>>>>>>>> need
>>>>>>>>> to
>>>>>>>>> format the hdfs filesystem instance before starting the NameNode
>>>>>>>>> server:
>>>>>>>>>
>>>>>>>>> You need to run:
>>>>>>>>>
>>>>>>>>> $ bin/hadoop namenode -format
>>>>>>>>>
>>>>>>>>> .. then you can do bin/start-dfs.sh
>>>>>>>>> Hope this helps,
>>>>>>>>> - Aaron
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per
>>>>>>>>>> directions on
>>>>>>>>>> the
>>>>>>>>>> hadoop "quickstart" web page.
>>>>>>>>>>
>>>>>>>>>> I know sshd is running and I can "ssh localhost" without a
>>>>>>>>>> password.
>>>>>>>>>>
>>>>>>>>>> This is from my hadoop-site.xml
>>>>>>>>>>
>>>>>>>>>> <configuration>
>>>>>>>>>> <property>
>>>>>>>>>> <name>hadoop.tmp.dir</name>
>>>>>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>fs.default.name</name>
>>>>>>>>>> <value>hdfs://localhost:9000</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>mapred.job.tracker</name>
>>>>>>>>>> <value>localhost:9001</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>>>>>> <value>-1</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>dfs.replication</name>
>>>>>>>>>> <value>1</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>dfs.permissions</name>
>>>>>>>>>> <value>false</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>webinterface.private.actions</name>
>>>>>>>>>> <value>true</value>
>>>>>>>>>> </property>
>>>>>>>>>> </configuration>
>>>>>>>>>>
>>>>>>>>>> These are errors from my log files:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2010-01-30 00:03:33,091 INFO
>>>>>>>>>> org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>>>>>>> localhost/
>>>>>>>>>> 127.0.0.1:9000
>>>>>>>>>> 2010-01-30 00:03:33,161 INFO
>>>>>>>>>> org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>>>>>> Initializing JVM Metrics with processName=NameNode,
>>>>>>>>>> sessionId=null
>>>>>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>>>>>> Initializing
>>>>>>>>>> NameNodeMeterics using context
>>>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> supergroup=supergroup
>>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> isPermissionEnabled=false
>>>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>>>>>>>
>>>>>>>>>> Initializing FSNamesystemMetrics using context
>>>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>>>>>>> FSNamesystemStatusMBean
>>>>>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does
>>>>>>>>>> not exist.
>>>>>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> FSNamesystem
>>>>>>>>>> initialization failed.
>>>>>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>>>>>>>
>>>>>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an
>>>>>>>>>> inconsistent
>>>>>>>>>> state:
>>>>>>>>>> storage directory does not exist or is not accessible.
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>>>>>>>
>>>>>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server:
>>>>>>>>>> Stopping
>>>>>>>>>> server
>>>>>>>>>> on 9000
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> =========================================================
>>>>>>>>>>
>>>>>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client:
>>>>>>>>>> Retrying
>>>>>>>>>> connect
>>>>>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>>>>>> problem cleaning system directory: null
>>>>>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000
>>>>>>>>>> failed
>>>>>>>>>> on
>>>>>>>>>> connection exception: java.net.ConnectException: Connection
>>>>>>>>>> refused:
>>>>>>>>>> no
>>>>>>>>>> further information
>>>>>>>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>>>>>> at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>>>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>>>>> at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Brian
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
Hi Alex,
I'm using a different system; it seems to be running better now.
Thanks
Brian
Re: hadoop under cygwin issue
Posted by Brian Wolf <br...@gmail.com>.
Hi Alex,
It seems to be:
$ bin/hadoop fs -ls /
Found 1 items
drwxr-xr-x - brian supergroup 0 2010-03-13 10:45 /tmp
However, I think this might be the source of the problems: whenever I
invoke any of the scripts, I always get this error:
localhost: /usr/bin/bash: /usr/local/hadoop-0.20.2/bin/hadoop-daemon.sh:
No such file or directory
I'm thinking this has something to do with cygwin (?). I've been careful
not to open these files with a Windows editor (I've already been through
that headache!). I guess I have been ignoring this error, but I think
whatever hadoop-daemon is supposed to do isn't getting done.
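A quick way to rule out the line-ending worry mentioned above is to scan the scripts for carriage returns. The following is only a sketch: check_script is a hypothetical helper, not part of Hadoop, and it assumes the scripts are readable from the Cygwin shell. Because bash also prints "No such file or directory" when it cannot find a script at all, the helper reports missing files too.

```shell
# Hypothetical helper (not part of Hadoop): classify a script as
# "missing", "crlf" (contains carriage returns) or "ok".
check_script() {
    cr=$(printf '\r')              # a literal carriage-return character
    if [ ! -f "$1" ]; then
        echo "missing: $1"
    elif grep -q "$cr" "$1"; then  # any line containing CR means DOS line endings
        echo "crlf: $1"
    else
        echo "ok: $1"
    fi
}

# Example, run from the Hadoop install directory:
#   for f in bin/*.sh; do check_script "$f"; done
```

Since the start scripts reach the daemons through ssh, it may also be worth repeating the check in the shell that ssh starts, for example with "ssh localhost ls -l /usr/local/hadoop-0.20.2/bin/hadoop-daemon.sh". If that fails while the local check reports "ok", the ssh session probably sees a different filesystem root than the Cygwin shell does.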
However, I have tried to invoke it by hand, after echoing out what I guess
the arguments are supposed to be (like "hadoop-daemon start datanode"),
but that doesn't seem to work.
(Also, is there a minimum amount of hard disk space required? I have only
1 GB or so free.)
For example, after I run start-all.sh, I run:
$ bin/hadoop-daemon.sh start datanode
starting datanode, logging to
/usr/local/hadoop-0.20.2/bin/../logs/hadoop-brian-datanode-wynn6266448332.out
OK, but then when I try to run the grep example, I get these errors:
2010-03-13 11:27:57,149 WARN org.apache.hadoop.hdfs.DFSClient: Error
Recovery for block null bad datanode[0] nodes == null
2010-03-13 11:27:57,149 WARN org.apache.hadoop.hdfs.DFSClient: Could not
get block locations. Source file
"/tmp/hadoop-SYSTEM/mapred/system/jobtracker.info" - Aborting...
2010-03-13 11:27:57,149 WARN org.apache.hadoop.mapred.JobTracker:
Writing to file
hdfs://localhost:9000/tmp/hadoop-SYSTEM/mapred/system/jobtracker.info
failed!
2010-03-13 11:27:57,149 WARN org.apache.hadoop.mapred.JobTracker:
FileSystem is not ready yet!
2010-03-13 11:27:57,369 WARN org.apache.hadoop.mapred.JobTracker: Failed
to initialize recovery manager.
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/tmp/hadoop-SYSTEM/mapred/system/jobtracker.info could only be
replicated to 0 nodes, instead of 1
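The "could only be replicated to 0 nodes, instead of 1" message means the NameNode has no live DataNode to place the block on, so the DataNode's own log is the next place to look. A minimal sketch for pulling the most recent errors out of a log in the format shown in this thread follows. recent_log_errors is a hypothetical helper, and the log file name in the usage note is an assumption based on the .out file name printed above.

```shell
# Hypothetical helper: print the most recent ERROR/FATAL lines of a Hadoop log.
# $1 = log file, $2 = how many lines to show (default 20).
# Assumes lines start with "YYYY-MM-DD HH:MM:SS,mmm LEVEL ", as in the logs above.
recent_log_errors() {
    grep -E '^[0-9-]+ [0-9:,]+ (ERROR|FATAL) ' "$1" | tail -n "${2:-20}"
}
```

For example, "recent_log_errors logs/hadoop-brian-datanode-<hostname>.log" run from the install directory, with <hostname> replaced by the machine name. Running "bin/hadoop dfsadmin -report" afterwards shows whether any DataNode has registered with the NameNode.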
Alex Kozlov wrote:
> Hi Brian,
>
> Is your namenode running? Try 'hadoop fs -ls /'.
>
> Alex
>
>
> On Mar 12, 2010, at 5:20 PM, Brian Wolf <br...@gmail.com> wrote:
>
>> Hi Alex,
>>
>> I am back on this problem. Seems it works, but I have this issue
>> with connecting to server.
>> I can connect 'ssh localhost' ok.
>>
>> Thanks
>> Brian
>>
>> $ bin/hadoop jar hadoop-*-examples.jar pi 2 2
>> Number of Maps = 2
>> Samples per Map = 2
>> 10/03/12 17:16:17 INFO ipc.Client: Retrying connect to server:
>> localhost/127.0.0.1:9000. Already tried 0 time(s).
>> 10/03/12 17:16:19 INFO ipc.Client: Retrying connect to server:
>> localhost/127.0.0.1:9000. Already tried 1 time(s).
>>
>>
>>
>> Alex Kozlov wrote:
>>> Can you try a simpler job (just to make sure your setup works):
>>>
>>> $ hadoop jar $HADOOP_INSTALL/hadoop-*-examples.jar pi 2 2
>>>
>>> Alex K
>>>
>>> On Wed, Feb 3, 2010 at 8:26 PM, Brian Wolf <br...@gmail.com> wrote:
>>>
>>>
>>>> Alex, thanks for the help, it seems to start now, however
>>>>
>>>>
>>>> $ bin/hadoop jar hadoop-*-examples.jar grep -fs local input output
>>>> 'dfs[a-z.]+'
>>>> 10/02/03 20:02:41 WARN fs.FileSystem: "local" is a deprecated
>>>> filesystem
>>>> name. Use "file:///" instead.
>>>> 10/02/03 20:02:43 INFO mapred.FileInputFormat: Total input paths to
>>>> process
>>>> : 3
>>>> 10/02/03 20:02:44 INFO mapred.JobClient: Running job:
>>>> job_201002031354_0013
>>>> 10/02/03 20:02:45 INFO mapred.JobClient: map 0% reduce 0%
>>>>
>>>>
>>>>
>>>> it hangs here (is pseudo cluster supposed to work?)
>>>>
>>>>
>>>> these are bottom of various log files
>>>>
>>>> conf log file
>>>>
>>>>
>>>> <property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
>>>>
>>>>
>>>> <property><name>mapred.input.dir</name><value>file:/C:/OpenSSH/usr/local/hadoop-0.19.2/input</value></property>
>>>>
>>>> <property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030
>>>>
>>>> </value></property>
>>>> <property><name>io.file.buffer.size</name><value>4096</value></property>
>>>>
>>>>
>>>> <property><name>mapred.jobtracker.restart.recover</name><value>false</value></property>
>>>>
>>>>
>>>> <property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
>>>>
>>>>
>>>> <property><name>dfs.datanode.handler.count</name><value>3</value></property>
>>>>
>>>>
>>>> <property><name>mapred.reduce.copy.backoff</name><value>300</value></property>
>>>>
>>>> <property><name>mapred.task.profile</name><value>false</value></property>
>>>>
>>>>
>>>> <property><name>dfs.replication.considerLoad</name><value>true</value></property>
>>>>
>>>>
>>>> <property><name>jobclient.output.filter</name><value>FAILED</value></property>
>>>>
>>>>
>>>> <property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
>>>>
>>>>
>>>> <property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
>>>>
>>>> <property><name>fs.checkpoint.size</name><value>67108864</value></property>
>>>>
>>>>
>>>>
>>>> bottom
>>>> namenode log
>>>>
>>>> added to blk_6520091160827873550_1036 size 570
>>>> 2010-02-03 20:02:43,826 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>>>>
>>>> dst=null perm=brian:supergroup:rw-r--r--
>>>> 2010-02-03 20:02:43,866 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1
>>>> cmd=setPermission
>>>>
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>>>>
>>>> dst=null perm=brian:supergroup:rw-r--r--
>>>> 2010-02-03 20:02:44,026 INFO org.apache.hadoop.hdfs.StateChange:
>>>> BLOCK*
>>>> NameSystem.allocateBlock:
>>>> /cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml.
>>>> blk_517844159758473296_1037
>>>> 2010-02-03 20:02:44,076 INFO org.apache.hadoop.hdfs.StateChange:
>>>> BLOCK*
>>>> NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is
>>>> added to
>>>> blk_517844159758473296_1037 size 16238
>>>> 2010-02-03 20:02:44,257 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>>>>
>>>> dst=null perm=null
>>>> 2010-02-03 20:02:44,527 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.jar
>>>>
>>>> dst=null perm=null
>>>> 2010-02-03 20:02:45,258 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.split
>>>>
>>>> dst=null perm=null
>>>>
>>>>
>>>> bottom
>>>> datanode log
>>>>
>>>> 2010-02-03 20:02:44,046 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
>>>> blk_517844159758473296_1037 src: /127.0.0.1:4069 dest:
>>>> /127.0.0.1:50010
>>>> 2010-02-03 20:02:44,076 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:4069, dest: /127.0.0.1:50010, bytes: 16238, op: HDFS_WRITE,
>>>> cliID: DFSClient_-1424524646, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>>> blk_517844159758473296_1037
>>>> 2010-02-03 20:02:44,086 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0
>>>> for block
>>>> blk_517844159758473296_1037 terminating
>>>> 2010-02-03 20:02:44,457 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:50010, dest: /127.0.0.1:4075, bytes: 16366, op: HDFS_READ,
>>>> cliID: DFSClient_-548531246, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>>> blk_517844159758473296_1037
>>>> 2010-02-03 20:02:44,677 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:50010, dest: /127.0.0.1:4076, bytes: 135168, op: HDFS_READ,
>>>> cliID: DFSClient_-548531246, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>>> blk_-2806977820057440405_1035
>>>> 2010-02-03 20:02:45,278 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:50010, dest: /127.0.0.1:4077, bytes: 578, op: HDFS_READ,
>>>> cliID:
>>>> DFSClient_-548531246, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104,
>>>> blockid: blk_6520091160827873550_1036
>>>> 2010-02-03 20:04:10,451 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_3301977249866081256_1031
>>>> 2010-02-03 20:09:35,658 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_9116729021606317943_1025
>>>> 2010-02-03 20:09:44,671 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_8602436668984954947_1026
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> jobtracker log
>>>>
>>>> Input size for job job_201002031354_0012 = 53060
>>>> 2010-02-03 19:48:37,599 INFO
>>>> org.apache.hadoop.mapred.JobInProgress: Split
>>>> info for job:job_201002031354_0012
>>>> 2010-02-03 19:48:37,649 INFO org.apache.hadoop.net.NetworkTopology:
>>>> Adding
>>>> a new node: /default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000000 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000001 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000002 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000003 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO
>>>> org.apache.hadoop.mapred.JobInProgress: Input
>>>> size for job job_201002031354_0013 = 53060
>>>> 2010-02-03 20:02:45,278 INFO
>>>> org.apache.hadoop.mapred.JobInProgress: Split
>>>> info for job:job_201002031354_0013
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000000 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000001 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000002 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000003 has split on
>>>> node:/default-rack/localhost
>>>>
>>>>
>>>>
>>>> Thanks
>>>> Brian
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Alex Kozlov wrote:
>>>>
>>>>
>>>>> Try
>>>>>
>>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>>> file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
>>>>>
>>>>> file:/// is a magical prefix to force hadoop to look for the file
>>>>> in the
>>>>> local FS
>>>>>
>>>>> You can also force it to look into local FS by giving '-fs local'
>>>>> or '-fs
>>>>> file:///' option to the hadoop executable
>>>>>
>>>>> These options basically overwrite the *fs.default.name* configuration
>>>>> setting, which should be in your core-site.xml file
>>>>>
>>>>> You can also copy the content of the input directory to HDFS by
>>>>> executing
>>>>>
>>>>> $ bin/hadoop fs -mkdir input
>>>>> $ bin/hadoop fs -copyFromLocal input/* input
>>>>>
>>>>> Hope this helps
>>>>>
>>>>> Alex K
>>>>>
>>>>> On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Alex Kozlov wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> :
>>>>>>> 0
>>>>>>>
>>>>>>> You datanode is dead. Look at the logs in the $HADOOP_HOME/logs
>>>>>>> directory
>>>>>>> (or where your logs are) and check the errors.
>>>>>>>
>>>>>>> Alex K
>>>>>>>
>>>>>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Thanks for your help, Alex,
>>>>>>
>>>>>> I managed to get past that problem, now I have this problem:
>>>>>>
>>>>>> However, when I try to run this example as stated on the quickstart
>>>>>> webpage:
>>>>>>
>>>>>>
>>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>>>>
>>>>>> I get this error;
>>>>>> =============================================================
>>>>>> java.io.IOException: Not a file:
>>>>>> hdfs://localhost:9000/user/brian/input/conf
>>>>>> =========================================================
>>>>>> so it seems to default to my home directory looking for "input" it
>>>>>> apparently needs an absolute filepath, however, when I run that
>>>>>> way:
>>>>>>
>>>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>>>> /usr/local/hadoop-0.19.2/input
>>>>>> output 'dfs[a-z.]+'
>>>>>>
>>>>>> ==============================================================
>>>>>> org.apache.hadoop.mapred.InvalidInputException: Input path does not
>>>>>> exist:
>>>>>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>>>>>> ==============================================================
>>>>>> It still isn't happy although this part ->
>>>>>> /usr/local/hadoop-0.19.2/input
>>>>>> <- does exist
>>>>>>
>>>>>> Aaron,
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Thanks or your help. I carefully went through the steps again a
>>>>>>> couple
>>>>>>>
>>>>>>>> times , and ran
>>>>>>>>
>>>>>>>> after this
>>>>>>>> bin/hadoop namenode -format
>>>>>>>>
>>>>>>>> (by the way, it asks if I want to reformat, I've tried it both
>>>>>>>> ways)
>>>>>>>>
>>>>>>>>
>>>>>>>> then
>>>>>>>>
>>>>>>>>
>>>>>>>> bin/start-dfs.sh
>>>>>>>>
>>>>>>>> and
>>>>>>>>
>>>>>>>> bin/start-all.sh
>>>>>>>>
>>>>>>>>
>>>>>>>> and then
>>>>>>>> bin/hadoop fs -put conf input
>>>>>>>>
>>>>>>>> now the return for this seemed cryptic:
>>>>>>>>
>>>>>>>>
>>>>>>>> put: Target input/conf is a directory
>>>>>>>>
>>>>>>>> (??)
>>>>>>>>
>>>>>>>> and when I tried
>>>>>>>>
>>>>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output
>>>>>>>> 'dfs[a-z.]+'
>>>>>>>>
>>>>>>>> It says something about 0 nodes
>>>>>>>>
>>>>>>>> (from log file)
>>>>>>>>
>>>>>>>> 2010-02-01 13:26:29,874 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>>>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>>>>>>>>
>>>>>>>>
>>>>>>>> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>>>>
>>>>>>>> dst=null perm=brian:supergroup:rw-r--r--
>>>>>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC
>>>>>>>> Server
>>>>>>>> handler 3 on 9000, call
>>>>>>>>
>>>>>>>>
>>>>>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>>>>>>>
>>>>>>>> DFSClient_725490811) from 127.0.0.1:3003: error:
>>>>>>>> java.io.IOException:
>>>>>>>> File
>>>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>>>>
>>>>>>>> could
>>>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>>>> java.io.IOException: File
>>>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>>>>
>>>>>>>> could
>>>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>>>>>>
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>>>>>>
>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>>
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> To maybe rule out something regarding ports or ssh , when I run
>>>>>>>> netstat:
>>>>>>>>
>>>>>>>> TCP 127.0.0.1:9000 0.0.0.0:0 LISTENING
>>>>>>>> TCP 127.0.0.1:9001 0.0.0.0:0 LISTENING
>>>>>>>>
>>>>>>>>
>>>>>>>> and when I browse to http://localhost:50070/
>>>>>>>>
>>>>>>>>
>>>>>>>> Cluster Summary
>>>>>>>>
>>>>>>>> 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB /
>>>>>>>> 992.31 MB (0%)
>>>>>>>> Configured Capacity : 0 KB
>>>>>>>> DFS Used : 0 KB
>>>>>>>> Non DFS Used : 0 KB
>>>>>>>> DFS Remaining : 0 KB
>>>>>>>> DFS Used% : 100 %
>>>>>>>> DFS Remaining% : 0 %
>>>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> : 0
>>>>>>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes> : 0
>>>>>>>>
>>>>>>>>
>>>>>>>> so I'm still a bit in the dark, I guess.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Brian
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Aaron Kimball wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Brian, it looks like you missed a step in the instructions.
>>>>>>>>> You'll
>>>>>>>>> need
>>>>>>>>> to
>>>>>>>>> format the hdfs filesystem instance before starting the NameNode
>>>>>>>>> server:
>>>>>>>>>
>>>>>>>>> You need to run:
>>>>>>>>>
>>>>>>>>> $ bin/hadoop namenode -format
>>>>>>>>>
>>>>>>>>> .. then you can do bin/start-dfs.sh
>>>>>>>>> Hope this helps,
>>>>>>>>> - Aaron
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per
>>>>>>>>>> directions on
>>>>>>>>>> the
>>>>>>>>>> hadoop "quickstart" web page.
>>>>>>>>>>
>>>>>>>>>> I know sshd is running and I can "ssh localhost" without a
>>>>>>>>>> password.
>>>>>>>>>>
>>>>>>>>>> This is from my hadoop-site.xml
>>>>>>>>>>
>>>>>>>>>> <configuration>
>>>>>>>>>> <property>
>>>>>>>>>> <name>hadoop.tmp.dir</name>
>>>>>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>fs.default.name</name>
>>>>>>>>>> <value>hdfs://localhost:9000</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>mapred.job.tracker</name>
>>>>>>>>>> <value>localhost:9001</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>>>>>> <value>-1</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>dfs.replication</name>
>>>>>>>>>> <value>1</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>dfs.permissions</name>
>>>>>>>>>> <value>false</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>webinterface.private.actions</name>
>>>>>>>>>> <value>true</value>
>>>>>>>>>> </property>
>>>>>>>>>> </configuration>
>>>>>>>>>>
>>>>>>>>>> These are errors from my log files:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2010-01-30 00:03:33,091 INFO
>>>>>>>>>> org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>>>>>>> localhost/
>>>>>>>>>> 127.0.0.1:9000
>>>>>>>>>> 2010-01-30 00:03:33,161 INFO
>>>>>>>>>> org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>>>>>> Initializing JVM Metrics with processName=NameNode,
>>>>>>>>>> sessionId=null
>>>>>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>>>>>> Initializing
>>>>>>>>>> NameNodeMeterics using context
>>>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> supergroup=supergroup
>>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> isPermissionEnabled=false
>>>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>>>>>>>
>>>>>>>>>> Initializing FSNamesystemMetrics using context
>>>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>>>>>>> FSNamesystemStatusMBean
>>>>>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does
>>>>>>>>>> not exist.
>>>>>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> FSNamesystem
>>>>>>>>>> initialization failed.
>>>>>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>>>>>>>
>>>>>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an
>>>>>>>>>> inconsistent
>>>>>>>>>> state:
>>>>>>>>>> storage directory does not exist or is not accessible.
>>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>>>>>>>
>>>>>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server:
>>>>>>>>>> Stopping
>>>>>>>>>> server
>>>>>>>>>> on 9000
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> =========================================================
>>>>>>>>>>
>>>>>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client:
>>>>>>>>>> Retrying
>>>>>>>>>> connect
>>>>>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>>>>>> problem cleaning system directory: null
>>>>>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000
>>>>>>>>>> failed
>>>>>>>>>> on
>>>>>>>>>> connection exception: java.net.ConnectException: Connection
>>>>>>>>>> refused:
>>>>>>>>>> no
>>>>>>>>>> further information
>>>>>>>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>>>>>> at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>>>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>>>>> at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>>>>>> at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Brian
Re: hadoop under cygwin issue
Posted by Alex Kozlov <al...@cloudera.com>.
Hi Brian,
Is your namenode running? Try 'hadoop fs -ls /'.
Alex
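A minimal sketch of one more check, assuming the JDK's jps tool is on the PATH: jps lists running JVMs by main class, so it can show whether a NameNode process exists at all before trying 'hadoop fs -ls /'. The helper name has_namenode is made up for illustration and is not part of Hadoop.

```shell
# Hypothetical helper (not a Hadoop command): succeeds if the jps listing
# on stdin contains a NameNode JVM. The match is exact on the second
# column, so a SecondaryNameNode line does not count.
has_namenode() {
    awk '$2 == "NameNode" { found = 1 } END { exit !found }'
}

# Usage (assumption: jps from the JDK is available in the Cygwin shell):
#   jps | has_namenode && bin/hadoop fs -ls /
```

If no NameNode shows up, the NameNode log under the logs directory is the place to look next.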
On Mar 12, 2010, at 5:20 PM, Brian Wolf <br...@gmail.com> wrote:
> Hi Alex,
>
> I am back on this problem. It seems to work, but I now have a problem
> connecting to the server.
> I can connect with 'ssh localhost' OK.
>
> Thanks
> Brian
>
> $ bin/hadoop jar hadoop-*-examples.jar pi 2 2
> Number of Maps = 2
> Samples per Map = 2
> 10/03/12 17:16:17 INFO ipc.Client: Retrying connect to server:
> localhost/127.0.0.1:9000. Already tried 0 time(s).
> 10/03/12 17:16:19 INFO ipc.Client: Retrying connect to server:
> localhost/127.0.0.1:9000. Already tried 1 time(s).
>
>
>
> Alex Kozlov wrote:
>> Can you try a simpler job (just to make sure your setup works):
>>
>> $ hadoop jar $HADOOP_INSTALL/hadoop-*-examples.jar pi 2 2
>>
>> Alex K
>>
>> On Wed, Feb 3, 2010 at 8:26 PM, Brian Wolf <br...@gmail.com> wrote:
>>
>>
>>> Alex, thanks for the help, it seems to start now, however
>>>
>>>
>>> $ bin/hadoop jar hadoop-*-examples.jar grep -fs local input output
>>> 'dfs[a-z.]+'
>>> 10/02/03 20:02:41 WARN fs.FileSystem: "local" is a deprecated filesystem
>>> name. Use "file:///" instead.
>>> 10/02/03 20:02:43 INFO mapred.FileInputFormat: Total input paths to process
>>> : 3
>>> 10/02/03 20:02:44 INFO mapred.JobClient: Running job: job_201002031354_0013
>>> 10/02/03 20:02:45 INFO mapred.JobClient: map 0% reduce 0%
>>>
>>>
>>>
>>> it hangs here (is pseudo cluster supposed to work?)
>>>
>>>
>>> these are bottom of various log files
>>>
>>> conf log file
>>>
>>>
>>> <property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
>>>
>>> <property><name>mapred.input.dir</name><value>file:/C:/OpenSSH/usr/local/hadoop-0.19.2/input</value></property>
>>> <property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030</value></property>
>>> <property><name>io.file.buffer.size</name><value>4096</value></property>
>>>
>>> <property><name>mapred.jobtracker.restart.recover</name><value>false</value></property>
>>>
>>> <property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
>>>
>>> <property><name>dfs.datanode.handler.count</name><value>3</value></property>
>>>
>>> <property><name>mapred.reduce.copy.backoff</name><value>300</value></property>
>>> <property><name>mapred.task.profile</name><value>false</value></property>
>>>
>>> <property><name>dfs.replication.considerLoad</name><value>true</value></property>
>>>
>>> <property><name>jobclient.output.filter</name><value>FAILED</value></property>
>>>
>>> <property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
>>>
>>> <property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
>>> <property><name>fs.checkpoint.size</name><value>67108864</value></property>
>>>
>>>
>>> bottom
>>> namenode log
>>>
>>> added to blk_6520091160827873550_1036 size 570
>>> 2010-02-03 20:02:43,826 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>>> dst=null perm=brian:supergroup:rw-r--r--
>>> 2010-02-03 20:02:43,866 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=setPermission
>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>>> dst=null perm=brian:supergroup:rw-r--r--
>>> 2010-02-03 20:02:44,026 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.allocateBlock:
>>> /cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml.
>>> blk_517844159758473296_1037
>>> 2010-02-03 20:02:44,076 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to
>>> blk_517844159758473296_1037 size 16238
>>> 2010-02-03 20:02:44,257 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>>> dst=null perm=null
>>> 2010-02-03 20:02:44,527 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.jar
>>> dst=null perm=null
>>> 2010-02-03 20:02:45,258 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.split
>>> dst=null perm=null
>>>
>>>
>>> bottom
>>> datanode log
>>>
>>> 2010-02-03 20:02:44,046 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
>>> blk_517844159758473296_1037 src: /127.0.0.1:4069 dest: /127.0.0.1:50010
>>> 2010-02-03 20:02:44,076 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> /127.0.0.1:4069, dest: /127.0.0.1:50010, bytes: 16238, op: HDFS_WRITE,
>>> cliID: DFSClient_-1424524646, srvID:
>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>> blk_517844159758473296_1037
>>> 2010-02-03 20:02:44,086 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block
>>> blk_517844159758473296_1037 terminating
>>> 2010-02-03 20:02:44,457 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> /127.0.0.1:50010, dest: /127.0.0.1:4075, bytes: 16366, op: HDFS_READ,
>>> cliID: DFSClient_-548531246, srvID:
>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>> blk_517844159758473296_1037
>>> 2010-02-03 20:02:44,677 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> /127.0.0.1:50010, dest: /127.0.0.1:4076, bytes: 135168, op: HDFS_READ,
>>> cliID: DFSClient_-548531246, srvID:
>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>> blk_-2806977820057440405_1035
>>> 2010-02-03 20:02:45,278 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> /127.0.0.1:50010, dest: /127.0.0.1:4077, bytes: 578, op: HDFS_READ,
>>> cliID: DFSClient_-548531246, srvID:
>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>> blk_6520091160827873550_1036
>>> 2010-02-03 20:04:10,451 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>> succeeded for blk_3301977249866081256_1031
>>> 2010-02-03 20:09:35,658 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>> succeeded for blk_9116729021606317943_1025
>>> 2010-02-03 20:09:44,671 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>> succeeded for blk_8602436668984954947_1026
>>>
>>>
>>>
>>>
>>>
>>> jobtracker log
>>>
>>> Input size for job job_201002031354_0012 = 53060
>>> 2010-02-03 19:48:37,599 INFO
>>> org.apache.hadoop.mapred.JobInProgress: Split
>>> info for job:job_201002031354_0012
>>> 2010-02-03 19:48:37,649 INFO
>>> org.apache.hadoop.net.NetworkTopology: Adding
>>> a new node: /default-rack/localhost
>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0012_m_000000 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0012_m_000001 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0012_m_000002 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0012_m_000003 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 20:02:45,278 INFO
>>> org.apache.hadoop.mapred.JobInProgress: Input
>>> size for job job_201002031354_0013 = 53060
>>> 2010-02-03 20:02:45,278 INFO
>>> org.apache.hadoop.mapred.JobInProgress: Split
>>> info for job:job_201002031354_0013
>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0013_m_000000 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0013_m_000001 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0013_m_000002 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0013_m_000003 has split on
>>> node:/default-rack/localhost
>>>
>>>
>>>
>>> Thanks
>>> Brian
>>>
>>>
>>>
>>>
>>>
>>> Alex Kozlov wrote:
>>>
>>>
>>>> Try
>>>>
>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>> file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
>>>>
>>>> file:/// is a magical prefix to force hadoop to look for the file
>>>> in the
>>>> local FS
>>>>
>>>> You can also force it to look into local FS by giving '-fs local'
>>>> or '-fs
>>>> file:///' option to the hadoop executable
>>>>
>>>> These options basically override the fs.default.name configuration
>>>> setting, which in Hadoop 0.19.x lives in hadoop-site.xml (core-site.xml
>>>> from 0.20 onward)
>>>>
>>>> You can also copy the content of the input directory to HDFS by
>>>> executing
>>>>
>>>> $ bin/hadoop fs -mkdir input
>>>> $ bin/hadoop fs -copyFromLocal input/* input
>>>>
>>>> Hope this helps
>>>>
>>>> Alex K
>>>>
>>>> On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com>
>>>> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>> Alex Kozlov wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> :
>>>>>> 0
>>>>>>
>>>>>> Your datanode is dead. Look at the logs in the $HADOOP_HOME/logs
>>>>>> directory
>>>>>> (or where your logs are) and check the errors.
>>>>>>
>>>>>> Alex K
>>>>>>
>>>>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> Thanks for your help, Alex,
>>>>>
>>>>> I managed to get past that problem, now I have this problem:
>>>>>
>>>>> However, when I try to run this example as stated on the
>>>>> quickstart
>>>>> webpage:
>>>>>
>>>>>
>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>>>
>>>>> I get this error;
>>>>> =============================================================
>>>>> java.io.IOException: Not a file:
>>>>> hdfs://localhost:9000/user/brian/input/conf
>>>>> =========================================================
>>>>> so it seems to default to my home directory looking for "input" it
>>>>> apparently needs an absolute filepath, however, when I run
>>>>> that way:
>>>>>
>>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>>> /usr/local/hadoop-0.19.2/input
>>>>> output 'dfs[a-z.]+'
>>>>>
>>>>> ==============================================================
>>>>> org.apache.hadoop.mapred.InvalidInputException: Input path does
>>>>> not
>>>>> exist:
>>>>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>>>>> ==============================================================
>>>>> It still isn't happy, although this path -> /usr/local/hadoop-0.19.2/input
>>>>> <- does exist
>>>>>
>>>>>>> Aaron,
>>>>>>>
>>>>>>> Thanks for your help. I carefully went through the steps again a couple
>>>>>>> of times, and after this ran
>>>>>>>
>>>>>>> bin/hadoop namenode -format
>>>>>>>
>>>>>>> (by the way, it asks if I want to reformat, I've tried it both
>>>>>>> ways)
>>>>>>>
>>>>>>>
>>>>>>> then
>>>>>>>
>>>>>>>
>>>>>>> bin/start-dfs.sh
>>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> bin/start-all.sh
>>>>>>>
>>>>>>>
>>>>>>> and then
>>>>>>> bin/hadoop fs -put conf input
>>>>>>>
>>>>>>> now the return for this seemed cryptic:
>>>>>>>
>>>>>>>
>>>>>>> put: Target input/conf is a directory
>>>>>>>
>>>>>>> (??)
>>>>>>>
>>>>>>> and when I tried
>>>>>>>
>>>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>>>>>
>>>>>>> It says something about 0 nodes
>>>>>>>
>>>>>>> (from log file)
>>>>>>>
>>>>>>> 2010-02-01 13:26:29,874 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>>>>>>> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>>> dst=null perm=brian:supergroup:rw-r--r--
>>>>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>>>>> handler 3 on 9000, call
>>>>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>>>>>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException: File
>>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>>> could only be replicated to 0 nodes, instead of 1
>>>>>>> java.io.IOException: File
>>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>>> could only be replicated to 0 nodes, instead of 1
>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> To maybe rule out something regarding ports or ssh , when I run
>>>>>>> netstat:
>>>>>>>
>>>>>>> TCP 127.0.0.1:9000 0.0.0.0:0 LISTENING
>>>>>>> TCP 127.0.0.1:9001 0.0.0.0:0 LISTENING
>>>>>>>
>>>>>>>
>>>>>>> and when I browse to http://localhost:50070/
>>>>>>>
>>>>>>>
>>>>>>> Cluster Summary
>>>>>>>
>>>>>>> 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB /
>>>>>>> 992.31 MB (0%)
>>>>>>> Configured Capacity : 0 KB
>>>>>>> DFS Used : 0 KB
>>>>>>> Non DFS Used : 0 KB
>>>>>>> DFS Remaining : 0 KB
>>>>>>> DFS Used% : 100 %
>>>>>>> DFS Remaining% : 0 %
>>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> : 0
>>>>>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes> : 0
>>>>>>>
>>>>>>>
>>>>>>> so I'm still a bit in the dark, I guess.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Brian
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Aaron Kimball wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Brian, it looks like you missed a step in the instructions.
>>>>>>>> You'll
>>>>>>>> need
>>>>>>>> to
>>>>>>>> format the hdfs filesystem instance before starting the
>>>>>>>> NameNode
>>>>>>>> server:
>>>>>>>>
>>>>>>>> You need to run:
>>>>>>>>
>>>>>>>> $ bin/hadoop namenode -format
>>>>>>>>
>>>>>>>> .. then you can do bin/start-dfs.sh
>>>>>>>> Hope this helps,
>>>>>>>> - Aaron
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per
>>>>>>>>> directions on
>>>>>>>>> the
>>>>>>>>> hadoop "quickstart" web page.
>>>>>>>>>
>>>>>>>>> I know sshd is running and I can "ssh localhost" without a
>>>>>>>>> password.
>>>>>>>>>
>>>>>>>>> This is from my hadoop-site.xml
>>>>>>>>>
>>>>>>>>> <configuration>
>>>>>>>>> <property>
>>>>>>>>> <name>hadoop.tmp.dir</name>
>>>>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>>>>> </property>
>>>>>>>>> <property>
>>>>>>>>> <name>fs.default.name</name>
>>>>>>>>> <value>hdfs://localhost:9000</value>
>>>>>>>>> </property>
>>>>>>>>> <property>
>>>>>>>>> <name>mapred.job.tracker</name>
>>>>>>>>> <value>localhost:9001</value>
>>>>>>>>> </property>
>>>>>>>>> <property>
>>>>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>>>>> <value>-1</value>
>>>>>>>>> </property>
>>>>>>>>> <property>
>>>>>>>>> <name>dfs.replication</name>
>>>>>>>>> <value>1</value>
>>>>>>>>> </property>
>>>>>>>>> <property>
>>>>>>>>> <name>dfs.permissions</name>
>>>>>>>>> <value>false</value>
>>>>>>>>> </property>
>>>>>>>>> <property>
>>>>>>>>> <name>webinterface.private.actions</name>
>>>>>>>>> <value>true</value>
>>>>>>>>> </property>
>>>>>>>>> </configuration>
>>>>>>>>>
>>>>>>>>> These are errors from my log files:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2010-01-30 00:03:33,091 INFO
>>>>>>>>> org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up
>>>>>>>>> at:
>>>>>>>>> localhost/
>>>>>>>>> 127.0.0.1:9000
>>>>>>>>> 2010-01-30 00:03:33,161 INFO
>>>>>>>>> org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>>>>> Initializing JVM Metrics with processName=NameNode,
>>>>>>>>> sessionId=null
>>>>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>>>>> Initializing
>>>>>>>>> NameNodeMeterics using context
>>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>> supergroup=supergroup
>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>> isPermissionEnabled=false
>>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>>>>>> Initializing FSNamesystemMetrics using context
>>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>> Registered
>>>>>>>>> FSNamesystemStatusMBean
>>>>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does
>>>>>>>>> not exist.
>>>>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>> FSNamesystem
>>>>>>>>> initialization failed.
>>>>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an
>>>>>>>>> inconsistent
>>>>>>>>> state:
>>>>>>>>> storage directory does not exist or is not accessible.
>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server:
>>>>>>>>> Stopping
>>>>>>>>> server
>>>>>>>>> on 9000
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> =========================================================
>>>>>>>>>
>>>>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client:
>>>>>>>>> Retrying
>>>>>>>>> connect
>>>>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>>>>> problem cleaning system directory: null
>>>>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000
>>>>>>>>> failed
>>>>>>>>> on
>>>>>>>>> connection exception: java.net.ConnectException: Connection
>>>>>>>>> refused:
>>>>>>>>> no
>>>>>>>>> further information
>>>>>>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>>>>> at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>>>> at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>>>>> at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Brian
Re: hadoop under cygwin issue
Posted by Brian Wolf <br...@gmail.com>.
Hi Alex,
I am back on this problem. It seems to work, but I now have a problem
connecting to the server.
I can connect with 'ssh localhost' OK.
Thanks
Brian
$ bin/hadoop jar hadoop-*-examples.jar pi 2 2
Number of Maps = 2
Samples per Map = 2
10/03/12 17:16:17 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 0 time(s).
10/03/12 17:16:19 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 1 time(s).
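Since the client keeps retrying localhost:9000, one quick check is whether anything is listening on that port at all. The sketch below is an illustration only: port_is_listening is a hypothetical helper, and it assumes Windows-style 'netstat -an' output like the lines quoted earlier in this thread (TCP 127.0.0.1:9000 0.0.0.0:0 LISTENING).

```shell
# Hypothetical helper: succeeds if the netstat listing on stdin shows a TCP
# socket in LISTENING state whose local address ends in ":<port>".
# Assumes the Windows "netstat -an" column layout:
#   proto  local-address  foreign-address  state
port_is_listening() {
    awk -v port="$1" '
        { sub(/\r$/, "") }                 # drop a trailing CR from Windows tools
        $1 == "TCP" && $NF == "LISTENING" {
            n = split($2, parts, ":")      # the port is the last ":"-separated part
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}

# Usage:
#   netstat -an | port_is_listening 9000 || echo "nothing listening on 9000"
```

If nothing is listening on 9000, the NameNode most likely did not start, and its log should say why.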
Alex Kozlov wrote:
> Can you try a simpler job (just to make sure your setup works):
>
> $ hadoop jar $HADOOP_INSTALL/hadoop-*-examples.jar pi 2 2
>
> Alex K
>
> On Wed, Feb 3, 2010 at 8:26 PM, Brian Wolf <br...@gmail.com> wrote:
>
>
>> Alex, thanks for the help, it seems to start now, however
>>
>>
>> $ bin/hadoop jar hadoop-*-examples.jar grep -fs local input output
>> 'dfs[a-z.]+'
>> 10/02/03 20:02:41 WARN fs.FileSystem: "local" is a deprecated filesystem
>> name. Use "file:///" instead.
>> 10/02/03 20:02:43 INFO mapred.FileInputFormat: Total input paths to process
>> : 3
>> 10/02/03 20:02:44 INFO mapred.JobClient: Running job: job_201002031354_0013
>> 10/02/03 20:02:45 INFO mapred.JobClient: map 0% reduce 0%
>>
>>
>>
>> it hangs here (is pseudo cluster supposed to work?)
>>
>>
>> these are bottom of various log files
>>
>> conf log file
>>
>>
>> <property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
>>
>> <property><name>mapred.input.dir</name><value>file:/C:/OpenSSH/usr/local/hadoop-0.19.2/input</value></property>
>> <property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030</value></property>
>> <property><name>io.file.buffer.size</name><value>4096</value></property>
>>
>> <property><name>mapred.jobtracker.restart.recover</name><value>false</value></property>
>>
>> <property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
>>
>> <property><name>dfs.datanode.handler.count</name><value>3</value></property>
>>
>> <property><name>mapred.reduce.copy.backoff</name><value>300</value></property>
>> <property><name>mapred.task.profile</name><value>false</value></property>
>>
>> <property><name>dfs.replication.considerLoad</name><value>true</value></property>
>>
>> <property><name>jobclient.output.filter</name><value>FAILED</value></property>
>>
>> <property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
>>
>> <property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
>> <property><name>fs.checkpoint.size</name><value>67108864</value></property>
>>
>>
>> bottom
>> namenode log
>>
>> added to blk_6520091160827873550_1036 size 570
>> 2010-02-03 20:02:43,826 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>> dst=null perm=brian:supergroup:rw-r--r--
>> 2010-02-03 20:02:43,866 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=setPermission
>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>> dst=null perm=brian:supergroup:rw-r--r--
>> 2010-02-03 20:02:44,026 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>> NameSystem.allocateBlock:
>> /cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml.
>> blk_517844159758473296_1037
>> 2010-02-03 20:02:44,076 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>> NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to
>> blk_517844159758473296_1037 size 16238
>> 2010-02-03 20:02:44,257 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>> dst=null perm=null
>> 2010-02-03 20:02:44,527 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.jar
>> dst=null perm=null
>> 2010-02-03 20:02:45,258 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.split
>> dst=null perm=null
>>
>>
>> bottom
>> datanode log
>>
>> 2010-02-03 20:02:44,046 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
>> blk_517844159758473296_1037 src: /127.0.0.1:4069 dest: /127.0.0.1:50010
>> 2010-02-03 20:02:44,076 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>> 127.0.0.1:4069, dest: /127.0.0.1:50010, bytes: 16238, op: HDFS_WRITE,
>> cliID: DFSClient_-1424524646, srvID:
>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>> blk_517844159758473296_1037
>> 2010-02-03 20:02:44,086 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block
>> blk_517844159758473296_1037 terminating
>> 2010-02-03 20:02:44,457 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>> 127.0.0.1:50010, dest: /127.0.0.1:4075, bytes: 16366, op: HDFS_READ,
>> cliID: DFSClient_-548531246, srvID:
>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>> blk_517844159758473296_1037
>> 2010-02-03 20:02:44,677 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>> 127.0.0.1:50010, dest: /127.0.0.1:4076, bytes: 135168, op: HDFS_READ,
>> cliID: DFSClient_-548531246, srvID:
>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>> blk_-2806977820057440405_1035
>> 2010-02-03 20:02:45,278 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>> 127.0.0.1:50010, dest: /127.0.0.1:4077, bytes: 578, op: HDFS_READ, cliID:
>> DFSClient_-548531246, srvID: DS-1812377383-192.168.1.5-50010-1265088397104,
>> blockid: blk_6520091160827873550_1036
>> 2010-02-03 20:04:10,451 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>> succeeded for blk_3301977249866081256_1031
>> 2010-02-03 20:09:35,658 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>> succeeded for blk_9116729021606317943_1025
>> 2010-02-03 20:09:44,671 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>> succeeded for blk_8602436668984954947_1026
>>
>>
>>
>>
>>
>> jobtracker log
>>
>> Input size for job job_201002031354_0012 = 53060
>> 2010-02-03 19:48:37,599 INFO org.apache.hadoop.mapred.JobInProgress: Split
>> info for job:job_201002031354_0012
>> 2010-02-03 19:48:37,649 INFO org.apache.hadoop.net.NetworkTopology: Adding
>> a new node: /default-rack/localhost
>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0012_m_000000 has split on
>> node:/default-rack/localhost
>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0012_m_000001 has split on
>> node:/default-rack/localhost
>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0012_m_000002 has split on
>> node:/default-rack/localhost
>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0012_m_000003 has split on
>> node:/default-rack/localhost
>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: Input
>> size for job job_201002031354_0013 = 53060
>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: Split
>> info for job:job_201002031354_0013
>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0013_m_000000 has split on
>> node:/default-rack/localhost
>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0013_m_000001 has split on
>> node:/default-rack/localhost
>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0013_m_000002 has split on
>> node:/default-rack/localhost
>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0013_m_000003 has split on
>> node:/default-rack/localhost
>>
>>
>>
>> Thanks
>> Brian
>>
>>
>>
>>
>>
>> Alex Kozlov wrote:
>>
>>
>>> Try
>>>
>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>> file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
>>>
>>> file:/// is a magical prefix to force hadoop to look for the file in the
>>> local FS
>>>
>>> You can also force it to look into local FS by giving '-fs local' or '-fs
>>> file:///' option to the hadoop executable
>>>
>>> These options basically override the *fs.default.name* configuration
>>> setting, which should be in your hadoop-site.xml file (core-site.xml from
>>> Hadoop 0.20 onward)
>>>
>>> You can also copy the content of the input directory to HDFS by executing
>>>
>>> $ bin/hadoop fs -mkdir input
>>> $ bin/hadoop fs -copyFromLocal input/* input
>>>
>>> Hope this helps
>>>
>>> Alex K
>>>
>>> On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com> wrote:
>>>
>>>
>>>
>>>
>>>> Alex Kozlov wrote:
>>>>
>>>>
>>>>
>>>>
>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> :
>>>>> 0
>>>>>
>>>>> Your datanode is dead. Look at the logs in the $HADOOP_HOME/logs
>>>>> directory
>>>>> (or where your logs are) and check the errors.
>>>>>
>>>>> Alex K
>>>>>
>>>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>> Thanks for your help, Alex,
>>>>
>>>> I managed to get past that problem, now I have this problem:
>>>>
>>>> However, when I try to run this example as stated on the quickstart
>>>> webpage:
>>>>
>>>>
>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>>
>>>> I get this error;
>>>> =============================================================
>>>> java.io.IOException: Not a file:
>>>> hdfs://localhost:9000/user/brian/input/conf
>>>> =========================================================
>>>> so it seems to default to my home directory when looking for "input"; it
>>>> apparently needs an absolute file path. However, when I run it that way:
>>>>
>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>> /usr/local/hadoop-0.19.2/input
>>>> output 'dfs[a-z.]+'
>>>>
>>>> ==============================================================
>>>> org.apache.hadoop.mapred.InvalidInputException: Input path does not
>>>> exist:
>>>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>>>> ==============================================================
>>>> It still isn't happy although this part -> /usr/local/hadoop-0.19.2/input
>>>> <- does exist
>>>>
>>>> Aaron,
>>>>
>>>>
>>>>
>>>>> Thanks for your help. I carefully went through the steps again a couple
>>>>>
>>>>>> times, and ran
>>>>>>
>>>>>> after this
>>>>>> bin/hadoop namenode -format
>>>>>>
>>>>>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>>>>>
>>>>>>
>>>>>> then
>>>>>>
>>>>>>
>>>>>> bin/start-dfs.sh
>>>>>>
>>>>>> and
>>>>>>
>>>>>> bin/start-all.sh
>>>>>>
>>>>>>
>>>>>> and then
>>>>>> bin/hadoop fs -put conf input
>>>>>>
>>>>>> now the return for this seemed cryptic:
>>>>>>
>>>>>>
>>>>>> put: Target input/conf is a directory
>>>>>>
>>>>>> (??)
>>>>>>
>>>>>> and when I tried
>>>>>>
>>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>>>>
>>>>>> It says something about 0 nodes
>>>>>>
>>>>>> (from log file)
>>>>>>
>>>>>> 2010-02-01 13:26:29,874 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>>>>>>
>>>>>>
>>>>>> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>> dst=null perm=brian:supergroup:rw-r--r--
>>>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>>>> handler 3 on 9000, call
>>>>>>
>>>>>>
>>>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>>>>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException:
>>>>>> File
>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>> could
>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>> java.io.IOException: File
>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>> could
>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>> at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>>>> at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>> at
>>>>>>
>>>>>>
>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>> at
>>>>>>
>>>>>>
>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> To maybe rule out something regarding ports or ssh , when I run
>>>>>> netstat:
>>>>>>
>>>>>> TCP 127.0.0.1:9000 0.0.0.0:0 LISTENING
>>>>>> TCP 127.0.0.1:9001 0.0.0.0:0 LISTENING
>>>>>>
>>>>>>
>>>>>> and when I browse to http://localhost:50070/
>>>>>>
>>>>>>
>>>>>> Cluster Summary
>>>>>>
>>>>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01
>>>>>> MB
>>>>>> /
>>>>>> 992.31 MB (0%)
>>>>>> *
>>>>>> Configured Capacity : 0 KB
>>>>>> DFS Used : 0 KB
>>>>>> Non DFS Used : 0 KB
>>>>>> DFS Remaining : 0 KB
>>>>>> DFS Used% : 100 %
>>>>>> DFS Remaining% : 0 %
>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> :
>>>>>> 0
>>>>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes> :
>>>>>> 0
>>>>>>
>>>>>>
>>>>>> so I'm still a bit in the dark, I guess.
>>>>>>
>>>>>> Thanks
>>>>>> Brian
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Aaron Kimball wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Brian, it looks like you missed a step in the instructions. You'll
>>>>>>> need
>>>>>>> to
>>>>>>> format the hdfs filesystem instance before starting the NameNode
>>>>>>> server:
>>>>>>>
>>>>>>> You need to run:
>>>>>>>
>>>>>>> $ bin/hadoop namenode -format
>>>>>>>
>>>>>>> .. then you can do bin/start-dfs.sh
>>>>>>> Hope this helps,
>>>>>>> - Aaron
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on
>>>>>>>> the
>>>>>>>> hadoop "quickstart" web page.
>>>>>>>>
>>>>>>>> I know sshd is running and I can "ssh localhost" without a password.
>>>>>>>>
>>>>>>>> This is from my hadoop-site.xml
>>>>>>>>
>>>>>>>> <configuration>
>>>>>>>> <property>
>>>>>>>> <name>hadoop.tmp.dir</name>
>>>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>> <name>fs.default.name</name>
>>>>>>>> <value>hdfs://localhost:9000</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>> <name>mapred.job.tracker</name>
>>>>>>>> <value>localhost:9001</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>>>> <value>-1</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>> <name>dfs.replication</name>
>>>>>>>> <value>1</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>> <name>dfs.permissions</name>
>>>>>>>> <value>false</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>> <name>webinterface.private.actions</name>
>>>>>>>> <value>true</value>
>>>>>>>> </property>
>>>>>>>> </configuration>
>>>>>>>>
>>>>>>>> These are errors from my log files:
>>>>>>>>
>>>>>>>>
>>>>>>>> 2010-01-30 00:03:33,091 INFO
>>>>>>>> org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>>>>> localhost/
>>>>>>>> 127.0.0.1:9000
>>>>>>>> 2010-01-30 00:03:33,161 INFO
>>>>>>>> org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>>>> Initializing
>>>>>>>> NameNodeMeterics using context
>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>> supergroup=supergroup
>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>> isPermissionEnabled=false
>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>>>>> Initializing FSNamesystemMetrics using context
>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>>>>> FSNamesystemStatusMBean
>>>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>>>>>>> initialization failed.
>>>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
>>>>>>>> state:
>>>>>>>> storage directory does not exist or is not accessible.
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>>>> at
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
>>>>>>>> server
>>>>>>>> on 9000
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> =========================================================
>>>>>>>>
>>>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
>>>>>>>> connect
>>>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>>>> problem cleaning system directory: null
>>>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed
>>>>>>>> on
>>>>>>>> connection exception: java.net.ConnectException: Connection refused:
>>>>>>>> no
>>>>>>>> further information
>>>>>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>>>> at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>>> at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>>>> at
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Brian
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
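The two HDFS errors quoted earlier in this message are consistent with how a path without a scheme is resolved against fs.default.name. What follows is only an illustration of that rule in shell, not Hadoop's own code: the function name resolve_path is hypothetical, and DEFAULT_FS and HDFS_USER are simply the values that appear in this thread.

```shell
# Illustration only: mimics how the paths in the error messages were formed.
# DEFAULT_FS and HDFS_USER come from this thread; resolve_path is a
# hypothetical name, not a Hadoop command.
DEFAULT_FS="hdfs://localhost:9000"
HDFS_USER="brian"

resolve_path() {
    case "$1" in
        # Explicit scheme (hdfs://, file://): used as given.
        *://*) printf '%s\n' "$1" ;;
        # Absolute path: looked up from the root of the default filesystem.
        /*)    printf '%s\n' "$DEFAULT_FS$1" ;;
        # Relative path: looked up under the user's HDFS home directory.
        *)     printf '%s\n' "$DEFAULT_FS/user/$HDFS_USER/$1" ;;
    esac
}

resolve_path input
resolve_path /usr/local/hadoop-0.19.2/input
resolve_path file:///usr/local/hadoop-0.19.2/input
```

Read this way, the first error shows "input" resolving under /user/brian on HDFS, and the second shows the absolute path being looked up on HDFS rather than on the local disk, which is why adding the file:/// prefix changes the outcome.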
Re: hadoop under cygwin issue
Posted by Alex Kozlov <al...@cloudera.com>.
Can you try a simpler job (just to make sure your setup works):
$ hadoop jar $HADOOP_INSTALL/hadoop-*-examples.jar pi 2 2
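If it helps, the same check can be wrapped in a small function so the result is unambiguous. This is a minimal sketch, assuming the hadoop launcher is on the PATH and exits non-zero when the job fails; the function name run_smoke_test is hypothetical, and HADOOP_INSTALL falls back to the current directory if it is not set. The two arguments to pi are the number of map tasks and the number of samples per map.

```shell
# Hypothetical wrapper around the pi example; not part of Hadoop.
# Assumes `hadoop` is on the PATH and returns non-zero on job failure.
run_smoke_test() {
    install=${HADOOP_INSTALL:-.}
    # pi <number of maps> <samples per map>
    if hadoop jar "$install"/hadoop-*-examples.jar pi 2 2; then
        echo "smoke test passed"
    else
        echo "smoke test failed; check the logs directory" >&2
        return 1
    fi
}
```

Call it as run_smoke_test after the daemons have been started; a job that hangs at "map 0% reduce 0%" will simply not return, which is itself a useful signal.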
Alex K
On Wed, Feb 3, 2010 at 8:26 PM, Brian Wolf <br...@gmail.com> wrote:
> Alex, thanks for the help, it seems to start now, however
>
>
> $ bin/hadoop jar hadoop-*-examples.jar grep -fs local input output
> 'dfs[a-z.]+'
> 10/02/03 20:02:41 WARN fs.FileSystem: "local" is a deprecated filesystem
> name. Use "file:///" instead.
> 10/02/03 20:02:43 INFO mapred.FileInputFormat: Total input paths to process
> : 3
> 10/02/03 20:02:44 INFO mapred.JobClient: Running job: job_201002031354_0013
> 10/02/03 20:02:45 INFO mapred.JobClient: map 0% reduce 0%
>
>
>
> it hangs here (is pseudo cluster supposed to work?)
>
>
> these are the bottoms of the various log files
>
> conf log file
>
>
> <property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
>
> <property><name>mapred.input.dir</name><value>file:/C:/OpenSSH/usr/local/hadoop-0.19.2/input</value></property>
> <property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030
> </value></property>
> <property><name>io.file.buffer.size</name><value>4096</value></property>
>
> <property><name>mapred.jobtracker.restart.recover</name><value>false</value></property>
>
> <property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
>
> <property><name>dfs.datanode.handler.count</name><value>3</value></property>
>
> <property><name>mapred.reduce.copy.backoff</name><value>300</value></property>
> <property><name>mapred.task.profile</name><value>false</value></property>
>
> <property><name>dfs.replication.considerLoad</name><value>true</value></property>
>
> <property><name>jobclient.output.filter</name><value>FAILED</value></property>
>
> <property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
>
> <property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
> <property><name>fs.checkpoint.size</name><value>67108864</value></property>
>
>
> bottom
> namenode log
>
> added to blk_6520091160827873550_1036 size 570
> 2010-02-03 20:02:43,826 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
> dst=null perm=brian:supergroup:rw-r--r--
> 2010-02-03 20:02:43,866 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=setPermission
> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
> dst=null perm=brian:supergroup:rw-r--r--
> 2010-02-03 20:02:44,026 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.allocateBlock:
> /cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml.
> blk_517844159758473296_1037
> 2010-02-03 20:02:44,076 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to
> blk_517844159758473296_1037 size 16238
> 2010-02-03 20:02:44,257 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
> dst=null perm=null
> 2010-02-03 20:02:44,527 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.jar
> dst=null perm=null
> 2010-02-03 20:02:45,258 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.split
> dst=null perm=null
>
>
> bottom
> datanode log
>
> 2010-02-03 20:02:44,046 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_517844159758473296_1037 src: /127.0.0.1:4069 dest: /127.0.0.1:50010
> 2010-02-03 20:02:44,076 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 127.0.0.1:4069, dest: /127.0.0.1:50010, bytes: 16238, op: HDFS_WRITE,
> cliID: DFSClient_-1424524646, srvID:
> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
> blk_517844159758473296_1037
> 2010-02-03 20:02:44,086 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block
> blk_517844159758473296_1037 terminating
> 2010-02-03 20:02:44,457 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 127.0.0.1:50010, dest: /127.0.0.1:4075, bytes: 16366, op: HDFS_READ,
> cliID: DFSClient_-548531246, srvID:
> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
> blk_517844159758473296_1037
> 2010-02-03 20:02:44,677 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 127.0.0.1:50010, dest: /127.0.0.1:4076, bytes: 135168, op: HDFS_READ,
> cliID: DFSClient_-548531246, srvID:
> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
> blk_-2806977820057440405_1035
> 2010-02-03 20:02:45,278 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 127.0.0.1:50010, dest: /127.0.0.1:4077, bytes: 578, op: HDFS_READ, cliID:
> DFSClient_-548531246, srvID: DS-1812377383-192.168.1.5-50010-1265088397104,
> blockid: blk_6520091160827873550_1036
> 2010-02-03 20:04:10,451 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_3301977249866081256_1031
> 2010-02-03 20:09:35,658 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_9116729021606317943_1025
> 2010-02-03 20:09:44,671 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_8602436668984954947_1026
>
>
>
>
>
> jobtracker log
>
> Input size for job job_201002031354_0012 = 53060
> 2010-02-03 19:48:37,599 INFO org.apache.hadoop.mapred.JobInProgress: Split
> info for job:job_201002031354_0012
> 2010-02-03 19:48:37,649 INFO org.apache.hadoop.net.NetworkTopology: Adding
> a new node: /default-rack/localhost
> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0012_m_000000 has split on
> node:/default-rack/localhost
> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0012_m_000001 has split on
> node:/default-rack/localhost
> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0012_m_000002 has split on
> node:/default-rack/localhost
> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0012_m_000003 has split on
> node:/default-rack/localhost
> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: Input
> size for job job_201002031354_0013 = 53060
> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: Split
> info for job:job_201002031354_0013
> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0013_m_000000 has split on
> node:/default-rack/localhost
> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0013_m_000001 has split on
> node:/default-rack/localhost
> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0013_m_000002 has split on
> node:/default-rack/localhost
> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0013_m_000003 has split on
> node:/default-rack/localhost
>
>
>
> Thanks
> Brian
>
>
>
>
>
> Alex Kozlov wrote:
>
>> Try
>>
>> $ bin/hadoop jar hadoop-*-examples.jar grep
>> file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
>>
>> file:/// is a magical prefix to force hadoop to look for the file in the
>> local FS
>>
>> You can also force it to look into local FS by giving '-fs local' or '-fs
>> file:///' option to the hadoop executable
>>
>> These options basically override the *fs.default.name* configuration
>> setting, which should be in your hadoop-site.xml file (core-site.xml from
>> Hadoop 0.20 onward)
>>
>> You can also copy the content of the input directory to HDFS by executing
>>
>> $ bin/hadoop fs -mkdir input
>> $ bin/hadoop fs -copyFromLocal input/* input
>>
>> Hope this helps
>>
>> Alex K
>>
>> On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com> wrote:
>>
>>
>>
>>> Alex Kozlov wrote:
>>>
>>>
>>>
>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> :
>>>> 0
>>>>
>>>> Your datanode is dead. Look at the logs in the $HADOOP_HOME/logs
>>>> directory
>>>> (or where your logs are) and check the errors.
>>>>
>>>> Alex K
>>>>
>>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>> Thanks for your help, Alex,
>>>
>>> I managed to get past that problem, now I have this problem:
>>>
>>> However, when I try to run this example as stated on the quickstart
>>> webpage:
>>>
>>>
>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>
>>> I get this error;
>>> =============================================================
>>> java.io.IOException: Not a file:
>>> hdfs://localhost:9000/user/brian/input/conf
>>> =========================================================
>>> so it seems to default to my home directory when looking for "input"; it
>>> apparently needs an absolute file path. However, when I run it that way:
>>>
>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>> /usr/local/hadoop-0.19.2/input
>>> output 'dfs[a-z.]+'
>>>
>>> ==============================================================
>>> org.apache.hadoop.mapred.InvalidInputException: Input path does not
>>> exist:
>>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>>> ==============================================================
>>> It still isn't happy although this part -> /usr/local/hadoop-0.19.2/input
>>> <- does exist
>>>
>>> Aaron,
>>>
>>>
>>>> Thanks for your help. I carefully went through the steps again a couple
>>>>> times, and ran
>>>>>
>>>>> after this
>>>>> bin/hadoop namenode -format
>>>>>
>>>>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>>>>
>>>>>
>>>>> then
>>>>>
>>>>>
>>>>> bin/start-dfs.sh
>>>>>
>>>>> and
>>>>>
>>>>> bin/start-all.sh
>>>>>
>>>>>
>>>>> and then
>>>>> bin/hadoop fs -put conf input
>>>>>
>>>>> now the return for this seemed cryptic:
>>>>>
>>>>>
>>>>> put: Target input/conf is a directory
>>>>>
>>>>> (??)
>>>>>
>>>>> and when I tried
>>>>>
>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>>>
>>>>> It says something about 0 nodes
>>>>>
>>>>> (from log file)
>>>>>
>>>>> 2010-02-01 13:26:29,874 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>>>>>
>>>>>
>>>>> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>> dst=null perm=brian:supergroup:rw-r--r--
>>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>>> handler 3 on 9000, call
>>>>>
>>>>>
>>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>>>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException:
>>>>> File
>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>> could
>>>>> only be replicated to 0 nodes, instead of 1
>>>>> java.io.IOException: File
>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>> could
>>>>> only be replicated to 0 nodes, instead of 1
>>>>> at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>>> at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>> at
>>>>>
>>>>>
>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>> at
>>>>>
>>>>>
>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> To maybe rule out something regarding ports or ssh , when I run
>>>>> netstat:
>>>>>
>>>>> TCP 127.0.0.1:9000 0.0.0.0:0 LISTENING
>>>>> TCP 127.0.0.1:9001 0.0.0.0:0 LISTENING
>>>>>
>>>>>
>>>>> and when I browse to http://localhost:50070/
>>>>>
>>>>>
>>>>> Cluster Summary
>>>>>
>>>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01
>>>>> MB
>>>>> /
>>>>> 992.31 MB (0%)
>>>>> *
>>>>> Configured Capacity : 0 KB
>>>>> DFS Used : 0 KB
>>>>> Non DFS Used : 0 KB
>>>>> DFS Remaining : 0 KB
>>>>> DFS Used% : 100 %
>>>>> DFS Remaining% : 0 %
>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> :
>>>>> 0
>>>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes> :
>>>>> 0
>>>>>
>>>>>
>>>>> so I'm still a bit in the dark, I guess.
>>>>>
>>>>> Thanks
>>>>> Brian
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Aaron Kimball wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Brian, it looks like you missed a step in the instructions. You'll
>>>>>> need
>>>>>> to
>>>>>> format the hdfs filesystem instance before starting the NameNode
>>>>>> server:
>>>>>>
>>>>>> You need to run:
>>>>>>
>>>>>> $ bin/hadoop namenode -format
>>>>>>
>>>>>> .. then you can do bin/start-dfs.sh
>>>>>> Hope this helps,
>>>>>> - Aaron
>>>>>>
>>>>>>
>>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on
>>>>>>> the
>>>>>>> hadoop "quickstart" web page.
>>>>>>>
>>>>>>> I know sshd is running and I can "ssh localhost" without a password.
>>>>>>>
>>>>>>> This is from my hadoop-site.xml
>>>>>>>
>>>>>>> <configuration>
>>>>>>> <property>
>>>>>>> <name>hadoop.tmp.dir</name>
>>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>>> </property>
>>>>>>> <property>
>>>>>>> <name>fs.default.name</name>
>>>>>>> <value>hdfs://localhost:9000</value>
>>>>>>> </property>
>>>>>>> <property>
>>>>>>> <name>mapred.job.tracker</name>
>>>>>>> <value>localhost:9001</value>
>>>>>>> </property>
>>>>>>> <property>
>>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>>> <value>-1</value>
>>>>>>> </property>
>>>>>>> <property>
>>>>>>> <name>dfs.replication</name>
>>>>>>> <value>1</value>
>>>>>>> </property>
>>>>>>> <property>
>>>>>>> <name>dfs.permissions</name>
>>>>>>> <value>false</value>
>>>>>>> </property>
>>>>>>> <property>
>>>>>>> <name>webinterface.private.actions</name>
>>>>>>> <value>true</value>
>>>>>>> </property>
>>>>>>> </configuration>
>>>>>>>
>>>>>>> These are errors from my log files:
>>>>>>>
>>>>>>>
>>>>>>> 2010-01-30 00:03:33,091 INFO
>>>>>>> org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>>>> localhost/
>>>>>>> 127.0.0.1:9000
>>>>>>> 2010-01-30 00:03:33,161 INFO
>>>>>>> org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>>> Initializing
>>>>>>> NameNodeMeterics using context
>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>> supergroup=supergroup
>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>> isPermissionEnabled=false
>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>>>> Initializing FSNamesystemMetrics using context
>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>>>> FSNamesystemStatusMBean
>>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>>>>>> initialization failed.
>>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
>>>>>>> state:
>>>>>>> storage directory does not exist or is not accessible.
>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
>>>>>>> server
>>>>>>> on 9000
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> =========================================================
>>>>>>>
>>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
>>>>>>> connect
>>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>>> problem cleaning system directory: null
>>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed
>>>>>>> on
>>>>>>> connection exception: java.net.ConnectException: Connection refused:
>>>>>>> no
>>>>>>> further information
>>>>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>>> at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>> at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>>> at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>> Brian
Re: hadoop under cygwin issue
Posted by Brian Wolf <br...@gmail.com>.
Alex, thanks for the help. The job seems to start now; however:
$ bin/hadoop jar hadoop-*-examples.jar grep -fs local input output 'dfs[a-z.]+'
10/02/03 20:02:41 WARN fs.FileSystem: "local" is a deprecated filesystem
name. Use "file:///" instead.
10/02/03 20:02:43 INFO mapred.FileInputFormat: Total input paths to
process : 3
10/02/03 20:02:44 INFO mapred.JobClient: Running job: job_201002031354_0013
10/02/03 20:02:45 INFO mapred.JobClient: map 0% reduce 0%
It hangs here. (Is a pseudo-distributed cluster supposed to work?)
These are the bottoms of the various log files.
conf log file:
<property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
<property><name>mapred.input.dir</name><value>file:/C:/OpenSSH/usr/local/hadoop-0.19.2/input</value></property>
<property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030</value></property>
<property><name>io.file.buffer.size</name><value>4096</value></property>
<property><name>mapred.jobtracker.restart.recover</name><value>false</value></property>
<property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
<property><name>dfs.datanode.handler.count</name><value>3</value></property>
<property><name>mapred.reduce.copy.backoff</name><value>300</value></property>
<property><name>mapred.task.profile</name><value>false</value></property>
<property><name>dfs.replication.considerLoad</name><value>true</value></property>
<property><name>jobclient.output.filter</name><value>FAILED</value></property>
<property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
<property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
<property><name>fs.checkpoint.size</name><value>67108864</value></property>
Bottom of namenode log:
added to blk_6520091160827873550_1036 size 570
2010-02-03 20:02:43,826 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
dst=null perm=brian:supergroup:rw-r--r--
2010-02-03 20:02:43,866 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
ugi=brian,None,Administrators,Users ip=/127.0.0.1
cmd=setPermission
src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
dst=null perm=brian:supergroup:rw-r--r--
2010-02-03 20:02:44,026 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.allocateBlock:
/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml.
blk_517844159758473296_1037
2010-02-03 20:02:44,076 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to
blk_517844159758473296_1037 size 16238
2010-02-03 20:02:44,257 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
dst=null perm=null
2010-02-03 20:02:44,527 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.jar
dst=null perm=null
2010-02-03 20:02:45,258 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=open
src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.split
dst=null perm=null
Bottom of datanode log:
2010-02-03 20:02:44,046 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
blk_517844159758473296_1037 src: /127.0.0.1:4069 dest: /127.0.0.1:50010
2010-02-03 20:02:44,076 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
/127.0.0.1:4069, dest: /127.0.0.1:50010, bytes: 16238, op: HDFS_WRITE,
cliID: DFSClient_-1424524646, srvID:
DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
blk_517844159758473296_1037
2010-02-03 20:02:44,086 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for
block blk_517844159758473296_1037 terminating
2010-02-03 20:02:44,457 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
/127.0.0.1:50010, dest: /127.0.0.1:4075, bytes: 16366, op: HDFS_READ,
cliID: DFSClient_-548531246, srvID:
DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
blk_517844159758473296_1037
2010-02-03 20:02:44,677 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
/127.0.0.1:50010, dest: /127.0.0.1:4076, bytes: 135168, op: HDFS_READ,
cliID: DFSClient_-548531246, srvID:
DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
blk_-2806977820057440405_1035
2010-02-03 20:02:45,278 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
/127.0.0.1:50010, dest: /127.0.0.1:4077, bytes: 578, op: HDFS_READ,
cliID: DFSClient_-548531246, srvID:
DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
blk_6520091160827873550_1036
2010-02-03 20:04:10,451 INFO
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
succeeded for blk_3301977249866081256_1031
2010-02-03 20:09:35,658 INFO
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
succeeded for blk_9116729021606317943_1025
2010-02-03 20:09:44,671 INFO
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
succeeded for blk_8602436668984954947_1026
jobtracker log
Input size for job job_201002031354_0012 = 53060
2010-02-03 19:48:37,599 INFO org.apache.hadoop.mapred.JobInProgress:
Split info for job:job_201002031354_0012
2010-02-03 19:48:37,649 INFO org.apache.hadoop.net.NetworkTopology:
Adding a new node: /default-rack/localhost
2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
tip:task_201002031354_0012_m_000000 has split on
node:/default-rack/localhost
2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
tip:task_201002031354_0012_m_000001 has split on
node:/default-rack/localhost
2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
tip:task_201002031354_0012_m_000002 has split on
node:/default-rack/localhost
2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
tip:task_201002031354_0012_m_000003 has split on
node:/default-rack/localhost
2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
Input size for job job_201002031354_0013 = 53060
2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
Split info for job:job_201002031354_0013
2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
tip:task_201002031354_0013_m_000000 has split on
node:/default-rack/localhost
2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
tip:task_201002031354_0013_m_000001 has split on
node:/default-rack/localhost
2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
tip:task_201002031354_0013_m_000002 has split on
node:/default-rack/localhost
2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
tip:task_201002031354_0013_m_000003 has split on
node:/default-rack/localhost
Thanks
Brian
Alex Kozlov wrote:
> Try
>
> $ bin/hadoop jar hadoop-*-examples.jar grep
> file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
>
> file:/// is a magical prefix to force hadoop to look for the file in the
> local FS
>
> You can also force it to look into local FS by giving '-fs local' or '-fs
> file:///' option to the hadoop executable
>
> These options basically overwrite the *fs.default.name* configuration
> setting, which should be in your core-site.xml file
>
> You can also copy the content of the input directory to HDFS by executing
>
> $ bin/hadoop fs -mkdir input
> $ bin/hadoop fs -copyFromLocal input/* input
>
> Hope this helps
>
> Alex K
>
> On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com> wrote:
>
>
>> Alex Kozlov wrote:
>>
>>
>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> : 0
>>>
>>> Your datanode is dead. Look at the logs in the $HADOOP_HOME/logs directory
>>> (or where your logs are) and check the errors.
>>>
>>> Alex K
>>>
>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>>>
>>>
>>>
>>>
>>
>> Thanks for your help, Alex,
>>
>> I managed to get past that problem, now I have this problem:
>>
>> However, when I try to run this example as stated on the quickstart
>> webpage:
>>
>>
>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>
>> I get this error;
>> =============================================================
>> java.io.IOException: Not a file:
>> hdfs://localhost:9000/user/brian/input/conf
>> =========================================================
>> so it seems to default to my home directory looking for "input" it
>> apparently needs an absolute filepath, however, when I run that way:
>>
>> $ bin/hadoop jar hadoop-*-examples.jar grep /usr/local/hadoop-0.19.2/input
>> output 'dfs[a-z.]+'
>>
>> ==============================================================
>> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>> ==============================================================
>> It still isn't happy although this part -> /usr/local/hadoop-0.19.2/input
>> <- does exist
>>
>> Aaron,
>>
>>>> Thanks for your help. I carefully went through the steps again a couple
>>>> times , and ran
>>>>
>>>> after this
>>>> bin/hadoop namenode -format
>>>>
>>>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>>>
>>>>
>>>> then
>>>>
>>>>
>>>> bin/start-dfs.sh
>>>>
>>>> and
>>>>
>>>> bin/start-all.sh
>>>>
>>>>
>>>> and then
>>>> bin/hadoop fs -put conf input
>>>>
>>>> now the return for this seemed cryptic:
>>>>
>>>>
>>>> put: Target input/conf is a directory
>>>>
>>>> (??)
>>>>
>>>> and when I tried
>>>>
>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>>
>>>> It says something about 0 nodes
>>>>
>>>> (from log file)
>>>>
>>>> 2010-02-01 13:26:29,874 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>>>>
>>>> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>> dst=null perm=brian:supergroup:rw-r--r--
>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 3 on 9000, call
>>>>
>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException:
>>>> File
>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>> could
>>>> only be replicated to 0 nodes, instead of 1
>>>> java.io.IOException: File
>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>> could
>>>> only be replicated to 0 nodes, instead of 1
>>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>
>>>>
>>>>
>>>>
>>>> To maybe rule out something regarding ports or ssh , when I run netstat:
>>>>
>>>> TCP 127.0.0.1:9000 0.0.0.0:0 LISTENING
>>>> TCP 127.0.0.1:9001 0.0.0.0:0 LISTENING
>>>>
>>>>
>>>> and when I browse to http://localhost:50070/
>>>>
>>>>
>>>> Cluster Summary
>>>>
>>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB
>>>> /
>>>> 992.31 MB (0%)
>>>> *
>>>> Configured Capacity : 0 KB
>>>> DFS Used : 0 KB
>>>> Non DFS Used : 0 KB
>>>> DFS Remaining : 0 KB
>>>> DFS Used% : 100 %
>>>> DFS Remaining% : 0 %
>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> :
>>>> 0
>>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes> :
>>>> 0
>>>>
>>>>
>>>> so I'm a bit still in the dark, I guess.
>>>>
>>>> Thanks
>>>> Brian
>>>>
>>>>
>>>>
>>>>
>>>> Aaron Kimball wrote:
>>>>
>>>>
>>>>
>>>>
>>>>> Brian, it looks like you missed a step in the instructions. You'll need
>>>>> to
>>>>> format the hdfs filesystem instance before starting the NameNode server:
>>>>>
>>>>> You need to run:
>>>>>
>>>>> $ bin/hadoop namenode -format
>>>>>
>>>>> .. then you can do bin/start-dfs.sh
>>>>> Hope this helps,
>>>>> - Aaron
>>>>>
>>>>>
>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
>>>>>> hadoop "quickstart" web page.
>>>>>>
>>>>>> I know sshd is running and I can "ssh localhost" without a password.
>>>>>>
>>>>>> This is from my hadoop-site.xml
>>>>>>
>>>>>> <configuration>
>>>>>> <property>
>>>>>> <name>hadoop.tmp.dir</name>
>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>fs.default.name</name>
>>>>>> <value>hdfs://localhost:9000</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>mapred.job.tracker</name>
>>>>>> <value>localhost:9001</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>> <value>-1</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>dfs.replication</name>
>>>>>> <value>1</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>dfs.permissions</name>
>>>>>> <value>false</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>webinterface.private.actions</name>
>>>>>> <value>true</value>
>>>>>> </property>
>>>>>> </configuration>
>>>>>>
>>>>>> These are errors from my log files:
>>>>>>
>>>>>>
>>>>>> 2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>>> localhost/
>>>>>> 127.0.0.1:9000
>>>>>> 2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>> Initializing
>>>>>> NameNodeMeterics using context
>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>> supergroup=supergroup
>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>> isPermissionEnabled=false
>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>>> Initializing FSNamesystemMetrics using context
>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>>> FSNamesystemStatusMBean
>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>>>>> initialization failed.
>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
>>>>>> state:
>>>>>> storage directory does not exist or is not accessible.
>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
>>>>>> server
>>>>>> on 9000
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> =========================================================
>>>>>>
>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
>>>>>> connect
>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>> problem cleaning system directory: null
>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
>>>>>> connection exception: java.net.ConnectException: Connection refused: no
>>>>>> further information
>>>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>> at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>> at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>> at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Brian
Re: hadoop under cygwin issue
Posted by Alex Kozlov <al...@cloudera.com>.
Try
$ bin/hadoop jar hadoop-*-examples.jar grep file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
The file:/// prefix forces hadoop to look for the path in the local
filesystem instead of HDFS.
You can also force it to use the local FS by giving the '-fs file:///'
option (or the older, now deprecated, '-fs local') to the hadoop executable.
These options override the *fs.default.name* configuration setting, which
in Hadoop 0.19.x lives in your hadoop-site.xml file (core-site.xml from 0.20 on).
You can also copy the content of the input directory to HDFS by executing
$ bin/hadoop fs -mkdir input
$ bin/hadoop fs -copyFromLocal input/* input
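To make the path handling above more concrete, here is a rough sketch in plain shell. qualify_path is a hypothetical helper, not a Hadoop command. It only imitates how the paths in this thread appear to be resolved when fs.default.name is an hdfs:// URI: relative paths end up under /user/<name>, absolute paths are taken on the default filesystem, and paths that already carry a scheme such as file:/// are left alone. The URI and user name are the ones from the error messages in this thread.

```shell
# Hypothetical helper (not part of Hadoop): imitates how a path argument
# is qualified when fs.default.name is an hdfs:// URI.
qualify_path() {
  default_fs=$1   # value of fs.default.name, e.g. hdfs://localhost:9000
  user=$2         # HDFS user name, e.g. brian
  path=$3         # path as typed on the command line
  case "$path" in
    *://*) printf '%s\n' "$path" ;;                                  # scheme given: used as-is
    /*)    printf '%s%s\n' "$default_fs" "$path" ;;                  # absolute: on the default FS
    *)     printf '%s/user/%s/%s\n' "$default_fs" "$user" "$path" ;; # relative: under the home dir
  esac
}

qualify_path hdfs://localhost:9000 brian input
qualify_path hdfs://localhost:9000 brian /usr/local/hadoop-0.19.2/input
qualify_path hdfs://localhost:9000 brian file:///usr/local/hadoop-0.19.2/input
```

The first two calls print hdfs:// locations like the ones in the error messages reported in this thread. Only the third one points at the local directory.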
Hope this helps
Alex K
On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com> wrote:
> Alex Kozlov wrote:
>
>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> : 0
>>
>> Your datanode is dead. Look at the logs in the $HADOOP_HOME/logs directory
>> (or where your logs are) and check the errors.
>>
>> Alex K
>>
>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>>
>>
>>
>
>
>
> Thanks for your help, Alex,
>
> I managed to get past that problem, now I have this problem:
>
> However, when I try to run this example as stated on the quickstart
> webpage:
>
>
> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>
> I get this error;
> =============================================================
> java.io.IOException: Not a file:
> hdfs://localhost:9000/user/brian/input/conf
> =========================================================
> so it seems to default to my home directory looking for "input" it
> apparently needs an absolute filepath, however, when I run that way:
>
> $ bin/hadoop jar hadoop-*-examples.jar grep /usr/local/hadoop-0.19.2/input
> output 'dfs[a-z.]+'
>
> ==============================================================
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
> ==============================================================
> It still isn't happy although this part -> /usr/local/hadoop-0.19.2/input
> <- does exist
>
> Aaron,
>>>
>>> Thanks for your help. I carefully went through the steps again a couple
>>> times , and ran
>>>
>>> after this
>>> bin/hadoop namenode -format
>>>
>>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>>
>>>
>>> then
>>>
>>>
>>> bin/start-dfs.sh
>>>
>>> and
>>>
>>> bin/start-all.sh
>>>
>>>
>>> and then
>>> bin/hadoop fs -put conf input
>>>
>>> now the return for this seemed cryptic:
>>>
>>>
>>> put: Target input/conf is a directory
>>>
>>> (??)
>>>
>>> and when I tried
>>>
>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>
>>> It says something about 0 nodes
>>>
>>> (from log file)
>>>
>>> 2010-02-01 13:26:29,874 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>>>
>>> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>> dst=null perm=brian:supergroup:rw-r--r--
>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 3 on 9000, call
>>>
>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException:
>>> File
>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>> could
>>> only be replicated to 0 nodes, instead of 1
>>> java.io.IOException: File
>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>> could
>>> only be replicated to 0 nodes, instead of 1
>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>
>>>
>>>
>>>
>>> To maybe rule out something regarding ports or ssh , when I run netstat:
>>>
>>> TCP 127.0.0.1:9000 0.0.0.0:0 LISTENING
>>> TCP 127.0.0.1:9001 0.0.0.0:0 LISTENING
>>>
>>>
>>> and when I browse to http://localhost:50070/
>>>
>>>
>>> Cluster Summary
>>>
>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB
>>> /
>>> 992.31 MB (0%)
>>> *
>>> Configured Capacity : 0 KB
>>> DFS Used : 0 KB
>>> Non DFS Used : 0 KB
>>> DFS Remaining : 0 KB
>>> DFS Used% : 100 %
>>> DFS Remaining% : 0 %
>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> :
>>> 0
>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes> :
>>> 0
>>>
>>>
>>> so I'm a bit still in the dark, I guess.
>>>
>>> Thanks
>>> Brian
>>>
>>>
>>>
>>>
>>> Aaron Kimball wrote:
>>>
>>>
>>>
>>>> Brian, it looks like you missed a step in the instructions. You'll need
>>>> to
>>>> format the hdfs filesystem instance before starting the NameNode server:
>>>>
>>>> You need to run:
>>>>
>>>> $ bin/hadoop namenode -format
>>>>
>>>> .. then you can do bin/start-dfs.sh
>>>> Hope this helps,
>>>> - Aaron
>>>>
>>>>
>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
>>>>> hadoop "quickstart" web page.
>>>>>
>>>>> I know sshd is running and I can "ssh localhost" without a password.
>>>>>
>>>>> This is from my hadoop-site.xml
>>>>>
>>>>> <configuration>
>>>>> <property>
>>>>> <name>hadoop.tmp.dir</name>
>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>fs.default.name</name>
>>>>> <value>hdfs://localhost:9000</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>mapred.job.tracker</name>
>>>>> <value>localhost:9001</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>> <value>-1</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>dfs.replication</name>
>>>>> <value>1</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>dfs.permissions</name>
>>>>> <value>false</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>webinterface.private.actions</name>
>>>>> <value>true</value>
>>>>> </property>
>>>>> </configuration>
>>>>>
>>>>> These are errors from my log files:
>>>>>
>>>>>
>>>>> 2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>> 2010-01-30 00:03:33,121 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>> localhost/
>>>>> 127.0.0.1:9000
>>>>> 2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>>>> 2010-01-30 00:03:33,181 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>> Initializing
>>>>> NameNodeMeterics using context
>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>> 2010-01-30 00:03:34,603 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>> fsOwner=brian,None,Administrators,Users
>>>>> 2010-01-30 00:03:34,603 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>> supergroup=supergroup
>>>>> 2010-01-30 00:03:34,603 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>> isPermissionEnabled=false
>>>>> 2010-01-30 00:03:34,653 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>> Initializing FSNamesystemMetrics using context
>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>> 2010-01-30 00:03:34,653 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>> FSNamesystemStatusMBean
>>>>> 2010-01-30 00:03:34,803 INFO
>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>>>> initialization failed.
>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
>>>>> state:
>>>>> storage directory does not exist or is not accessible.
>>>>> at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>> at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
>>>>> server
>>>>> on 9000
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> =========================================================
>>>>>
>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
>>>>> connect
>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>> problem cleaning system directory: null
>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
>>>>> connection exception: java.net.ConnectException: Connection refused: no
>>>>> further information
>>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>> at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>> at $Proxy4.getProtocolVersion(Unknown Source)
>>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>> at
>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>> Brian
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>
Re: hadoop under cygwin issue
Posted by Brian Wolf <br...@gmail.com>.
Alex Kozlov wrote:
> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> : 0
>
> Your datanode is dead. Look at the logs in the $HADOOP_HOME/logs directory
> (or wherever your logs are) and check for errors.
>
> Alex K
>
> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>
>
Thanks for your help, Alex,
I managed to get past that problem, but now I have a new one.
When I try to run this example as stated on the quickstart webpage:
bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
I get this error:
=============================================================
java.io.IOException: Not a file:
hdfs://localhost:9000/user/brian/input/conf
=========================================================
So it seems to default to my home directory when looking for "input". It
apparently needs an absolute file path; however, when I run it that way:
$ bin/hadoop jar hadoop-*-examples.jar grep /usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
==============================================================
org.apache.hadoop.mapred.InvalidInputException: Input path does not
exist: hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
==============================================================
It still isn't happy, although this directory ->
/usr/local/hadoop-0.19.2/input <- does exist.
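[Editor's note] Both error messages above name hdfs:// URIs, which suggests that the example looks up its "input" argument in HDFS rather than on the local disk. The sketch below reproduces the two URIs from the errors. It uses a hypothetical shell helper (resolve_hdfs_path is not a Hadoop command) and assumes that a relative path is resolved under /user/<username> and an absolute one under the root of the filesystem named by fs.default.name.

```shell
#!/bin/sh
# Hypothetical helper, not part of Hadoop: shows how the two path
# arguments in the message above appear to map onto HDFS URIs.
#   $1 = value of fs.default.name
#   $2 = HDFS user name
#   $3 = path argument given to the example job
resolve_hdfs_path() {
  case "$3" in
    /*) printf '%s%s\n' "$1" "$3" ;;                # absolute: under the HDFS root
    *)  printf '%s/user/%s/%s\n' "$1" "$2" "$3" ;;  # relative: under /user/<name>
  esac
}

resolve_hdfs_path hdfs://localhost:9000 brian input
resolve_hdfs_path hdfs://localhost:9000 brian /usr/local/hadoop-0.19.2/input
```

If that reading is right, a directory at /usr/local/hadoop-0.19.2/input on the local disk would not satisfy the second command. Running bin/hadoop fs -ls input should list what HDFS actually holds under the relative name.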
>> Aaron,
>>
>> Thanks for your help. I carefully went through the steps again a couple
>> of times, and ran
>>
>> after this
>> bin/hadoop namenode -format
>>
>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>
>>
>> then
>>
>>
>> bin/start-dfs.sh
>>
>> and
>>
>> bin/start-all.sh
>>
>>
>> and then
>> bin/hadoop fs -put conf input
>>
>> now the return for this seemed cryptic:
>>
>>
>> put: Target input/conf is a directory
>>
>> (??)
>>
>> and when I tried
>>
>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>
>> It says something about 0 nodes
>>
>> (from log file)
>>
>> 2010-02-01 13:26:29,874 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
>> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>> dst=null perm=brian:supergroup:rw-r--r--
>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 3 on 9000, call
>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException: File
>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar could
>> only be replicated to 0 nodes, instead of 1
>> java.io.IOException: File
>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar could
>> only be replicated to 0 nodes, instead of 1
>> at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>> at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>
>>
>>
>>
>> To maybe rule out something regarding ports or ssh , when I run netstat:
>>
>> TCP 127.0.0.1:9000 0.0.0.0:0 LISTENING
>> TCP 127.0.0.1:9001 0.0.0.0:0 LISTENING
>>
>>
>> and when I browse to http://localhost:50070/
>>
>>
>> Cluster Summary
>>
>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB /
>> 992.31 MB (0%)
>> *
>> Configured Capacity : 0 KB
>> DFS Used : 0 KB
>> Non DFS Used : 0 KB
>> DFS Remaining : 0 KB
>> DFS Used% : 100 %
>> DFS Remaining% : 0 %
>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> : 0
>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes> : 0
>>
>>
>> so I'm a bit still in the dark, I guess.
>>
>> Thanks
>> Brian
Re: hadoop under cygwin issue
Posted by Alex Kozlov <al...@cloudera.com>.
Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> : 0
Your datanode is dead. Look at the logs in the $HADOOP_HOME/logs directory
(or wherever your logs are) and check for errors.
Alex K
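[Editor's note] One way to follow this advice is to pull only the ERROR and FATAL lines out of the log files. The following is a minimal sketch, not a Hadoop tool: scan_logs is a hypothetical helper, and the "*.log" glob and the fallback to ./logs when HADOOP_HOME is unset are assumptions about the local setup.

```shell
#!/bin/sh
# Print ERROR and FATAL lines from every *.log file in a log directory,
# each prefixed with its file name and line number.
# Assumption: log files end in ".log" and carry the level as a
# space-delimited word, as in "2010-01-30 00:03:34,813 ERROR ...".
scan_logs() {
  for f in "$1"/*.log; do
    [ -f "$f" ] || continue                   # the glob matched nothing
    grep -nE ' (ERROR|FATAL) ' "$f" /dev/null # /dev/null forces the file-name prefix
  done
  return 0
}

scan_logs "${HADOOP_HOME:-.}/logs"
```

With no live datanode, the datanode's own log is probably the first file to read in full once this has pointed at it.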
On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>
> Aaron,
>
> Thanks for your help. I carefully went through the steps again a couple
> of times, and ran
>
> after this
> bin/hadoop namenode -format
>
> (by the way, it asks if I want to reformat, I've tried it both ways)
>
>
> then
>
>
> bin/start-dfs.sh
>
> and
>
> bin/start-all.sh
>
>
> and then
> bin/hadoop fs -put conf input
>
> now the return for this seemed cryptic:
>
>
> put: Target input/conf is a directory
>
> (??)
>
> and when I tried
>
> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>
> It says something about 0 nodes
>
> (from log file)
>
> 2010-02-01 13:26:29,874 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
> ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
> dst=null perm=brian:supergroup:rw-r--r--
> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 3 on 9000, call
> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException: File
> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar could
> only be replicated to 0 nodes, instead of 1
> java.io.IOException: File
> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar could
> only be replicated to 0 nodes, instead of 1
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>
>
>
>
> To maybe rule out something regarding ports or ssh , when I run netstat:
>
> TCP 127.0.0.1:9000 0.0.0.0:0 LISTENING
> TCP 127.0.0.1:9001 0.0.0.0:0 LISTENING
>
>
> and when I browse to http://localhost:50070/
>
>
> Cluster Summary
>
> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB /
> 992.31 MB (0%)
> *
> Configured Capacity : 0 KB
> DFS Used : 0 KB
> Non DFS Used : 0 KB
> DFS Remaining : 0 KB
> DFS Used% : 100 %
> DFS Remaining% : 0 %
> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> : 0
> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes> : 0
>
>
> so I'm a bit still in the dark, I guess.
>
> Thanks
> Brian
Re: hadoop under cygwin issue
Posted by Brian Wolf <br...@gmail.com>.
Aaron,
Thanks for your help. I carefully went through the steps again a couple
of times, and ran the following:
bin/hadoop namenode -format
(by the way, it asks if I want to reformat; I've tried it both ways)
then
bin/start-dfs.sh
and
bin/start-all.sh
and then
bin/hadoop fs -put conf input
The output of this seemed cryptic:
put: Target input/conf is a directory
(??)
And when I tried
bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
it said something about 0 nodes
(from the log file):
2010-02-01 13:26:29,874 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
dst=null perm=brian:supergroup:rw-r--r--
2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 3 on 9000, call
addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException:
File
/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
could only be replicated to 0 nodes, instead of 1
java.io.IOException: File
/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
could only be replicated to 0 nodes, instead of 1
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
To rule out a problem with ports or ssh, I ran netstat:
TCP 127.0.0.1:9000 0.0.0.0:0 LISTENING
TCP 127.0.0.1:9001 0.0.0.0:0 LISTENING
and when I browse to http://localhost:50070/ I see:
Cluster Summary
21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB /
992.31 MB (0%)
Configured Capacity : 0 KB
DFS Used : 0 KB
Non DFS Used : 0 KB
DFS Remaining : 0 KB
DFS Used% : 100 %
DFS Remaining% : 0 %
Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> : 0
Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes> : 0
So I'm still a bit in the dark, I guess.
Thanks
Brian
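[Editor's note] The message "put: Target input/conf is a directory" would be consistent with both input and input/conf already existing in HDFS from earlier runs of the same put command. The sketch below shows how such a nested directory can arise, using ordinary cp -r on a local scratch directory. It is an analogy only, under the assumption that fs -put treats an existing target directory the way cp -r does, and it does not touch HDFS.

```shell
#!/bin/sh
# Local-filesystem analogy only; no Hadoop commands are run here.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir conf
echo '<configuration/>' > conf/hadoop-site.xml

cp -r conf input   # "input" does not exist yet: it becomes a copy of conf
cp -r conf input   # "input" exists now: conf is copied INTO it as input/conf

ls input           # lists the nested conf directory next to the copied file
```

In HDFS, removing the target with bin/hadoop fs -rmr input and then running bin/hadoop fs -put conf input once should give a flat input directory again.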
Aaron Kimball wrote:
> Brian, it looks like you missed a step in the instructions. You'll need to
> format the hdfs filesystem instance before starting the NameNode server:
>
> You need to run:
>
> $ bin/hadoop namenode -format
>
> .. then you can do bin/start-dfs.sh
> Hope this helps,
> - Aaron
Re: hadoop under cygwin issue
Posted by Aaron Kimball <aa...@cloudera.com>.
Brian, it looks like you missed a step in the instructions. You'll need to
format the HDFS filesystem instance before starting the NameNode server.
Run:
$ bin/hadoop namenode -format
Then you can run bin/start-dfs.sh
Hope this helps,
- Aaron
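[Editor's note] The original log reports that the storage directory ...\dfs\name does not exist, which matches a NameNode that was started before formatting. A minimal sketch of a pre-start check follows. name_dir_formatted is a hypothetical helper, not a Hadoop command; the assumptions are that the name directory sits at dfs/name under hadoop.tmp.dir (as the log suggests) and that a formatted one contains current/VERSION.

```shell
#!/bin/sh
# Hypothetical helper, not a Hadoop command.
#   $1 = expanded hadoop.tmp.dir, e.g. /cygwin/tmp/hadoop-brian
# Assumption: a formatted name directory contains dfs/name/current/VERSION.
name_dir_formatted() {
  [ -f "$1/dfs/name/current/VERSION" ]
}

# Default mirrors the hadoop.tmp.dir value from the posted hadoop-site.xml.
tmp_dir="${1:-/cygwin/tmp/hadoop-$(id -un)}"
if name_dir_formatted "$tmp_dir"; then
  echo "formatted: $tmp_dir/dfs/name"
else
  echo "not formatted: run bin/hadoop namenode -format first"
fi
```

The check only looks for the file; it says nothing about whether the directory is readable by the account that the NameNode actually runs under.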
On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com> wrote:
>
> Hi,
>
> I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
> hadoop "quickstart" web page.
>
> I know sshd is running and I can "ssh localhost" without a password.
>
> This is from my hadoop-site.xml
>
> <configuration>
> <property>
> <name>hadoop.tmp.dir</name>
> <value>/cygwin/tmp/hadoop-${user.name}</value>
> </property>
> <property>
> <name>fs.default.name</name>
> <value>hdfs://localhost:9000</value>
> </property>
> <property>
> <name>mapred.job.tracker</name>
> <value>localhost:9001</value>
> </property>
> <property>
> <name>mapred.job.reuse.jvm.num.tasks</name>
> <value>-1</value>
> </property>
> <property>
> <name>dfs.replication</name>
> <value>1</value>
> </property>
> <property>
> <name>dfs.permissions</name>
> <value>false</value>
> </property>
> <property>
> <name>webinterface.private.actions</name>
> <value>true</value>
> </property>
> </configuration>
>
> These are errors from my log files:
>
>
> 2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
> Initializing RPC Metrics with hostName=NameNode, port=9000
> 2010-01-30 00:03:33,121 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/
> 127.0.0.1:9000
> 2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=NameNode, sessionId=null
> 2010-01-30 00:03:33,181 INFO
> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing
> NameNodeMeterics using context
> object:org.apache.hadoop.metrics.spi.NullContext
> 2010-01-30 00:03:34,603 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> fsOwner=brian,None,Administrators,Users
> 2010-01-30 00:03:34,603 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
> 2010-01-30 00:03:34,603 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> isPermissionEnabled=false
> 2010-01-30 00:03:34,653 INFO
> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
> Initializing FSNamesystemMetrics using context
> object:org.apache.hadoop.metrics.spi.NullContext
> 2010-01-30 00:03:34,653 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
> FSNamesystemStatusMBean
> 2010-01-30 00:03:34,803 INFO org.apache.hadoop.hdfs.server.common.Storage:
> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
> 2010-01-30 00:03:34,813 ERROR
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
> initialization failed.
> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent state:
> storage directory does not exist or is not accessible.
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
> at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping server
> on 9000
>
>
>
>
>
> =========================================================
>
> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
> problem cleaning system directory: null
> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
> connection exception: java.net.ConnectException: Connection refused: no
> further information
> at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
> at org.apache.hadoop.ipc.Client.call(Client.java:700)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
> at $Proxy4.getProtocolVersion(Unknown Source)
> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
> at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>
>
>
> Thanks
> Brian
>
>