You are viewing a plain text version of this content. The canonical link for it is here.

Posted to common-user@hadoop.apache.org by Brian Wolf <br...@gmail.com> on 2010/01/30 09:27:45 UTC

hadoop under cygwin issue

Hi,

I am trying to run Hadoop 0.19.2 under cygwin as per directions on the 
hadoop "quickstart" web page.

I know sshd is running and I can "ssh localhost" without a password.

This is from my hadoop-site.xml

<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/cygwin/tmp/hadoop-${user.name}</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
<property>
<name>mapred.job.reuse.jvm.num.tasks</name>
<value>-1</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>webinterface.private.actions</name>
<value>true</value>
</property>
</configuration>

These are errors from my log files:


2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: 
Initializing RPC Metrics with hostName=NameNode, port=9000
2010-01-30 00:03:33,121 INFO 
org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: 
localhost/127.0.0.1:9000
2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: 
Initializing JVM Metrics with processName=NameNode, sessionId=null
2010-01-30 00:03:33,181 INFO 
org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: 
Initializing NameNodeMeterics using context 
object:org.apache.hadoop.metrics.spi.NullContext
2010-01-30 00:03:34,603 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
fsOwner=brian,None,Administrators,Users
2010-01-30 00:03:34,603 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2010-01-30 00:03:34,603 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
isPermissionEnabled=false
2010-01-30 00:03:34,653 INFO 
org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: 
Initializing FSNamesystemMetrics using context 
object:org.apache.hadoop.metrics.spi.NullContext
2010-01-30 00:03:34,653 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered 
FSNamesystemStatusMBean
2010-01-30 00:03:34,803 INFO 
org.apache.hadoop.hdfs.server.common.Storage: Storage directory 
C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
2010-01-30 00:03:34,813 ERROR 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem 
initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: 
Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent 
state: storage directory does not exist or is not accessible.
    at 
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
    at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
    at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
    at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping 
server on 9000





=========================================================

2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
 problem cleaning system directory: null
java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on 
connection exception: java.net.ConnectException: Connection refused: no 
further information
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
    at org.apache.hadoop.ipc.Client.call(Client.java:700)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at $Proxy4.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
    at 
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)



Thanks
Brian

Re: hadoop under cygwin issue

Posted by Brian Wolf <br...@gmail.com>.

Thanks for the insight, Ed.  Thats actually a pretty big "gesalt" for 
me, I have to process it a bit (I had read about it, of course)

Brian


Ed Mazur wrote:
> Brian,
>
> It looks like you're confusing your local file system with HDFS. HDFS
> sits on top of your file system and is where data for (non-standalone)
> Hadoop jobs comes from. You can poll it with "fs -ls ...", so do
> something like "hadoop fs -lsr /" to see everything in HDFS. This will
> probably shed some light on why your first attempt failed.
> /user/brian/input should be a directory with several xml files.
>
> Ed
>
> On Wed, Feb 3, 2010 at 5:17 PM, Brian Wolf <br...@gmail.com> wrote:
>   
>> Alex Kozlov wrote:
>>     
>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :       0
>>>
>>> You datanode is dead.  Look at the logs in the $HADOOP_HOME/logs directory
>>> (or where your logs are) and check the errors.
>>>
>>> Alex K
>>>
>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>>>
>>>
>>>       
>>
>> Thanks for your help, Alex,
>>
>> I managed to get past that problem, now I have this problem:
>>
>> However, when I try to run this example as stated on the quickstart webpage:
>>
>> bin/hadoop jar hadoop-*-examples.jar grep input  output 'dfs[a-z.]+'
>>
>> I get this error;
>> =============================================================
>> java.io.IOException:       Not a file:
>> hdfs://localhost:9000/user/brian/input/conf
>> =========================================================
>> so it seems to default to my home directory looking for "input" it
>> apparently  needs an absolute filepath, however, when I  run that way:
>>
>> $ bin/hadoop jar hadoop-*-examples.jar grep /usr/local/hadoop-0.19.2/input
>>  output 'dfs[a-z.]+'
>>
>> ==============================================================
>> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>> ==============================================================
>> It still isn't happy although this part -> /usr/local/hadoop-0.19.2/input
>>  <-  does exist
>>     
>>>> Aaron,
>>>>
>>>> Thanks or your help. I  carefully went through the steps again a couple
>>>> times , and ran
>>>>
>>>> after this
>>>> bin/hadoop namenode -format
>>>>
>>>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>>>
>>>>
>>>> then
>>>>
>>>>
>>>> bin/start-dfs.sh
>>>>
>>>> and
>>>>
>>>> bin/start-all.sh
>>>>
>>>>
>>>> and then
>>>> bin/hadoop fs -put conf input
>>>>
>>>> now the return for this seemed cryptic:
>>>>
>>>>
>>>> put: Target input/conf is a directory
>>>>
>>>> (??)
>>>>
>>>>  and when I tried
>>>>
>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>>
>>>> It says something about 0 nodes
>>>>
>>>> (from log file)
>>>>
>>>> 2010-02-01 13:26:29,874 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>>>
>>>>  src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>  dst=null    perm=brian:supergroup:rw-r--r--
>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 3 on 9000, call
>>>>
>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException:
>>>> File
>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>> could
>>>> only be replicated to 0 nodes, instead of 1
>>>> java.io.IOException: File
>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>> could
>>>> only be replicated to 0 nodes, instead of 1
>>>>  at
>>>>
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>>  at
>>>>
>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>  at
>>>>
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>  at
>>>>
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>
>>>>
>>>>
>>>>
>>>> To maybe rule out something regarding ports or ssh , when I run netstat:
>>>>
>>>>  TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
>>>>  TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
>>>>
>>>>
>>>> and when I browse to http://localhost:50070/
>>>>
>>>>
>>>>    Cluster Summary
>>>>
>>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB
>>>> /
>>>> 992.31 MB (0%)
>>>> *
>>>> Configured Capacity     :       0 KB
>>>> DFS Used        :       0 KB
>>>> Non DFS Used    :       0 KB
>>>> DFS Remaining   :       0 KB
>>>> DFS Used%       :       100 %
>>>> DFS Remaining%  :       0 %
>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :       0
>>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes>     :       0
>>>>
>>>>
>>>> so I'm a bit still in the dark, I guess.
>>>>
>>>> Thanks
>>>> Brian
>>>>
>>>>
>>>>
>>>>
>>>> Aaron Kimball wrote:
>>>>
>>>>
>>>>         
>>>>> Brian, it looks like you missed a step in the instructions. You'll need
>>>>> to
>>>>> format the hdfs filesystem instance before starting the NameNode server:
>>>>>
>>>>> You need to run:
>>>>>
>>>>> $ bin/hadoop namenode -format
>>>>>
>>>>> .. then you can do bin/start-dfs.sh
>>>>> Hope this helps,
>>>>> - Aaron
>>>>>
>>>>>
>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>           
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
>>>>>> hadoop "quickstart" web page.
>>>>>>
>>>>>> I know sshd is running and I can "ssh localhost" without a password.
>>>>>>
>>>>>> This is from my hadoop-site.xml
>>>>>>
>>>>>> <configuration>
>>>>>> <property>
>>>>>> <name>hadoop.tmp.dir</name>
>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>fs.default.name</name>
>>>>>> <value>hdfs://localhost:9000</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>mapred.job.tracker</name>
>>>>>> <value>localhost:9001</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>> <value>-1</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>dfs.replication</name>
>>>>>> <value>1</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>dfs.permissions</name>
>>>>>> <value>false</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>webinterface.private.actions</name>
>>>>>> <value>true</value>
>>>>>> </property>
>>>>>> </configuration>
>>>>>>
>>>>>> These are errors from my log files:
>>>>>>
>>>>>>
>>>>>> 2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>>> localhost/
>>>>>> 127.0.0.1:9000
>>>>>> 2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>> Initializing
>>>>>> NameNodeMeterics using context
>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>> supergroup=supergroup
>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>> isPermissionEnabled=false
>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>>> Initializing FSNamesystemMetrics using context
>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>>> FSNamesystemStatusMBean
>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>>>>> initialization failed.
>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
>>>>>> state:
>>>>>> storage directory does not exist or is not accessible.
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>>  at
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>>  at
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>>  at
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
>>>>>> server
>>>>>> on 9000
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> =========================================================
>>>>>>
>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
>>>>>> connect
>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>> problem cleaning system directory: null
>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
>>>>>> connection exception: java.net.ConnectException: Connection refused: no
>>>>>> further information
>>>>>>  at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>>  at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>>  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>  at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>>  at
>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Brian
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>             
>>>>>           
>>>>         
>>>       
>>

Re: hadoop under cygwin issue

Posted by Ed Mazur <ma...@cs.umass.edu>.

Brian,

It looks like you're confusing your local file system with HDFS. HDFS
sits on top of your file system and is where data for (non-standalone)
Hadoop jobs comes from. You can poll it with "fs -ls ...", so do
something like "hadoop fs -lsr /" to see everything in HDFS. This will
probably shed some light on why your first attempt failed.
/user/brian/input should be a directory with several xml files.

Ed

On Wed, Feb 3, 2010 at 5:17 PM, Brian Wolf <br...@gmail.com> wrote:
> Alex Kozlov wrote:
>>
>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :       0
>>
>> You datanode is dead.  Look at the logs in the $HADOOP_HOME/logs directory
>> (or where your logs are) and check the errors.
>>
>> Alex K
>>
>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>>
>>
>
>
>
> Thanks for your help, Alex,
>
> I managed to get past that problem, now I have this problem:
>
> However, when I try to run this example as stated on the quickstart webpage:
>
> bin/hadoop jar hadoop-*-examples.jar grep input  output 'dfs[a-z.]+'
>
> I get this error;
> =============================================================
> java.io.IOException:       Not a file:
> hdfs://localhost:9000/user/brian/input/conf
> =========================================================
> so it seems to default to my home directory looking for "input" it
> apparently  needs an absolute filepath, however, when I  run that way:
>
> $ bin/hadoop jar hadoop-*-examples.jar grep /usr/local/hadoop-0.19.2/input
>  output 'dfs[a-z.]+'
>
> ==============================================================
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
> ==============================================================
> It still isn't happy although this part -> /usr/local/hadoop-0.19.2/input
>  <-  does exist
>>>
>>> Aaron,
>>>
>>> Thanks or your help. I  carefully went through the steps again a couple
>>> times , and ran
>>>
>>> after this
>>> bin/hadoop namenode -format
>>>
>>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>>
>>>
>>> then
>>>
>>>
>>> bin/start-dfs.sh
>>>
>>> and
>>>
>>> bin/start-all.sh
>>>
>>>
>>> and then
>>> bin/hadoop fs -put conf input
>>>
>>> now the return for this seemed cryptic:
>>>
>>>
>>> put: Target input/conf is a directory
>>>
>>> (??)
>>>
>>>  and when I tried
>>>
>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>
>>> It says something about 0 nodes
>>>
>>> (from log file)
>>>
>>> 2010-02-01 13:26:29,874 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>>
>>>  src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>  dst=null    perm=brian:supergroup:rw-r--r--
>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 3 on 9000, call
>>>
>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException:
>>> File
>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>> could
>>> only be replicated to 0 nodes, instead of 1
>>> java.io.IOException: File
>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>> could
>>> only be replicated to 0 nodes, instead of 1
>>>  at
>>>
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>  at
>>>
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>  at
>>>
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>  at
>>>
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>
>>>
>>>
>>>
>>> To maybe rule out something regarding ports or ssh , when I run netstat:
>>>
>>>  TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
>>>  TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
>>>
>>>
>>> and when I browse to http://localhost:50070/
>>>
>>>
>>>    Cluster Summary
>>>
>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB
>>> /
>>> 992.31 MB (0%)
>>> *
>>> Configured Capacity     :       0 KB
>>> DFS Used        :       0 KB
>>> Non DFS Used    :       0 KB
>>> DFS Remaining   :       0 KB
>>> DFS Used%       :       100 %
>>> DFS Remaining%  :       0 %
>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :       0
>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes>     :       0
>>>
>>>
>>> so I'm a bit still in the dark, I guess.
>>>
>>> Thanks
>>> Brian
>>>
>>>
>>>
>>>
>>> Aaron Kimball wrote:
>>>
>>>
>>>>
>>>> Brian, it looks like you missed a step in the instructions. You'll need
>>>> to
>>>> format the hdfs filesystem instance before starting the NameNode server:
>>>>
>>>> You need to run:
>>>>
>>>> $ bin/hadoop namenode -format
>>>>
>>>> .. then you can do bin/start-dfs.sh
>>>> Hope this helps,
>>>> - Aaron
>>>>
>>>>
>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
>>>>> hadoop "quickstart" web page.
>>>>>
>>>>> I know sshd is running and I can "ssh localhost" without a password.
>>>>>
>>>>> This is from my hadoop-site.xml
>>>>>
>>>>> <configuration>
>>>>> <property>
>>>>> <name>hadoop.tmp.dir</name>
>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>fs.default.name</name>
>>>>> <value>hdfs://localhost:9000</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>mapred.job.tracker</name>
>>>>> <value>localhost:9001</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>> <value>-1</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>dfs.replication</name>
>>>>> <value>1</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>dfs.permissions</name>
>>>>> <value>false</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>webinterface.private.actions</name>
>>>>> <value>true</value>
>>>>> </property>
>>>>> </configuration>
>>>>>
>>>>> These are errors from my log files:
>>>>>
>>>>>
>>>>> 2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>> 2010-01-30 00:03:33,121 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>> localhost/
>>>>> 127.0.0.1:9000
>>>>> 2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>>>> 2010-01-30 00:03:33,181 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>> Initializing
>>>>> NameNodeMeterics using context
>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>> 2010-01-30 00:03:34,603 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>> fsOwner=brian,None,Administrators,Users
>>>>> 2010-01-30 00:03:34,603 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>> supergroup=supergroup
>>>>> 2010-01-30 00:03:34,603 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>> isPermissionEnabled=false
>>>>> 2010-01-30 00:03:34,653 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>> Initializing FSNamesystemMetrics using context
>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>> 2010-01-30 00:03:34,653 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>> FSNamesystemStatusMBean
>>>>> 2010-01-30 00:03:34,803 INFO
>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>>>> initialization failed.
>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
>>>>> state:
>>>>> storage directory does not exist or is not accessible.
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>  at
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>  at
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>  at
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
>>>>> server
>>>>> on 9000
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> =========================================================
>>>>>
>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
>>>>> connect
>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>> problem cleaning system directory: null
>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
>>>>> connection exception: java.net.ConnectException: Connection refused: no
>>>>> further information
>>>>>  at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>  at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>  at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>  at
>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>> Brian
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>

Re: hadoop under cygwin issue

Posted by Brian Wolf <br...@gmail.com>.

Alex Kozlov wrote:
> Hi Brian,
>
> Is your namenode running?  Try 'hadoop fs -ls /'.
>
> Alex
>
>
> On Mar 12, 2010, at 5:20 PM, Brian Wolf <br...@gmail.com> wrote:
>
>> Hi Alex,
>>
>> I am back on this problem.  Seems it works, but I have this issue 
>> with connecting to server.
>> I can connect 'ssh localhost' ok.
>>
>> Thanks
>> Brian
>>
>> $ bin/hadoop jar hadoop-*-examples.jar pi 2 2
>> Number of Maps  = 2
>> Samples per Map = 2
>> 10/03/12 17:16:17 INFO ipc.Client: Retrying connect to server: 
>> localhost/127.0.0.1:9000. Already tried 0 time(s).
>> 10/03/12 17:16:19 INFO ipc.Client: Retrying connect to server: 
>> localhost/127.0.0.1:9000. Already tried 1 time(s).
>>
>>
>>
>> Alex Kozlov wrote:
>>> Can you endeavor a simpler job (just to make sure your setup works):
>>>
>>> $ hadoop jar $HADOOP_INSTALL/hadoop-*-examples.jar pi 2 2
>>>
>>> Alex K
>>>
>>> On Wed, Feb 3, 2010 at 8:26 PM, Brian Wolf <br...@gmail.com> wrote:
>>>
>>>
>>>> Alex, thanks for the help,  it seems to start now, however
>>>>
>>>>
>>>> $ bin/hadoop jar hadoop-*-examples.jar grep -fs local input output
>>>> 'dfs[a-z.]+'
>>>> 10/02/03 20:02:41 WARN fs.FileSystem: "local" is a deprecated 
>>>> filesystem
>>>> name. Use "file:///" instead.
>>>> 10/02/03 20:02:43 INFO mapred.FileInputFormat: Total input paths to 
>>>> process
>>>> : 3
>>>> 10/02/03 20:02:44 INFO mapred.JobClient: Running job: 
>>>> job_201002031354_0013
>>>> 10/02/03 20:02:45 INFO mapred.JobClient:  map 0% reduce 0%
>>>>
>>>>
>>>>
>>>> it hangs here (is pseudo cluster  supposed to work?)
>>>>
>>>>
>>>> these are bottom of various log files
>>>>
>>>> conf log file
>>>>
>>>>
>>>> <property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property> 
>>>>
>>>>
>>>> <property><name>mapred.input.dir</name><value>file:/C:/OpenSSH/usr/local/hadoop-0.19.2/input</value></property> 
>>>>
>>>> <property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030 
>>>>
>>>> </value></property>
>>>> <property><name>io.file.buffer.size</name><value>4096</value></property> 
>>>>
>>>>
>>>> <property><name>mapred.jobtracker.restart.recover</name><value>false</value></property> 
>>>>
>>>>
>>>> <property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property> 
>>>>
>>>>
>>>> <property><name>dfs.datanode.handler.count</name><value>3</value></property> 
>>>>
>>>>
>>>> <property><name>mapred.reduce.copy.backoff</name><value>300</value></property> 
>>>>
>>>> <property><name>mapred.task.profile</name><value>false</value></property> 
>>>>
>>>>
>>>> <property><name>dfs.replication.considerLoad</name><value>true</value></property> 
>>>>
>>>>
>>>> <property><name>jobclient.output.filter</name><value>FAILED</value></property> 
>>>>
>>>>
>>>> <property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property> 
>>>>
>>>>
>>>> <property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property> 
>>>>
>>>> <property><name>fs.checkpoint.size</name><value>67108864</value></property> 
>>>>
>>>>
>>>>
>>>> bottom
>>>> namenode log
>>>>
>>>> added to blk_6520091160827873550_1036 size 570
>>>> 2010-02-03 20:02:43,826 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml 
>>>>
>>>> dst=null    perm=brian:supergroup:rw-r--r--
>>>> 2010-02-03 20:02:43,866 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    
>>>> cmd=setPermission
>>>>   
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml 
>>>>
>>>>   dst=null    perm=brian:supergroup:rw-r--r--
>>>> 2010-02-03 20:02:44,026 INFO org.apache.hadoop.hdfs.StateChange: 
>>>> BLOCK*
>>>> NameSystem.allocateBlock:
>>>> /cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml.
>>>> blk_517844159758473296_1037
>>>> 2010-02-03 20:02:44,076 INFO org.apache.hadoop.hdfs.StateChange: 
>>>> BLOCK*
>>>> NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is 
>>>> added to
>>>> blk_517844159758473296_1037 size 16238
>>>> 2010-02-03 20:02:44,257 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml 
>>>>
>>>> dst=null    perm=null
>>>> 2010-02-03 20:02:44,527 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.jar 
>>>>
>>>> dst=null    perm=null
>>>> 2010-02-03 20:02:45,258 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.split 
>>>>
>>>>   dst=null    perm=null
>>>>
>>>>
>>>> bottom
>>>> datanode log
>>>>
>>>> 2010-02-03 20:02:44,046 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
>>>> blk_517844159758473296_1037 src: /127.0.0.1:4069 dest: 
>>>> /127.0.0.1:50010
>>>> 2010-02-03 20:02:44,076 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:4069, dest: /127.0.0.1:50010, bytes: 16238, op: HDFS_WRITE,
>>>> cliID: DFSClient_-1424524646, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>>> blk_517844159758473296_1037
>>>> 2010-02-03 20:02:44,086 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 
>>>> for block
>>>> blk_517844159758473296_1037 terminating
>>>> 2010-02-03 20:02:44,457 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:50010, dest: /127.0.0.1:4075, bytes: 16366, op: HDFS_READ,
>>>> cliID: DFSClient_-548531246, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>>> blk_517844159758473296_1037
>>>> 2010-02-03 20:02:44,677 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:50010, dest: /127.0.0.1:4076, bytes: 135168, op: HDFS_READ,
>>>> cliID: DFSClient_-548531246, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>>> blk_-2806977820057440405_1035
>>>> 2010-02-03 20:02:45,278 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:50010, dest: /127.0.0.1:4077, bytes: 578, op: HDFS_READ, 
>>>> cliID:
>>>> DFSClient_-548531246, srvID: 
>>>> DS-1812377383-192.168.1.5-50010-1265088397104,
>>>> blockid: blk_6520091160827873550_1036
>>>> 2010-02-03 20:04:10,451 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_3301977249866081256_1031
>>>> 2010-02-03 20:09:35,658 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_9116729021606317943_1025
>>>> 2010-02-03 20:09:44,671 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_8602436668984954947_1026
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> jobtracker log
>>>>
>>>> Input size for job job_201002031354_0012 = 53060
>>>> 2010-02-03 19:48:37,599 INFO 
>>>> org.apache.hadoop.mapred.JobInProgress: Split
>>>> info for job:job_201002031354_0012
>>>> 2010-02-03 19:48:37,649 INFO org.apache.hadoop.net.NetworkTopology: 
>>>> Adding
>>>> a new node: /default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000000 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000001 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000002 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000003 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO 
>>>> org.apache.hadoop.mapred.JobInProgress: Input
>>>> size for job job_201002031354_0013 = 53060
>>>> 2010-02-03 20:02:45,278 INFO 
>>>> org.apache.hadoop.mapred.JobInProgress: Split
>>>> info for job:job_201002031354_0013
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000000 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000001 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000002 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000003 has split on
>>>> node:/default-rack/localhost
>>>>
>>>>
>>>>
>>>> Thanks
>>>> Brian
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Alex Kozlov wrote:
>>>>
>>>>
>>>>> Try
>>>>>
>>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>>> file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
>>>>>
>>>>> file:/// is a magical prefix to force hadoop to look for the file 
>>>>> in the
>>>>> local FS
>>>>>
>>>>> You can also force it to look into local FS by giving '-fs local' 
>>>>> or '-fs
>>>>> file:///' option to the hadoop executable
>>>>>
>>>>> These options basically overwrite the *fs.default.name* configuration
>>>>> setting, which should be in your core-site.xml file
>>>>>
>>>>> You can also copy the content of the input directory to HDFS by 
>>>>> executing
>>>>>
>>>>> $ bin/hadoop fs -mkdir input
>>>>> $ bin/hadoop fs -copyFromLocal input/* input
>>>>>
>>>>> Hope this helps
>>>>>
>>>>> Alex K
>>>>>
>>>>> On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Alex Kozlov wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :
>>>>>>> 0
>>>>>>>
>>>>>>> You datanode is dead.  Look at the logs in the $HADOOP_HOME/logs
>>>>>>> directory
>>>>>>> (or where your logs are) and check the errors.
>>>>>>>
>>>>>>> Alex K
>>>>>>>
>>>>>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> 
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Thanks for your help, Alex,
>>>>>>
>>>>>> I managed to get past that problem, now I have this problem:
>>>>>>
>>>>>> However, when I try to run this example as stated on the quickstart
>>>>>> webpage:
>>>>>>
>>>>>>
>>>>>> bin/hadoop jar hadoop-*-examples.jar grep input  output 'dfs[a-z.]+'
>>>>>>
>>>>>> I get this error;
>>>>>> =============================================================
>>>>>> java.io.IOException:       Not a file:
>>>>>> hdfs://localhost:9000/user/brian/input/conf
>>>>>> =========================================================
>>>>>> so it seems to default to my home directory looking for "input" it
>>>>>> apparently  needs an absolute filepath, however, when I  run that 
>>>>>> way:
>>>>>>
>>>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>>>> /usr/local/hadoop-0.19.2/input
>>>>>> output 'dfs[a-z.]+'
>>>>>>
>>>>>> ==============================================================
>>>>>> org.apache.hadoop.mapred.InvalidInputException: Input path does not
>>>>>> exist:
>>>>>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>>>>>> ==============================================================
>>>>>> It still isn't happy although this part -> 
>>>>>> /usr/local/hadoop-0.19.2/input
>>>>>> <-  does exist
>>>>>>
>>>>>> Aaron,
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Thanks or your help. I  carefully went through the steps again a 
>>>>>>> couple
>>>>>>>
>>>>>>>> times , and ran
>>>>>>>>
>>>>>>>> after this
>>>>>>>> bin/hadoop namenode -format
>>>>>>>>
>>>>>>>> (by the way, it asks if I want to reformat, I've tried it both 
>>>>>>>> ways)
>>>>>>>>
>>>>>>>>
>>>>>>>> then
>>>>>>>>
>>>>>>>>
>>>>>>>> bin/start-dfs.sh
>>>>>>>>
>>>>>>>> and
>>>>>>>>
>>>>>>>> bin/start-all.sh
>>>>>>>>
>>>>>>>>
>>>>>>>> and then
>>>>>>>> bin/hadoop fs -put conf input
>>>>>>>>
>>>>>>>> now the return for this seemed cryptic:
>>>>>>>>
>>>>>>>>
>>>>>>>> put: Target input/conf is a directory
>>>>>>>>
>>>>>>>> (??)
>>>>>>>>
>>>>>>>> and when I tried
>>>>>>>>
>>>>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 
>>>>>>>> 'dfs[a-z.]+'
>>>>>>>>
>>>>>>>> It says something about 0 nodes
>>>>>>>>
>>>>>>>> (from log file)
>>>>>>>>
>>>>>>>> 2010-02-01 13:26:29,874 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>>>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>>>>>>>
>>>>>>>>
>>>>>>>> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar 
>>>>>>>>
>>>>>>>> dst=null    perm=brian:supergroup:rw-r--r--
>>>>>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC 
>>>>>>>> Server
>>>>>>>> handler 3 on 9000, call
>>>>>>>>
>>>>>>>>
>>>>>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar, 
>>>>>>>>
>>>>>>>> DFSClient_725490811) from 127.0.0.1:3003: error: 
>>>>>>>> java.io.IOException:
>>>>>>>> File
>>>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar 
>>>>>>>>
>>>>>>>> could
>>>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>>>> java.io.IOException: File
>>>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar 
>>>>>>>>
>>>>>>>> could
>>>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287) 
>>>>>>>>
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351) 
>>>>>>>>
>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
>>>>>>>>
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> To maybe rule out something regarding ports or ssh , when I run
>>>>>>>> netstat:
>>>>>>>>
>>>>>>>> TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
>>>>>>>> TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
>>>>>>>>
>>>>>>>>
>>>>>>>> and when I browse to http://localhost:50070/
>>>>>>>>
>>>>>>>>
>>>>>>>>  Cluster Summary
>>>>>>>>
>>>>>>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size 
>>>>>>>> is 8.01
>>>>>>>> MB
>>>>>>>> /
>>>>>>>> 992.31 MB (0%)
>>>>>>>> *
>>>>>>>> Configured Capacity     :       0 KB
>>>>>>>> DFS Used        :       0 KB
>>>>>>>> Non DFS Used    :       0 KB
>>>>>>>> DFS Remaining   :       0 KB
>>>>>>>> DFS Used%       :       100 %
>>>>>>>> DFS Remaining%  :       0 %
>>>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :
>>>>>>>> 0
>>>>>>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes>     :
>>>>>>>> 0
>>>>>>>>
>>>>>>>>
>>>>>>>> so I'm a bit still in the dark, I guess.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Brian
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Aaron Kimball wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Brian, it looks like you missed a step in the instructions. 
>>>>>>>>> You'll
>>>>>>>>> need
>>>>>>>>> to
>>>>>>>>> format the hdfs filesystem instance before starting the NameNode
>>>>>>>>> server:
>>>>>>>>>
>>>>>>>>> You need to run:
>>>>>>>>>
>>>>>>>>> $ bin/hadoop namenode -format
>>>>>>>>>
>>>>>>>>> .. then you can do bin/start-dfs.sh
>>>>>>>>> Hope this helps,
>>>>>>>>> - Aaron
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per 
>>>>>>>>>> directions on
>>>>>>>>>> the
>>>>>>>>>> hadoop "quickstart" web page.
>>>>>>>>>>
>>>>>>>>>> I know sshd is running and I can "ssh localhost" without a 
>>>>>>>>>> password.
>>>>>>>>>>
>>>>>>>>>> This is from my hadoop-site.xml
>>>>>>>>>>
>>>>>>>>>> <configuration>
>>>>>>>>>> <property>
>>>>>>>>>> <name>hadoop.tmp.dir</name>
>>>>>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>fs.default.name</name>
>>>>>>>>>> <value>hdfs://localhost:9000</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>mapred.job.tracker</name>
>>>>>>>>>> <value>localhost:9001</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>>>>>> <value>-1</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>dfs.replication</name>
>>>>>>>>>> <value>1</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>dfs.permissions</name>
>>>>>>>>>> <value>false</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>webinterface.private.actions</name>
>>>>>>>>>> <value>true</value>
>>>>>>>>>> </property>
>>>>>>>>>> </configuration>
>>>>>>>>>>
>>>>>>>>>> These are errors from my log files:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2010-01-30 00:03:33,091 INFO
>>>>>>>>>> org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>>>>>>> localhost/
>>>>>>>>>> 127.0.0.1:9000
>>>>>>>>>> 2010-01-30 00:03:33,161 INFO
>>>>>>>>>> org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>>>>>> Initializing JVM Metrics with processName=NameNode, 
>>>>>>>>>> sessionId=null
>>>>>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>>>>>> Initializing
>>>>>>>>>> NameNodeMeterics using context
>>>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> supergroup=supergroup
>>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> isPermissionEnabled=false
>>>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: 
>>>>>>>>>>
>>>>>>>>>> Initializing FSNamesystemMetrics using context
>>>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>>>>>>> FSNamesystemStatusMBean
>>>>>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does 
>>>>>>>>>> not exist.
>>>>>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
>>>>>>>>>> FSNamesystem
>>>>>>>>>> initialization failed.
>>>>>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: 
>>>>>>>>>>
>>>>>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an 
>>>>>>>>>> inconsistent
>>>>>>>>>> state:
>>>>>>>>>> storage directory does not exist or is not accessible.
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868) 
>>>>>>>>>>
>>>>>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: 
>>>>>>>>>> Stopping
>>>>>>>>>> server
>>>>>>>>>> on 9000
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> =========================================================
>>>>>>>>>>
>>>>>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: 
>>>>>>>>>> Retrying
>>>>>>>>>> connect
>>>>>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>>>>>> problem cleaning system directory: null
>>>>>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 
>>>>>>>>>> failed
>>>>>>>>>> on
>>>>>>>>>> connection exception: java.net.ConnectException: Connection 
>>>>>>>>>> refused:
>>>>>>>>>> no
>>>>>>>>>> further information
>>>>>>>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>>>>>> at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>>>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>>>>> at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104) 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Brian
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
Hi Alex,

Im using a different system, it seems to be running better now.

Thanks
Brian

Re: hadoop under cygwin issue

Posted by Brian Wolf <br...@gmail.com>.

Hi Alex,

seems to:

$ bin/hadoop fs -ls /
Found 1 items
drwxr-xr-x   - brian supergroup          0 2010-03-13 10:45 /tmp

However, I think this might be the source of the problems, whenever I 
invoke any of the scripts, I  get always get these issues:

localhost: /usr/bin/bash: /usr/local/hadoop-0.20.2/bin/hadoop-daemon.sh: 
No such file or directory

I 'm thinking this is something to do with cygwin (?). Ive been careful 
not to open these files with a windows editor (I've already been through 
that headache!)

which I guess I have been  ignoring, but I thnk what ever hadoop-daemon 
is suppossed to do isn't getting done.

However, I have tried to invoke it by hand by echoing out what I guess 
the arguments are supposed to be , like "hadoop-daemon start datanode"  
, but that doesn't seem to work, ie


(also, is there a minimum amt of hd space required, as I have only 1 gig 
or so free )
like:

after i run start-all.sh, I run

$ bin/hadoop-daemon.sh start datanode
starting datanode, logging to 
/usr/local/hadoop-0.20.2/bin/../logs/hadoop-brian-datanode-wynn6266448332.out


ok, but then I try to run the grep example, I get these errors:

2010-03-13 11:27:57,149 WARN org.apache.hadoop.hdfs.DFSClient: Error 
Recovery for block null bad datanode[0] nodes == null
2010-03-13 11:27:57,149 WARN org.apache.hadoop.hdfs.DFSClient: Could not 
get block locations. Source file 
"/tmp/hadoop-SYSTEM/mapred/system/jobtracker.info" - Aborting...
2010-03-13 11:27:57,149 WARN org.apache.hadoop.mapred.JobTracker: 
Writing to file 
hdfs://localhost:9000/tmp/hadoop-SYSTEM/mapred/system/jobtracker.info 
failed!
2010-03-13 11:27:57,149 WARN org.apache.hadoop.mapred.JobTracker: 
FileSystem is not ready yet!
2010-03-13 11:27:57,369 WARN org.apache.hadoop.mapred.JobTracker: Failed 
to initialize recovery manager.
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
/tmp/hadoop-SYSTEM/mapred/system/jobtracker.info could only be 
replicated to 0 nodes, instead of 1










Alex Kozlov wrote:
> Hi Brian,
>
> Is your namenode running?  Try 'hadoop fs -ls /'.
>
> Alex
>
>
> On Mar 12, 2010, at 5:20 PM, Brian Wolf <br...@gmail.com> wrote:
>
>> Hi Alex,
>>
>> I am back on this problem.  Seems it works, but I have this issue 
>> with connecting to server.
>> I can connect 'ssh localhost' ok.
>>
>> Thanks
>> Brian
>>
>> $ bin/hadoop jar hadoop-*-examples.jar pi 2 2
>> Number of Maps  = 2
>> Samples per Map = 2
>> 10/03/12 17:16:17 INFO ipc.Client: Retrying connect to server: 
>> localhost/127.0.0.1:9000. Already tried 0 time(s).
>> 10/03/12 17:16:19 INFO ipc.Client: Retrying connect to server: 
>> localhost/127.0.0.1:9000. Already tried 1 time(s).
>>
>>
>>
>> Alex Kozlov wrote:
>>> Can you endeavor a simpler job (just to make sure your setup works):
>>>
>>> $ hadoop jar $HADOOP_INSTALL/hadoop-*-examples.jar pi 2 2
>>>
>>> Alex K
>>>
>>> On Wed, Feb 3, 2010 at 8:26 PM, Brian Wolf <br...@gmail.com> wrote:
>>>
>>>
>>>> Alex, thanks for the help,  it seems to start now, however
>>>>
>>>>
>>>> $ bin/hadoop jar hadoop-*-examples.jar grep -fs local input output
>>>> 'dfs[a-z.]+'
>>>> 10/02/03 20:02:41 WARN fs.FileSystem: "local" is a deprecated 
>>>> filesystem
>>>> name. Use "file:///" instead.
>>>> 10/02/03 20:02:43 INFO mapred.FileInputFormat: Total input paths to 
>>>> process
>>>> : 3
>>>> 10/02/03 20:02:44 INFO mapred.JobClient: Running job: 
>>>> job_201002031354_0013
>>>> 10/02/03 20:02:45 INFO mapred.JobClient:  map 0% reduce 0%
>>>>
>>>>
>>>>
>>>> it hangs here (is pseudo cluster  supposed to work?)
>>>>
>>>>
>>>> these are bottom of various log files
>>>>
>>>> conf log file
>>>>
>>>>
>>>> <property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property> 
>>>>
>>>>
>>>> <property><name>mapred.input.dir</name><value>file:/C:/OpenSSH/usr/local/hadoop-0.19.2/input</value></property> 
>>>>
>>>> <property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030 
>>>>
>>>> </value></property>
>>>> <property><name>io.file.buffer.size</name><value>4096</value></property> 
>>>>
>>>>
>>>> <property><name>mapred.jobtracker.restart.recover</name><value>false</value></property> 
>>>>
>>>>
>>>> <property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property> 
>>>>
>>>>
>>>> <property><name>dfs.datanode.handler.count</name><value>3</value></property> 
>>>>
>>>>
>>>> <property><name>mapred.reduce.copy.backoff</name><value>300</value></property> 
>>>>
>>>> <property><name>mapred.task.profile</name><value>false</value></property> 
>>>>
>>>>
>>>> <property><name>dfs.replication.considerLoad</name><value>true</value></property> 
>>>>
>>>>
>>>> <property><name>jobclient.output.filter</name><value>FAILED</value></property> 
>>>>
>>>>
>>>> <property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property> 
>>>>
>>>>
>>>> <property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property> 
>>>>
>>>> <property><name>fs.checkpoint.size</name><value>67108864</value></property> 
>>>>
>>>>
>>>>
>>>> bottom
>>>> namenode log
>>>>
>>>> added to blk_6520091160827873550_1036 size 570
>>>> 2010-02-03 20:02:43,826 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml 
>>>>
>>>> dst=null    perm=brian:supergroup:rw-r--r--
>>>> 2010-02-03 20:02:43,866 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    
>>>> cmd=setPermission
>>>>   
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml 
>>>>
>>>>   dst=null    perm=brian:supergroup:rw-r--r--
>>>> 2010-02-03 20:02:44,026 INFO org.apache.hadoop.hdfs.StateChange: 
>>>> BLOCK*
>>>> NameSystem.allocateBlock:
>>>> /cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml.
>>>> blk_517844159758473296_1037
>>>> 2010-02-03 20:02:44,076 INFO org.apache.hadoop.hdfs.StateChange: 
>>>> BLOCK*
>>>> NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is 
>>>> added to
>>>> blk_517844159758473296_1037 size 16238
>>>> 2010-02-03 20:02:44,257 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml 
>>>>
>>>> dst=null    perm=null
>>>> 2010-02-03 20:02:44,527 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.jar 
>>>>
>>>> dst=null    perm=null
>>>> 2010-02-03 20:02:45,258 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.split 
>>>>
>>>>   dst=null    perm=null
>>>>
>>>>
>>>> bottom
>>>> datanode log
>>>>
>>>> 2010-02-03 20:02:44,046 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
>>>> blk_517844159758473296_1037 src: /127.0.0.1:4069 dest: 
>>>> /127.0.0.1:50010
>>>> 2010-02-03 20:02:44,076 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:4069, dest: /127.0.0.1:50010, bytes: 16238, op: HDFS_WRITE,
>>>> cliID: DFSClient_-1424524646, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>>> blk_517844159758473296_1037
>>>> 2010-02-03 20:02:44,086 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 
>>>> for block
>>>> blk_517844159758473296_1037 terminating
>>>> 2010-02-03 20:02:44,457 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:50010, dest: /127.0.0.1:4075, bytes: 16366, op: HDFS_READ,
>>>> cliID: DFSClient_-548531246, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>>> blk_517844159758473296_1037
>>>> 2010-02-03 20:02:44,677 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:50010, dest: /127.0.0.1:4076, bytes: 135168, op: HDFS_READ,
>>>> cliID: DFSClient_-548531246, srvID:
>>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>>> blk_-2806977820057440405_1035
>>>> 2010-02-03 20:02:45,278 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>>> 127.0.0.1:50010, dest: /127.0.0.1:4077, bytes: 578, op: HDFS_READ, 
>>>> cliID:
>>>> DFSClient_-548531246, srvID: 
>>>> DS-1812377383-192.168.1.5-50010-1265088397104,
>>>> blockid: blk_6520091160827873550_1036
>>>> 2010-02-03 20:04:10,451 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_3301977249866081256_1031
>>>> 2010-02-03 20:09:35,658 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_9116729021606317943_1025
>>>> 2010-02-03 20:09:44,671 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>>>> succeeded for blk_8602436668984954947_1026
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> jobtracker log
>>>>
>>>> Input size for job job_201002031354_0012 = 53060
>>>> 2010-02-03 19:48:37,599 INFO 
>>>> org.apache.hadoop.mapred.JobInProgress: Split
>>>> info for job:job_201002031354_0012
>>>> 2010-02-03 19:48:37,649 INFO org.apache.hadoop.net.NetworkTopology: 
>>>> Adding
>>>> a new node: /default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000000 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000001 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000002 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0012_m_000003 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO 
>>>> org.apache.hadoop.mapred.JobInProgress: Input
>>>> size for job job_201002031354_0013 = 53060
>>>> 2010-02-03 20:02:45,278 INFO 
>>>> org.apache.hadoop.mapred.JobInProgress: Split
>>>> info for job:job_201002031354_0013
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000000 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000001 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000002 has split on
>>>> node:/default-rack/localhost
>>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>>> tip:task_201002031354_0013_m_000003 has split on
>>>> node:/default-rack/localhost
>>>>
>>>>
>>>>
>>>> Thanks
>>>> Brian
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Alex Kozlov wrote:
>>>>
>>>>
>>>>> Try
>>>>>
>>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>>> file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
>>>>>
>>>>> file:/// is a magical prefix to force hadoop to look for the file 
>>>>> in the
>>>>> local FS
>>>>>
>>>>> You can also force it to look into local FS by giving '-fs local' 
>>>>> or '-fs
>>>>> file:///' option to the hadoop executable
>>>>>
>>>>> These options basically overwrite the *fs.default.name* configuration
>>>>> setting, which should be in your core-site.xml file
>>>>>
>>>>> You can also copy the content of the input directory to HDFS by 
>>>>> executing
>>>>>
>>>>> $ bin/hadoop fs -mkdir input
>>>>> $ bin/hadoop fs -copyFromLocal input/* input
>>>>>
>>>>> Hope this helps
>>>>>
>>>>> Alex K
>>>>>
>>>>> On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Alex Kozlov wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :
>>>>>>> 0
>>>>>>>
>>>>>>> You datanode is dead.  Look at the logs in the $HADOOP_HOME/logs
>>>>>>> directory
>>>>>>> (or where your logs are) and check the errors.
>>>>>>>
>>>>>>> Alex K
>>>>>>>
>>>>>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> 
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Thanks for your help, Alex,
>>>>>>
>>>>>> I managed to get past that problem, now I have this problem:
>>>>>>
>>>>>> However, when I try to run this example as stated on the quickstart
>>>>>> webpage:
>>>>>>
>>>>>>
>>>>>> bin/hadoop jar hadoop-*-examples.jar grep input  output 'dfs[a-z.]+'
>>>>>>
>>>>>> I get this error;
>>>>>> =============================================================
>>>>>> java.io.IOException:       Not a file:
>>>>>> hdfs://localhost:9000/user/brian/input/conf
>>>>>> =========================================================
>>>>>> so it seems to default to my home directory looking for "input" it
>>>>>> apparently  needs an absolute filepath, however, when I  run that 
>>>>>> way:
>>>>>>
>>>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>>>> /usr/local/hadoop-0.19.2/input
>>>>>> output 'dfs[a-z.]+'
>>>>>>
>>>>>> ==============================================================
>>>>>> org.apache.hadoop.mapred.InvalidInputException: Input path does not
>>>>>> exist:
>>>>>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>>>>>> ==============================================================
>>>>>> It still isn't happy although this part -> 
>>>>>> /usr/local/hadoop-0.19.2/input
>>>>>> <-  does exist
>>>>>>
>>>>>> Aaron,
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Thanks or your help. I  carefully went through the steps again a 
>>>>>>> couple
>>>>>>>
>>>>>>>> times , and ran
>>>>>>>>
>>>>>>>> after this
>>>>>>>> bin/hadoop namenode -format
>>>>>>>>
>>>>>>>> (by the way, it asks if I want to reformat, I've tried it both 
>>>>>>>> ways)
>>>>>>>>
>>>>>>>>
>>>>>>>> then
>>>>>>>>
>>>>>>>>
>>>>>>>> bin/start-dfs.sh
>>>>>>>>
>>>>>>>> and
>>>>>>>>
>>>>>>>> bin/start-all.sh
>>>>>>>>
>>>>>>>>
>>>>>>>> and then
>>>>>>>> bin/hadoop fs -put conf input
>>>>>>>>
>>>>>>>> now the return for this seemed cryptic:
>>>>>>>>
>>>>>>>>
>>>>>>>> put: Target input/conf is a directory
>>>>>>>>
>>>>>>>> (??)
>>>>>>>>
>>>>>>>> and when I tried
>>>>>>>>
>>>>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 
>>>>>>>> 'dfs[a-z.]+'
>>>>>>>>
>>>>>>>> It says something about 0 nodes
>>>>>>>>
>>>>>>>> (from log file)
>>>>>>>>
>>>>>>>> 2010-02-01 13:26:29,874 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>>>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>>>>>>>
>>>>>>>>
>>>>>>>> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar 
>>>>>>>>
>>>>>>>> dst=null    perm=brian:supergroup:rw-r--r--
>>>>>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC 
>>>>>>>> Server
>>>>>>>> handler 3 on 9000, call
>>>>>>>>
>>>>>>>>
>>>>>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar, 
>>>>>>>>
>>>>>>>> DFSClient_725490811) from 127.0.0.1:3003: error: 
>>>>>>>> java.io.IOException:
>>>>>>>> File
>>>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar 
>>>>>>>>
>>>>>>>> could
>>>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>>>> java.io.IOException: File
>>>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar 
>>>>>>>>
>>>>>>>> could
>>>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287) 
>>>>>>>>
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351) 
>>>>>>>>
>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
>>>>>>>>
>>>>>>>> at
>>>>>>>>
>>>>>>>>
>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> To maybe rule out something regarding ports or ssh , when I run
>>>>>>>> netstat:
>>>>>>>>
>>>>>>>> TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
>>>>>>>> TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
>>>>>>>>
>>>>>>>>
>>>>>>>> and when I browse to http://localhost:50070/
>>>>>>>>
>>>>>>>>
>>>>>>>>  Cluster Summary
>>>>>>>>
>>>>>>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size 
>>>>>>>> is 8.01
>>>>>>>> MB
>>>>>>>> /
>>>>>>>> 992.31 MB (0%)
>>>>>>>> *
>>>>>>>> Configured Capacity     :       0 KB
>>>>>>>> DFS Used        :       0 KB
>>>>>>>> Non DFS Used    :       0 KB
>>>>>>>> DFS Remaining   :       0 KB
>>>>>>>> DFS Used%       :       100 %
>>>>>>>> DFS Remaining%  :       0 %
>>>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :
>>>>>>>> 0
>>>>>>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes>     :
>>>>>>>> 0
>>>>>>>>
>>>>>>>>
>>>>>>>> so I'm a bit still in the dark, I guess.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Brian
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Aaron Kimball wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Brian, it looks like you missed a step in the instructions. 
>>>>>>>>> You'll
>>>>>>>>> need
>>>>>>>>> to
>>>>>>>>> format the hdfs filesystem instance before starting the NameNode
>>>>>>>>> server:
>>>>>>>>>
>>>>>>>>> You need to run:
>>>>>>>>>
>>>>>>>>> $ bin/hadoop namenode -format
>>>>>>>>>
>>>>>>>>> .. then you can do bin/start-dfs.sh
>>>>>>>>> Hope this helps,
>>>>>>>>> - Aaron
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per 
>>>>>>>>>> directions on
>>>>>>>>>> the
>>>>>>>>>> hadoop "quickstart" web page.
>>>>>>>>>>
>>>>>>>>>> I know sshd is running and I can "ssh localhost" without a 
>>>>>>>>>> password.
>>>>>>>>>>
>>>>>>>>>> This is from my hadoop-site.xml
>>>>>>>>>>
>>>>>>>>>> <configuration>
>>>>>>>>>> <property>
>>>>>>>>>> <name>hadoop.tmp.dir</name>
>>>>>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>fs.default.name</name>
>>>>>>>>>> <value>hdfs://localhost:9000</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>mapred.job.tracker</name>
>>>>>>>>>> <value>localhost:9001</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>>>>>> <value>-1</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>dfs.replication</name>
>>>>>>>>>> <value>1</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>dfs.permissions</name>
>>>>>>>>>> <value>false</value>
>>>>>>>>>> </property>
>>>>>>>>>> <property>
>>>>>>>>>> <name>webinterface.private.actions</name>
>>>>>>>>>> <value>true</value>
>>>>>>>>>> </property>
>>>>>>>>>> </configuration>
>>>>>>>>>>
>>>>>>>>>> These are errors from my log files:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2010-01-30 00:03:33,091 INFO
>>>>>>>>>> org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>>>>>>> localhost/
>>>>>>>>>> 127.0.0.1:9000
>>>>>>>>>> 2010-01-30 00:03:33,161 INFO
>>>>>>>>>> org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>>>>>> Initializing JVM Metrics with processName=NameNode, 
>>>>>>>>>> sessionId=null
>>>>>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>>>>>> Initializing
>>>>>>>>>> NameNodeMeterics using context
>>>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> supergroup=supergroup
>>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>>> isPermissionEnabled=false
>>>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: 
>>>>>>>>>>
>>>>>>>>>> Initializing FSNamesystemMetrics using context
>>>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>>>>>>> FSNamesystemStatusMBean
>>>>>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does 
>>>>>>>>>> not exist.
>>>>>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
>>>>>>>>>> FSNamesystem
>>>>>>>>>> initialization failed.
>>>>>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: 
>>>>>>>>>>
>>>>>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an 
>>>>>>>>>> inconsistent
>>>>>>>>>> state:
>>>>>>>>>> storage directory does not exist or is not accessible.
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859) 
>>>>>>>>>>
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868) 
>>>>>>>>>>
>>>>>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: 
>>>>>>>>>> Stopping
>>>>>>>>>> server
>>>>>>>>>> on 9000
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> =========================================================
>>>>>>>>>>
>>>>>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: 
>>>>>>>>>> Retrying
>>>>>>>>>> connect
>>>>>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>>>>>> problem cleaning system directory: null
>>>>>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 
>>>>>>>>>> failed
>>>>>>>>>> on
>>>>>>>>>> connection exception: java.net.ConnectException: Connection 
>>>>>>>>>> refused:
>>>>>>>>>> no
>>>>>>>>>> further information
>>>>>>>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>>>>>> at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>>>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>>>>> at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>>>>>> at
>>>>>>>>>>
>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104) 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Brian
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>

Re: hadoop under cygwin issue

Posted by Alex Kozlov <al...@cloudera.com>.

Hi Brian,

Is your namenode running?  Try 'hadoop fs -ls /'.

Alex


On Mar 12, 2010, at 5:20 PM, Brian Wolf <br...@gmail.com> wrote:

> Hi Alex,
>
> I am back on this problem.  Seems it works, but I have this issue  
> with connecting to server.
> I can connect 'ssh localhost' ok.
>
> Thanks
> Brian
>
> $ bin/hadoop jar hadoop-*-examples.jar pi 2 2
> Number of Maps  = 2
> Samples per Map = 2
> 10/03/12 17:16:17 INFO ipc.Client: Retrying connect to server:  
> localhost/127.0.0.1:9000. Already tried 0 time(s).
> 10/03/12 17:16:19 INFO ipc.Client: Retrying connect to server:  
> localhost/127.0.0.1:9000. Already tried 1 time(s).
>
>
>
> Alex Kozlov wrote:
>> Can you endeavor a simpler job (just to make sure your setup works):
>>
>> $ hadoop jar $HADOOP_INSTALL/hadoop-*-examples.jar pi 2 2
>>
>> Alex K
>>
>> On Wed, Feb 3, 2010 at 8:26 PM, Brian Wolf <br...@gmail.com> wrote:
>>
>>
>>> Alex, thanks for the help,  it seems to start now, however
>>>
>>>
>>> $ bin/hadoop jar hadoop-*-examples.jar grep -fs local input output
>>> 'dfs[a-z.]+'
>>> 10/02/03 20:02:41 WARN fs.FileSystem: "local" is a deprecated  
>>> filesystem
>>> name. Use "file:///" instead.
>>> 10/02/03 20:02:43 INFO mapred.FileInputFormat: Total input paths  
>>> to process
>>> : 3
>>> 10/02/03 20:02:44 INFO mapred.JobClient: Running job:  
>>> job_201002031354_0013
>>> 10/02/03 20:02:45 INFO mapred.JobClient:  map 0% reduce 0%
>>>
>>>
>>>
>>> it hangs here (is pseudo cluster  supposed to work?)
>>>
>>>
>>> these are bottom of various log files
>>>
>>> conf log file
>>>
>>>
>>> <property><name>fs.s3.impl</ 
>>> name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
>>>
>>> <property><name>mapred.input.dir</name><value>file:/C:/OpenSSH/usr/ 
>>> local/hadoop-0.19.2/input</value></property>
>>> <property><name>mapred.job.tracker.http.address</ 
>>> name><value>0.0.0.0:50030
>>> </value></property>
>>> <property><name>io.file.buffer.size</name><value>4096</value></ 
>>> property>
>>>
>>> <property><name>mapred.jobtracker.restart.recover</ 
>>> name><value>false</value></property>
>>>
>>> <property><name>io.serializations</ 
>>> name><value>org.apache.hadoop.io.serializer.WritableSerialization</ 
>>> value></property>
>>>
>>> <property><name>dfs.datanode.handler.count</name><value>3</value></ 
>>> property>
>>>
>>> <property><name>mapred.reduce.copy.backoff</name><value>300</ 
>>> value></property>
>>> <property><name>mapred.task.profile</name><value>false</value></ 
>>> property>
>>>
>>> <property><name>dfs.replication.considerLoad</name><value>true</ 
>>> value></property>
>>>
>>> <property><name>jobclient.output.filter</name><value>FAILED</ 
>>> value></property>
>>>
>>> <property><name>mapred.tasktracker.map.tasks.maximum</ 
>>> name><value>2</value></property>
>>>
>>> <property><name>io.compression.codecs</ 
>>> name> 
>>> <value> 
>>> org.apache.hadoop.io.compress.DefaultCodec, 
>>> org.apache.hadoop.io.compress.GzipCodec, 
>>> org.apache.hadoop.io.compress.BZip2Codec</value></property>
>>> <property><name>fs.checkpoint.size</name><value>67108864</value></ 
>>> property>
>>>
>>>
>>> bottom
>>> namenode log
>>>
>>> added to blk_6520091160827873550_1036 size 570
>>> 2010-02-03 20:02:43,826 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/ 
>>> job.xml
>>> dst=null    perm=brian:supergroup:rw-r--r--
>>> 2010-02-03 20:02:43,866 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1     
>>> cmd=setPermission
>>>   src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/ 
>>> job.xml
>>>   dst=null    perm=brian:supergroup:rw-r--r--
>>> 2010-02-03 20:02:44,026 INFO org.apache.hadoop.hdfs.StateChange:  
>>> BLOCK*
>>> NameSystem.allocateBlock:
>>> /cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/ 
>>> job.xml.
>>> blk_517844159758473296_1037
>>> 2010-02-03 20:02:44,076 INFO org.apache.hadoop.hdfs.StateChange:  
>>> BLOCK*
>>> NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is  
>>> added to
>>> blk_517844159758473296_1037 size 16238
>>> 2010-02-03 20:02:44,257 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/ 
>>> job.xml
>>> dst=null    perm=null
>>> 2010-02-03 20:02:44,527 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/ 
>>> job.jar
>>> dst=null    perm=null
>>> 2010-02-03 20:02:45,258 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>>> src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/ 
>>> job.split
>>>   dst=null    perm=null
>>>
>>>
>>> bottom
>>> datanode log
>>>
>>> 2010-02-03 20:02:44,046 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
>>> blk_517844159758473296_1037 src: /127.0.0.1:4069 dest: / 
>>> 127.0.0.1:50010
>>> 2010-02-03 20:02:44,076 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>> 127.0.0.1:4069, dest: /127.0.0.1:50010, bytes: 16238, op:  
>>> HDFS_WRITE,
>>> cliID: DFSClient_-1424524646, srvID:
>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>> blk_517844159758473296_1037
>>> 2010-02-03 20:02:44,086 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0  
>>> for block
>>> blk_517844159758473296_1037 terminating
>>> 2010-02-03 20:02:44,457 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>> 127.0.0.1:50010, dest: /127.0.0.1:4075, bytes: 16366, op: HDFS_READ,
>>> cliID: DFSClient_-548531246, srvID:
>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>> blk_517844159758473296_1037
>>> 2010-02-03 20:02:44,677 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>> 127.0.0.1:50010, dest: /127.0.0.1:4076, bytes: 135168, op:  
>>> HDFS_READ,
>>> cliID: DFSClient_-548531246, srvID:
>>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>>> blk_-2806977820057440405_1035
>>> 2010-02-03 20:02:45,278 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>>> 127.0.0.1:50010, dest: /127.0.0.1:4077, bytes: 578, op: HDFS_READ,  
>>> cliID:
>>> DFSClient_-548531246, srvID:  
>>> DS-1812377383-192.168.1.5-50010-1265088397104,
>>> blockid: blk_6520091160827873550_1036
>>> 2010-02-03 20:04:10,451 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner:  
>>> Verification
>>> succeeded for blk_3301977249866081256_1031
>>> 2010-02-03 20:09:35,658 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner:  
>>> Verification
>>> succeeded for blk_9116729021606317943_1025
>>> 2010-02-03 20:09:44,671 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner:  
>>> Verification
>>> succeeded for blk_8602436668984954947_1026
>>>
>>>
>>>
>>>
>>>
>>> jobtracker log
>>>
>>> Input size for job job_201002031354_0012 = 53060
>>> 2010-02-03 19:48:37,599 INFO  
>>> org.apache.hadoop.mapred.JobInProgress: Split
>>> info for job:job_201002031354_0012
>>> 2010-02-03 19:48:37,649 INFO  
>>> org.apache.hadoop.net.NetworkTopology: Adding
>>> a new node: /default-rack/localhost
>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0012_m_000000 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0012_m_000001 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0012_m_000002 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0012_m_000003 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 20:02:45,278 INFO  
>>> org.apache.hadoop.mapred.JobInProgress: Input
>>> size for job job_201002031354_0013 = 53060
>>> 2010-02-03 20:02:45,278 INFO  
>>> org.apache.hadoop.mapred.JobInProgress: Split
>>> info for job:job_201002031354_0013
>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0013_m_000000 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0013_m_000001 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0013_m_000002 has split on
>>> node:/default-rack/localhost
>>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>>> tip:task_201002031354_0013_m_000003 has split on
>>> node:/default-rack/localhost
>>>
>>>
>>>
>>> Thanks
>>> Brian
>>>
>>>
>>>
>>>
>>>
>>> Alex Kozlov wrote:
>>>
>>>
>>>> Try
>>>>
>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>> file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
>>>>
>>>> file:/// is a magical prefix to force hadoop to look for the file  
>>>> in the
>>>> local FS
>>>>
>>>> You can also force it to look into local FS by giving '-fs local'  
>>>> or '-fs
>>>> file:///' option to the hadoop executable
>>>>
>>>> These options basically overwrite the *fs.default.name*  
>>>> configuration
>>>> setting, which should be in your core-site.xml file
>>>>
>>>> You can also copy the content of the input directory to HDFS by  
>>>> executing
>>>>
>>>> $ bin/hadoop fs -mkdir input
>>>> $ bin/hadoop fs -copyFromLocal input/* input
>>>>
>>>> Hope this helps
>>>>
>>>> Alex K
>>>>
>>>> On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com>  
>>>> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>> Alex Kozlov wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :
>>>>>> 0
>>>>>>
>>>>>> You datanode is dead.  Look at the logs in the $HADOOP_HOME/logs
>>>>>> directory
>>>>>> (or where your logs are) and check the errors.
>>>>>>
>>>>>> Alex K
>>>>>>
>>>>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com>  
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> Thanks for your help, Alex,
>>>>>
>>>>> I managed to get past that problem, now I have this problem:
>>>>>
>>>>> However, when I try to run this example as stated on the  
>>>>> quickstart
>>>>> webpage:
>>>>>
>>>>>
>>>>> bin/hadoop jar hadoop-*-examples.jar grep input  output 'dfs[a- 
>>>>> z.]+'
>>>>>
>>>>> I get this error;
>>>>> =============================================================
>>>>> java.io.IOException:       Not a file:
>>>>> hdfs://localhost:9000/user/brian/input/conf
>>>>> =========================================================
>>>>> so it seems to default to my home directory looking for "input" it
>>>>> apparently  needs an absolute filepath, however, when I  run  
>>>>> that way:
>>>>>
>>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>>> /usr/local/hadoop-0.19.2/input
>>>>> output 'dfs[a-z.]+'
>>>>>
>>>>> ==============================================================
>>>>> org.apache.hadoop.mapred.InvalidInputException: Input path does  
>>>>> not
>>>>> exist:
>>>>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>>>>> ==============================================================
>>>>> It still isn't happy although this part -> /usr/local/ 
>>>>> hadoop-0.19.2/input
>>>>> <-  does exist
>>>>>
>>>>> Aaron,
>>>>>
>>>>>
>>>>>
>>>>>> Thanks or your help. I  carefully went through the steps again  
>>>>>> a couple
>>>>>>
>>>>>>> times , and ran
>>>>>>>
>>>>>>> after this
>>>>>>> bin/hadoop namenode -format
>>>>>>>
>>>>>>> (by the way, it asks if I want to reformat, I've tried it both  
>>>>>>> ways)
>>>>>>>
>>>>>>>
>>>>>>> then
>>>>>>>
>>>>>>>
>>>>>>> bin/start-dfs.sh
>>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> bin/start-all.sh
>>>>>>>
>>>>>>>
>>>>>>> and then
>>>>>>> bin/hadoop fs -put conf input
>>>>>>>
>>>>>>> now the return for this seemed cryptic:
>>>>>>>
>>>>>>>
>>>>>>> put: Target input/conf is a directory
>>>>>>>
>>>>>>> (??)
>>>>>>>
>>>>>>> and when I tried
>>>>>>>
>>>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a- 
>>>>>>> z.]+'
>>>>>>>
>>>>>>> It says something about 0 nodes
>>>>>>>
>>>>>>> (from log file)
>>>>>>>
>>>>>>> 2010-02-01 13:26:29,874 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1     
>>>>>>> cmd=create
>>>>>>>
>>>>>>>
>>>>>>> src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/ 
>>>>>>> job_201002011323_0001/job.jar
>>>>>>> dst=null    perm=brian:supergroup:rw-r--r--
>>>>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC  
>>>>>>> Server
>>>>>>> handler 3 on 9000, call
>>>>>>>
>>>>>>>
>>>>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/ 
>>>>>>> job_201002011323_0001/job.jar,
>>>>>>> DFSClient_725490811) from 127.0.0.1:3003: error:  
>>>>>>> java.io.IOException:
>>>>>>> File
>>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/ 
>>>>>>> job.jar
>>>>>>> could
>>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>>> java.io.IOException: File
>>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/ 
>>>>>>> job.jar
>>>>>>> could
>>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>>> at
>>>>>>>
>>>>>>>
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock( 
>>>>>>> FSNamesystem.java:1287)
>>>>>>> at
>>>>>>>
>>>>>>>
>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock 
>>>>>>> (NameNode.java:351)
>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>> at
>>>>>>>
>>>>>>>
>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke 
>>>>>>> (NativeMethodAccessorImpl.java:39)
>>>>>>> at
>>>>>>>
>>>>>>>
>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke 
>>>>>>> (DelegatingMethodAccessorImpl.java:25)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> To maybe rule out something regarding ports or ssh , when I run
>>>>>>> netstat:
>>>>>>>
>>>>>>> TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
>>>>>>> TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
>>>>>>>
>>>>>>>
>>>>>>> and when I browse to http://localhost:50070/
>>>>>>>
>>>>>>>
>>>>>>>  Cluster Summary
>>>>>>>
>>>>>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size  
>>>>>>> is 8.01
>>>>>>> MB
>>>>>>> /
>>>>>>> 992.31 MB (0%)
>>>>>>> *
>>>>>>> Configured Capacity     :       0 KB
>>>>>>> DFS Used        :       0 KB
>>>>>>> Non DFS Used    :       0 KB
>>>>>>> DFS Remaining   :       0 KB
>>>>>>> DFS Used%       :       100 %
>>>>>>> DFS Remaining%  :       0 %
>>>>>>> Live Nodes <http://localhost:50070/ 
>>>>>>> dfshealth.jsp#LiveNodes>     :
>>>>>>> 0
>>>>>>> Dead Nodes <http://localhost:50070/ 
>>>>>>> dfshealth.jsp#DeadNodes>     :
>>>>>>> 0
>>>>>>>
>>>>>>>
>>>>>>> so I'm a bit still in the dark, I guess.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Brian
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Aaron Kimball wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Brian, it looks like you missed a step in the instructions.  
>>>>>>>> You'll
>>>>>>>> need
>>>>>>>> to
>>>>>>>> format the hdfs filesystem instance before starting the  
>>>>>>>> NameNode
>>>>>>>> server:
>>>>>>>>
>>>>>>>> You need to run:
>>>>>>>>
>>>>>>>> $ bin/hadoop namenode -format
>>>>>>>>
>>>>>>>> .. then you can do bin/start-dfs.sh
>>>>>>>> Hope this helps,
>>>>>>>> - Aaron
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per  
>>>>>>>>> directions on
>>>>>>>>> the
>>>>>>>>> hadoop "quickstart" web page.
>>>>>>>>>
>>>>>>>>> I know sshd is running and I can "ssh localhost" without a  
>>>>>>>>> password.
>>>>>>>>>
>>>>>>>>> This is from my hadoop-site.xml
>>>>>>>>>
>>>>>>>>> <configuration>
>>>>>>>>> <property>
>>>>>>>>> <name>hadoop.tmp.dir</name>
>>>>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>>>>> </property>
>>>>>>>>> <property>
>>>>>>>>> <name>fs.default.name</name>
>>>>>>>>> <value>hdfs://localhost:9000</value>
>>>>>>>>> </property>
>>>>>>>>> <property>
>>>>>>>>> <name>mapred.job.tracker</name>
>>>>>>>>> <value>localhost:9001</value>
>>>>>>>>> </property>
>>>>>>>>> <property>
>>>>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>>>>> <value>-1</value>
>>>>>>>>> </property>
>>>>>>>>> <property>
>>>>>>>>> <name>dfs.replication</name>
>>>>>>>>> <value>1</value>
>>>>>>>>> </property>
>>>>>>>>> <property>
>>>>>>>>> <name>dfs.permissions</name>
>>>>>>>>> <value>false</value>
>>>>>>>>> </property>
>>>>>>>>> <property>
>>>>>>>>> <name>webinterface.private.actions</name>
>>>>>>>>> <value>true</value>
>>>>>>>>> </property>
>>>>>>>>> </configuration>
>>>>>>>>>
>>>>>>>>> These are errors from my log files:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2010-01-30 00:03:33,091 INFO
>>>>>>>>> org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up  
>>>>>>>>> at:
>>>>>>>>> localhost/
>>>>>>>>> 127.0.0.1:9000
>>>>>>>>> 2010-01-30 00:03:33,161 INFO
>>>>>>>>> org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>>>>> Initializing JVM Metrics with processName=NameNode,  
>>>>>>>>> sessionId=null
>>>>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>>>>> Initializing
>>>>>>>>> NameNodeMeterics using context
>>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>> supergroup=supergroup
>>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>>> isPermissionEnabled=false
>>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>>>>>> Initializing FSNamesystemMetrics using context
>>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:  
>>>>>>>>> Registered
>>>>>>>>> FSNamesystemStatusMBean
>>>>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does  
>>>>>>>>> not exist.
>>>>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:  
>>>>>>>>> FSNamesystem
>>>>>>>>> initialization failed.
>>>>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an  
>>>>>>>>> inconsistent
>>>>>>>>> state:
>>>>>>>>> storage directory does not exist or is not accessible.
>>>>>>>>> at
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead( 
>>>>>>>>> FSImage.java:278)
>>>>>>>>> at
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage( 
>>>>>>>>> FSDirectory.java:87)
>>>>>>>>> at
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize( 
>>>>>>>>> FSNamesystem.java:309)
>>>>>>>>> at
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init> 
>>>>>>>>> (FSNamesystem.java:288)
>>>>>>>>> at
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize 
>>>>>>>>> (NameNode.java:163)
>>>>>>>>> at
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init> 
>>>>>>>>> (NameNode.java:208)
>>>>>>>>> at
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init> 
>>>>>>>>> (NameNode.java:194)
>>>>>>>>> at
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode( 
>>>>>>>>> NameNode.java:859)
>>>>>>>>> at
>>>>>>>>>
>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main 
>>>>>>>>> (NameNode.java:868)
>>>>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server:  
>>>>>>>>> Stopping
>>>>>>>>> server
>>>>>>>>> on 9000
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> =========================================================
>>>>>>>>>
>>>>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client:  
>>>>>>>>> Retrying
>>>>>>>>> connect
>>>>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>>>>> problem cleaning system directory: null
>>>>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000  
>>>>>>>>> failed
>>>>>>>>> on
>>>>>>>>> connection exception: java.net.ConnectException: Connection  
>>>>>>>>> refused:
>>>>>>>>> no
>>>>>>>>> further information
>>>>>>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>>>>> at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>>>> at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>>>>> at
>>>>>>>>>
>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode 
>>>>>>>>> (DFSClient.java:104)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Brian
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>

Re: hadoop under cygwin issue

Posted by Brian Wolf <br...@gmail.com>.

Hi Alex,

I am back on this problem.  Seems it works, but I have this issue with 
connecting to server.
I can connect 'ssh localhost' ok.

Thanks
Brian

$ bin/hadoop jar hadoop-*-examples.jar pi 2 2
Number of Maps  = 2
Samples per Map = 2
10/03/12 17:16:17 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9000. Already tried 0 time(s).
10/03/12 17:16:19 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9000. Already tried 1 time(s).



Alex Kozlov wrote:
> Can you endeavor a simpler job (just to make sure your setup works):
>
> $ hadoop jar $HADOOP_INSTALL/hadoop-*-examples.jar pi 2 2
>
> Alex K
>
> On Wed, Feb 3, 2010 at 8:26 PM, Brian Wolf <br...@gmail.com> wrote:
>
>   
>> Alex, thanks for the help,  it seems to start now, however
>>
>>
>> $ bin/hadoop jar hadoop-*-examples.jar grep -fs local input output
>> 'dfs[a-z.]+'
>> 10/02/03 20:02:41 WARN fs.FileSystem: "local" is a deprecated filesystem
>> name. Use "file:///" instead.
>> 10/02/03 20:02:43 INFO mapred.FileInputFormat: Total input paths to process
>> : 3
>> 10/02/03 20:02:44 INFO mapred.JobClient: Running job: job_201002031354_0013
>> 10/02/03 20:02:45 INFO mapred.JobClient:  map 0% reduce 0%
>>
>>
>>
>> it hangs here (is pseudo cluster  supposed to work?)
>>
>>
>> these are bottom of various log files
>>
>> conf log file
>>
>>
>> <property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
>>
>> <property><name>mapred.input.dir</name><value>file:/C:/OpenSSH/usr/local/hadoop-0.19.2/input</value></property>
>> <property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030
>> </value></property>
>> <property><name>io.file.buffer.size</name><value>4096</value></property>
>>
>> <property><name>mapred.jobtracker.restart.recover</name><value>false</value></property>
>>
>> <property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
>>
>> <property><name>dfs.datanode.handler.count</name><value>3</value></property>
>>
>> <property><name>mapred.reduce.copy.backoff</name><value>300</value></property>
>> <property><name>mapred.task.profile</name><value>false</value></property>
>>
>> <property><name>dfs.replication.considerLoad</name><value>true</value></property>
>>
>> <property><name>jobclient.output.filter</name><value>FAILED</value></property>
>>
>> <property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
>>
>> <property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
>> <property><name>fs.checkpoint.size</name><value>67108864</value></property>
>>
>>
>> bottom
>> namenode log
>>
>> added to blk_6520091160827873550_1036 size 570
>> 2010-02-03 20:02:43,826 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>  src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>>  dst=null    perm=brian:supergroup:rw-r--r--
>> 2010-02-03 20:02:43,866 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=setPermission
>>    src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>>    dst=null    perm=brian:supergroup:rw-r--r--
>> 2010-02-03 20:02:44,026 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>> NameSystem.allocateBlock:
>> /cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml.
>> blk_517844159758473296_1037
>> 2010-02-03 20:02:44,076 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>> NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to
>> blk_517844159758473296_1037 size 16238
>> 2010-02-03 20:02:44,257 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>>  src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>>  dst=null    perm=null
>> 2010-02-03 20:02:44,527 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>>  src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.jar
>>  dst=null    perm=null
>> 2010-02-03 20:02:45,258 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>>  src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.split
>>    dst=null    perm=null
>>
>>
>> bottom
>> datanode log
>>
>> 2010-02-03 20:02:44,046 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
>> blk_517844159758473296_1037 src: /127.0.0.1:4069 dest: /127.0.0.1:50010
>> 2010-02-03 20:02:44,076 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>> 127.0.0.1:4069, dest: /127.0.0.1:50010, bytes: 16238, op: HDFS_WRITE,
>> cliID: DFSClient_-1424524646, srvID:
>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>> blk_517844159758473296_1037
>> 2010-02-03 20:02:44,086 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block
>> blk_517844159758473296_1037 terminating
>> 2010-02-03 20:02:44,457 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>> 127.0.0.1:50010, dest: /127.0.0.1:4075, bytes: 16366, op: HDFS_READ,
>> cliID: DFSClient_-548531246, srvID:
>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>> blk_517844159758473296_1037
>> 2010-02-03 20:02:44,677 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>> 127.0.0.1:50010, dest: /127.0.0.1:4076, bytes: 135168, op: HDFS_READ,
>> cliID: DFSClient_-548531246, srvID:
>> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
>> blk_-2806977820057440405_1035
>> 2010-02-03 20:02:45,278 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
>> 127.0.0.1:50010, dest: /127.0.0.1:4077, bytes: 578, op: HDFS_READ, cliID:
>> DFSClient_-548531246, srvID: DS-1812377383-192.168.1.5-50010-1265088397104,
>> blockid: blk_6520091160827873550_1036
>> 2010-02-03 20:04:10,451 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>> succeeded for blk_3301977249866081256_1031
>> 2010-02-03 20:09:35,658 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>> succeeded for blk_9116729021606317943_1025
>> 2010-02-03 20:09:44,671 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
>> succeeded for blk_8602436668984954947_1026
>>
>>
>>
>>
>>
>> jobtracker log
>>
>> Input size for job job_201002031354_0012 = 53060
>> 2010-02-03 19:48:37,599 INFO org.apache.hadoop.mapred.JobInProgress: Split
>> info for job:job_201002031354_0012
>> 2010-02-03 19:48:37,649 INFO org.apache.hadoop.net.NetworkTopology: Adding
>> a new node: /default-rack/localhost
>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0012_m_000000 has split on
>> node:/default-rack/localhost
>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0012_m_000001 has split on
>> node:/default-rack/localhost
>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0012_m_000002 has split on
>> node:/default-rack/localhost
>> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0012_m_000003 has split on
>> node:/default-rack/localhost
>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: Input
>> size for job job_201002031354_0013 = 53060
>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: Split
>> info for job:job_201002031354_0013
>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0013_m_000000 has split on
>> node:/default-rack/localhost
>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0013_m_000001 has split on
>> node:/default-rack/localhost
>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0013_m_000002 has split on
>> node:/default-rack/localhost
>> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
>> tip:task_201002031354_0013_m_000003 has split on
>> node:/default-rack/localhost
>>
>>
>>
>> Thanks
>> Brian
>>
>>
>>
>>
>>
>> Alex Kozlov wrote:
>>
>>     
>>> Try
>>>
>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>> file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
>>>
>>> file:/// is a magical prefix to force hadoop to look for the file in the
>>> local FS
>>>
>>> You can also force it to look into local FS by giving '-fs local' or '-fs
>>> file:///' option to the hadoop executable
>>>
>>> These options basically overwrite the *fs.default.name* configuration
>>> setting, which should be in your core-site.xml file
>>>
>>> You can also copy the content of the input directory to HDFS by executing
>>>
>>> $ bin/hadoop fs -mkdir input
>>> $ bin/hadoop fs -copyFromLocal input/* input
>>>
>>> Hope this helps
>>>
>>> Alex K
>>>
>>> On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com> wrote:
>>>
>>>
>>>
>>>       
>>>> Alex Kozlov wrote:
>>>>
>>>>
>>>>
>>>>         
>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :
>>>>> 0
>>>>>
>>>>> You datanode is dead.  Look at the logs in the $HADOOP_HOME/logs
>>>>> directory
>>>>> (or where your logs are) and check the errors.
>>>>>
>>>>> Alex K
>>>>>
>>>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>           
>>>> Thanks for your help, Alex,
>>>>
>>>> I managed to get past that problem, now I have this problem:
>>>>
>>>> However, when I try to run this example as stated on the quickstart
>>>> webpage:
>>>>
>>>>
>>>> bin/hadoop jar hadoop-*-examples.jar grep input  output 'dfs[a-z.]+'
>>>>
>>>> I get this error;
>>>> =============================================================
>>>> java.io.IOException:       Not a file:
>>>> hdfs://localhost:9000/user/brian/input/conf
>>>> =========================================================
>>>> so it seems to default to my home directory looking for "input" it
>>>> apparently  needs an absolute filepath, however, when I  run that way:
>>>>
>>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>>> /usr/local/hadoop-0.19.2/input
>>>>  output 'dfs[a-z.]+'
>>>>
>>>> ==============================================================
>>>> org.apache.hadoop.mapred.InvalidInputException: Input path does not
>>>> exist:
>>>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>>>> ==============================================================
>>>> It still isn't happy although this part -> /usr/local/hadoop-0.19.2/input
>>>>  <-  does exist
>>>>
>>>>  Aaron,
>>>>
>>>>
>>>>         
>>>>> Thanks or your help. I  carefully went through the steps again a couple
>>>>>           
>>>>>> times , and ran
>>>>>>
>>>>>> after this
>>>>>> bin/hadoop namenode -format
>>>>>>
>>>>>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>>>>>
>>>>>>
>>>>>> then
>>>>>>
>>>>>>
>>>>>> bin/start-dfs.sh
>>>>>>
>>>>>> and
>>>>>>
>>>>>> bin/start-all.sh
>>>>>>
>>>>>>
>>>>>> and then
>>>>>> bin/hadoop fs -put conf input
>>>>>>
>>>>>> now the return for this seemed cryptic:
>>>>>>
>>>>>>
>>>>>> put: Target input/conf is a directory
>>>>>>
>>>>>> (??)
>>>>>>
>>>>>>  and when I tried
>>>>>>
>>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>>>>
>>>>>> It says something about 0 nodes
>>>>>>
>>>>>> (from log file)
>>>>>>
>>>>>> 2010-02-01 13:26:29,874 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>>>>>
>>>>>>
>>>>>>  src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>>  dst=null    perm=brian:supergroup:rw-r--r--
>>>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>>>> handler 3 on 9000, call
>>>>>>
>>>>>>
>>>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>>>>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException:
>>>>>> File
>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>> could
>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>> java.io.IOException: File
>>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>> could
>>>>>> only be replicated to 0 nodes, instead of 1
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>>>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> To maybe rule out something regarding ports or ssh , when I run
>>>>>> netstat:
>>>>>>
>>>>>>  TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
>>>>>>  TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
>>>>>>
>>>>>>
>>>>>> and when I browse to http://localhost:50070/
>>>>>>
>>>>>>
>>>>>>   Cluster Summary
>>>>>>
>>>>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01
>>>>>> MB
>>>>>> /
>>>>>> 992.31 MB (0%)
>>>>>> *
>>>>>> Configured Capacity     :       0 KB
>>>>>> DFS Used        :       0 KB
>>>>>> Non DFS Used    :       0 KB
>>>>>> DFS Remaining   :       0 KB
>>>>>> DFS Used%       :       100 %
>>>>>> DFS Remaining%  :       0 %
>>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :
>>>>>> 0
>>>>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes>     :
>>>>>> 0
>>>>>>
>>>>>>
>>>>>> so I'm a bit still in the dark, I guess.
>>>>>>
>>>>>> Thanks
>>>>>> Brian
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Aaron Kimball wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>             
>>>>>>> Brian, it looks like you missed a step in the instructions. You'll
>>>>>>> need
>>>>>>> to
>>>>>>> format the hdfs filesystem instance before starting the NameNode
>>>>>>> server:
>>>>>>>
>>>>>>> You need to run:
>>>>>>>
>>>>>>> $ bin/hadoop namenode -format
>>>>>>>
>>>>>>> .. then you can do bin/start-dfs.sh
>>>>>>> Hope this helps,
>>>>>>> - Aaron
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>               
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on
>>>>>>>> the
>>>>>>>> hadoop "quickstart" web page.
>>>>>>>>
>>>>>>>> I know sshd is running and I can "ssh localhost" without a password.
>>>>>>>>
>>>>>>>> This is from my hadoop-site.xml
>>>>>>>>
>>>>>>>> <configuration>
>>>>>>>> <property>
>>>>>>>> <name>hadoop.tmp.dir</name>
>>>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>> <name>fs.default.name</name>
>>>>>>>> <value>hdfs://localhost:9000</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>> <name>mapred.job.tracker</name>
>>>>>>>> <value>localhost:9001</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>>>> <value>-1</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>> <name>dfs.replication</name>
>>>>>>>> <value>1</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>> <name>dfs.permissions</name>
>>>>>>>> <value>false</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>> <name>webinterface.private.actions</name>
>>>>>>>> <value>true</value>
>>>>>>>> </property>
>>>>>>>> </configuration>
>>>>>>>>
>>>>>>>> These are errors from my log files:
>>>>>>>>
>>>>>>>>
>>>>>>>> 2010-01-30 00:03:33,091 INFO
>>>>>>>> org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>>>>> localhost/
>>>>>>>> 127.0.0.1:9000
>>>>>>>> 2010-01-30 00:03:33,161 INFO
>>>>>>>> org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>>>> Initializing
>>>>>>>> NameNodeMeterics using context
>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>> supergroup=supergroup
>>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>>> isPermissionEnabled=false
>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>>>>> Initializing FSNamesystemMetrics using context
>>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>>>>> FSNamesystemStatusMBean
>>>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>>>>>>> initialization failed.
>>>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
>>>>>>>> state:
>>>>>>>> storage directory does not exist or is not accessible.
>>>>>>>>  at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>>>>  at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>>>>  at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>>>>  at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>>>>  at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>>>>  at
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>>>>  at
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>>>>  at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>>>>  at
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
>>>>>>>> server
>>>>>>>> on 9000
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> =========================================================
>>>>>>>>
>>>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
>>>>>>>> connect
>>>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>>>> problem cleaning system directory: null
>>>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed
>>>>>>>> on
>>>>>>>> connection exception: java.net.ConnectException: Connection refused:
>>>>>>>> no
>>>>>>>> further information
>>>>>>>>  at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>>>>  at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>>>>  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>>>  at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>>>  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>>>>  at
>>>>>>>>
>>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Brian
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                 
>>>>>>>
>>>>>>>               
>>>>>>             
>>>>>
>>>>>           
>>>>         
>>>
>>>       
>>     
>
>

Re: hadoop under cygwin issue

Posted by Alex Kozlov <al...@cloudera.com>.

Can you endeavor a simpler job (just to make sure your setup works):

$ hadoop jar $HADOOP_INSTALL/hadoop-*-examples.jar pi 2 2

Alex K

On Wed, Feb 3, 2010 at 8:26 PM, Brian Wolf <br...@gmail.com> wrote:

> Alex, thanks for the help,  it seems to start now, however
>
>
> $ bin/hadoop jar hadoop-*-examples.jar grep -fs local input output
> 'dfs[a-z.]+'
> 10/02/03 20:02:41 WARN fs.FileSystem: "local" is a deprecated filesystem
> name. Use "file:///" instead.
> 10/02/03 20:02:43 INFO mapred.FileInputFormat: Total input paths to process
> : 3
> 10/02/03 20:02:44 INFO mapred.JobClient: Running job: job_201002031354_0013
> 10/02/03 20:02:45 INFO mapred.JobClient:  map 0% reduce 0%
>
>
>
> it hangs here (is pseudo cluster  supposed to work?)
>
>
> these are bottom of various log files
>
> conf log file
>
>
> <property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
>
> <property><name>mapred.input.dir</name><value>file:/C:/OpenSSH/usr/local/hadoop-0.19.2/input</value></property>
> <property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030
> </value></property>
> <property><name>io.file.buffer.size</name><value>4096</value></property>
>
> <property><name>mapred.jobtracker.restart.recover</name><value>false</value></property>
>
> <property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
>
> <property><name>dfs.datanode.handler.count</name><value>3</value></property>
>
> <property><name>mapred.reduce.copy.backoff</name><value>300</value></property>
> <property><name>mapred.task.profile</name><value>false</value></property>
>
> <property><name>dfs.replication.considerLoad</name><value>true</value></property>
>
> <property><name>jobclient.output.filter</name><value>FAILED</value></property>
>
> <property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
>
> <property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
> <property><name>fs.checkpoint.size</name><value>67108864</value></property>
>
>
> bottom
> namenode log
>
> added to blk_6520091160827873550_1036 size 570
> 2010-02-03 20:02:43,826 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>  src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>  dst=null    perm=brian:supergroup:rw-r--r--
> 2010-02-03 20:02:43,866 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=setPermission
>    src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>    dst=null    perm=brian:supergroup:rw-r--r--
> 2010-02-03 20:02:44,026 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.allocateBlock:
> /cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml.
> blk_517844159758473296_1037
> 2010-02-03 20:02:44,076 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to
> blk_517844159758473296_1037 size 16238
> 2010-02-03 20:02:44,257 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>  src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml
>  dst=null    perm=null
> 2010-02-03 20:02:44,527 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>  src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.jar
>  dst=null    perm=null
> 2010-02-03 20:02:45,258 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open
>  src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.split
>    dst=null    perm=null
>
>
> bottom
> datanode log
>
> 2010-02-03 20:02:44,046 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
> blk_517844159758473296_1037 src: /127.0.0.1:4069 dest: /127.0.0.1:50010
> 2010-02-03 20:02:44,076 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 127.0.0.1:4069, dest: /127.0.0.1:50010, bytes: 16238, op: HDFS_WRITE,
> cliID: DFSClient_-1424524646, srvID:
> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
> blk_517844159758473296_1037
> 2010-02-03 20:02:44,086 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block
> blk_517844159758473296_1037 terminating
> 2010-02-03 20:02:44,457 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 127.0.0.1:50010, dest: /127.0.0.1:4075, bytes: 16366, op: HDFS_READ,
> cliID: DFSClient_-548531246, srvID:
> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
> blk_517844159758473296_1037
> 2010-02-03 20:02:44,677 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 127.0.0.1:50010, dest: /127.0.0.1:4076, bytes: 135168, op: HDFS_READ,
> cliID: DFSClient_-548531246, srvID:
> DS-1812377383-192.168.1.5-50010-1265088397104, blockid:
> blk_-2806977820057440405_1035
> 2010-02-03 20:02:45,278 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /
> 127.0.0.1:50010, dest: /127.0.0.1:4077, bytes: 578, op: HDFS_READ, cliID:
> DFSClient_-548531246, srvID: DS-1812377383-192.168.1.5-50010-1265088397104,
> blockid: blk_6520091160827873550_1036
> 2010-02-03 20:04:10,451 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_3301977249866081256_1031
> 2010-02-03 20:09:35,658 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_9116729021606317943_1025
> 2010-02-03 20:09:44,671 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_8602436668984954947_1026
>
>
>
>
>
> jobtracker log
>
> Input size for job job_201002031354_0012 = 53060
> 2010-02-03 19:48:37,599 INFO org.apache.hadoop.mapred.JobInProgress: Split
> info for job:job_201002031354_0012
> 2010-02-03 19:48:37,649 INFO org.apache.hadoop.net.NetworkTopology: Adding
> a new node: /default-rack/localhost
> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0012_m_000000 has split on
> node:/default-rack/localhost
> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0012_m_000001 has split on
> node:/default-rack/localhost
> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0012_m_000002 has split on
> node:/default-rack/localhost
> 2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0012_m_000003 has split on
> node:/default-rack/localhost
> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: Input
> size for job job_201002031354_0013 = 53060
> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: Split
> info for job:job_201002031354_0013
> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0013_m_000000 has split on
> node:/default-rack/localhost
> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0013_m_000001 has split on
> node:/default-rack/localhost
> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0013_m_000002 has split on
> node:/default-rack/localhost
> 2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress:
> tip:task_201002031354_0013_m_000003 has split on
> node:/default-rack/localhost
>
>
>
> Thanks
> Brian
>
>
>
>
>
> Alex Kozlov wrote:
>
>> Try
>>
>> $ bin/hadoop jar hadoop-*-examples.jar grep
>> file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
>>
>> file:/// is a magical prefix to force hadoop to look for the file in the
>> local FS
>>
>> You can also force it to look into local FS by giving '-fs local' or '-fs
>> file:///' option to the hadoop executable
>>
>> These options basically overwrite the *fs.default.name* configuration
>> setting, which should be in your core-site.xml file
>>
>> You can also copy the content of the input directory to HDFS by executing
>>
>> $ bin/hadoop fs -mkdir input
>> $ bin/hadoop fs -copyFromLocal input/* input
>>
>> Hope this helps
>>
>> Alex K
>>
>> On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com> wrote:
>>
>>
>>
>>> Alex Kozlov wrote:
>>>
>>>
>>>
>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :
>>>> 0
>>>>
>>>> You datanode is dead.  Look at the logs in the $HADOOP_HOME/logs
>>>> directory
>>>> (or where your logs are) and check the errors.
>>>>
>>>> Alex K
>>>>
>>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>> Thanks for your help, Alex,
>>>
>>> I managed to get past that problem, now I have this problem:
>>>
>>> However, when I try to run this example as stated on the quickstart
>>> webpage:
>>>
>>>
>>> bin/hadoop jar hadoop-*-examples.jar grep input  output 'dfs[a-z.]+'
>>>
>>> I get this error;
>>> =============================================================
>>> java.io.IOException:       Not a file:
>>> hdfs://localhost:9000/user/brian/input/conf
>>> =========================================================
>>> so it seems to default to my home directory looking for "input" it
>>> apparently  needs an absolute filepath, however, when I  run that way:
>>>
>>> $ bin/hadoop jar hadoop-*-examples.jar grep
>>> /usr/local/hadoop-0.19.2/input
>>>  output 'dfs[a-z.]+'
>>>
>>> ==============================================================
>>> org.apache.hadoop.mapred.InvalidInputException: Input path does not
>>> exist:
>>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>>> ==============================================================
>>> It still isn't happy although this part -> /usr/local/hadoop-0.19.2/input
>>>  <-  does exist
>>>
>>>  Aaron,
>>>
>>>
>>>> Thanks or your help. I  carefully went through the steps again a couple
>>>>> times , and ran
>>>>>
>>>>> after this
>>>>> bin/hadoop namenode -format
>>>>>
>>>>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>>>>
>>>>>
>>>>> then
>>>>>
>>>>>
>>>>> bin/start-dfs.sh
>>>>>
>>>>> and
>>>>>
>>>>> bin/start-all.sh
>>>>>
>>>>>
>>>>> and then
>>>>> bin/hadoop fs -put conf input
>>>>>
>>>>> now the return for this seemed cryptic:
>>>>>
>>>>>
>>>>> put: Target input/conf is a directory
>>>>>
>>>>> (??)
>>>>>
>>>>>  and when I tried
>>>>>
>>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>>>
>>>>> It says something about 0 nodes
>>>>>
>>>>> (from log file)
>>>>>
>>>>> 2010-02-01 13:26:29,874 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>>>>
>>>>>
>>>>>  src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>>  dst=null    perm=brian:supergroup:rw-r--r--
>>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>>> handler 3 on 9000, call
>>>>>
>>>>>
>>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>>>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException:
>>>>> File
>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>> could
>>>>> only be replicated to 0 nodes, instead of 1
>>>>> java.io.IOException: File
>>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>> could
>>>>> only be replicated to 0 nodes, instead of 1
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>  at
>>>>>
>>>>>
>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>  at
>>>>>
>>>>>
>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> To maybe rule out something regarding ports or ssh , when I run
>>>>> netstat:
>>>>>
>>>>>  TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
>>>>>  TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
>>>>>
>>>>>
>>>>> and when I browse to http://localhost:50070/
>>>>>
>>>>>
>>>>>   Cluster Summary
>>>>>
>>>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01
>>>>> MB
>>>>> /
>>>>> 992.31 MB (0%)
>>>>> *
>>>>> Configured Capacity     :       0 KB
>>>>> DFS Used        :       0 KB
>>>>> Non DFS Used    :       0 KB
>>>>> DFS Remaining   :       0 KB
>>>>> DFS Used%       :       100 %
>>>>> DFS Remaining%  :       0 %
>>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :
>>>>> 0
>>>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes>     :
>>>>> 0
>>>>>
>>>>>
>>>>> so I'm a bit still in the dark, I guess.
>>>>>
>>>>> Thanks
>>>>> Brian
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Aaron Kimball wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Brian, it looks like you missed a step in the instructions. You'll
>>>>>> need
>>>>>> to
>>>>>> format the hdfs filesystem instance before starting the NameNode
>>>>>> server:
>>>>>>
>>>>>> You need to run:
>>>>>>
>>>>>> $ bin/hadoop namenode -format
>>>>>>
>>>>>> .. then you can do bin/start-dfs.sh
>>>>>> Hope this helps,
>>>>>> - Aaron
>>>>>>
>>>>>>
>>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on
>>>>>>> the
>>>>>>> hadoop "quickstart" web page.
>>>>>>>
>>>>>>> I know sshd is running and I can "ssh localhost" without a password.
>>>>>>>
>>>>>>> This is from my hadoop-site.xml
>>>>>>>
>>>>>>> <configuration>
>>>>>>> <property>
>>>>>>> <name>hadoop.tmp.dir</name>
>>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>>> </property>
>>>>>>> <property>
>>>>>>> <name>fs.default.name</name>
>>>>>>> <value>hdfs://localhost:9000</value>
>>>>>>> </property>
>>>>>>> <property>
>>>>>>> <name>mapred.job.tracker</name>
>>>>>>> <value>localhost:9001</value>
>>>>>>> </property>
>>>>>>> <property>
>>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>>> <value>-1</value>
>>>>>>> </property>
>>>>>>> <property>
>>>>>>> <name>dfs.replication</name>
>>>>>>> <value>1</value>
>>>>>>> </property>
>>>>>>> <property>
>>>>>>> <name>dfs.permissions</name>
>>>>>>> <value>false</value>
>>>>>>> </property>
>>>>>>> <property>
>>>>>>> <name>webinterface.private.actions</name>
>>>>>>> <value>true</value>
>>>>>>> </property>
>>>>>>> </configuration>
>>>>>>>
>>>>>>> These are errors from my log files:
>>>>>>>
>>>>>>>
>>>>>>> 2010-01-30 00:03:33,091 INFO
>>>>>>> org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>>>> localhost/
>>>>>>> 127.0.0.1:9000
>>>>>>> 2010-01-30 00:03:33,161 INFO
>>>>>>> org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>>> Initializing
>>>>>>> NameNodeMeterics using context
>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>> supergroup=supergroup
>>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>>> isPermissionEnabled=false
>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>>>> Initializing FSNamesystemMetrics using context
>>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>>>> FSNamesystemStatusMBean
>>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>>>>>> initialization failed.
>>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
>>>>>>> state:
>>>>>>> storage directory does not exist or is not accessible.
>>>>>>>  at
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>>>  at
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>>>  at
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>>>  at
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>>>  at
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>>>  at
>>>>>>>
>>>>>>>
>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>>>  at
>>>>>>>
>>>>>>>
>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>>>  at
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>>>  at
>>>>>>>
>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
>>>>>>> server
>>>>>>> on 9000
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> =========================================================
>>>>>>>
>>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
>>>>>>> connect
>>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>>> problem cleaning system directory: null
>>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed
>>>>>>> on
>>>>>>> connection exception: java.net.ConnectException: Connection refused:
>>>>>>> no
>>>>>>> further information
>>>>>>>  at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>>>  at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>>>  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>>  at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>>  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>>>  at
>>>>>>>
>>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>> Brian
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>

Re: hadoop under cygwin issue

Posted by Brian Wolf <br...@gmail.com>.

Alex, thanks for the help,  it seems to start now, however


$ bin/hadoop jar hadoop-*-examples.jar grep -fs local input output 
'dfs[a-z.]+'
10/02/03 20:02:41 WARN fs.FileSystem: "local" is a deprecated filesystem 
name. Use "file:///" instead.
10/02/03 20:02:43 INFO mapred.FileInputFormat: Total input paths to 
process : 3
10/02/03 20:02:44 INFO mapred.JobClient: Running job: job_201002031354_0013
10/02/03 20:02:45 INFO mapred.JobClient:  map 0% reduce 0%



 it hangs here (is pseudo cluster  supposed to work?)


these are bottom of various log files

conf log file

<property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
<property><name>mapred.input.dir</name><value>file:/C:/OpenSSH/usr/local/hadoop-0.19.2/input</value></property>
<property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030</value></property>
<property><name>io.file.buffer.size</name><value>4096</value></property>
<property><name>mapred.jobtracker.restart.recover</name><value>false</value></property>
<property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
<property><name>dfs.datanode.handler.count</name><value>3</value></property>
<property><name>mapred.reduce.copy.backoff</name><value>300</value></property>
<property><name>mapred.task.profile</name><value>false</value></property>
<property><name>dfs.replication.considerLoad</name><value>true</value></property>
<property><name>jobclient.output.filter</name><value>FAILED</value></property>
<property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
<property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
<property><name>fs.checkpoint.size</name><value>67108864</value></property>


bottom
namenode log

added to blk_6520091160827873550_1036 size 570
2010-02-03 20:02:43,826 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: 
ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create    
src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml    
dst=null    perm=brian:supergroup:rw-r--r--
2010-02-03 20:02:43,866 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: 
ugi=brian,None,Administrators,Users    ip=/127.0.0.1    
cmd=setPermission    
src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml    
dst=null    perm=brian:supergroup:rw-r--r--
2010-02-03 20:02:44,026 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.allocateBlock: 
/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml. 
blk_517844159758473296_1037
2010-02-03 20:02:44,076 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to 
blk_517844159758473296_1037 size 16238
2010-02-03 20:02:44,257 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: 
ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open    
src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.xml    
dst=null    perm=null
2010-02-03 20:02:44,527 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: 
ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open    
src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.jar    
dst=null    perm=null
2010-02-03 20:02:45,258 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: 
ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=open    
src=/cygwin/tmp/hadoop-brian/mapred/system/job_201002031354_0013/job.split    
dst=null    perm=null


bottom
datanode log

2010-02-03 20:02:44,046 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block 
blk_517844159758473296_1037 src: /127.0.0.1:4069 dest: /127.0.0.1:50010
2010-02-03 20:02:44,076 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/127.0.0.1:4069, dest: /127.0.0.1:50010, bytes: 16238, op: HDFS_WRITE, 
cliID: DFSClient_-1424524646, srvID: 
DS-1812377383-192.168.1.5-50010-1265088397104, blockid: 
blk_517844159758473296_1037
2010-02-03 20:02:44,086 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for 
block blk_517844159758473296_1037 terminating
2010-02-03 20:02:44,457 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/127.0.0.1:50010, dest: /127.0.0.1:4075, bytes: 16366, op: HDFS_READ, 
cliID: DFSClient_-548531246, srvID: 
DS-1812377383-192.168.1.5-50010-1265088397104, blockid: 
blk_517844159758473296_1037
2010-02-03 20:02:44,677 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/127.0.0.1:50010, dest: /127.0.0.1:4076, bytes: 135168, op: HDFS_READ, 
cliID: DFSClient_-548531246, srvID: 
DS-1812377383-192.168.1.5-50010-1265088397104, blockid: 
blk_-2806977820057440405_1035
2010-02-03 20:02:45,278 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: 
/127.0.0.1:50010, dest: /127.0.0.1:4077, bytes: 578, op: HDFS_READ, 
cliID: DFSClient_-548531246, srvID: 
DS-1812377383-192.168.1.5-50010-1265088397104, blockid: 
blk_6520091160827873550_1036
2010-02-03 20:04:10,451 INFO 
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification 
succeeded for blk_3301977249866081256_1031
2010-02-03 20:09:35,658 INFO 
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification 
succeeded for blk_9116729021606317943_1025
2010-02-03 20:09:44,671 INFO 
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification 
succeeded for blk_8602436668984954947_1026





jobtracker log

Input size for job job_201002031354_0012 = 53060
2010-02-03 19:48:37,599 INFO org.apache.hadoop.mapred.JobInProgress: 
Split info for job:job_201002031354_0012
2010-02-03 19:48:37,649 INFO org.apache.hadoop.net.NetworkTopology: 
Adding a new node: /default-rack/localhost
2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201002031354_0012_m_000000 has split on 
node:/default-rack/localhost
2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201002031354_0012_m_000001 has split on 
node:/default-rack/localhost
2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201002031354_0012_m_000002 has split on 
node:/default-rack/localhost
2010-02-03 19:48:37,659 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201002031354_0012_m_000003 has split on 
node:/default-rack/localhost
2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: 
Input size for job job_201002031354_0013 = 53060
2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: 
Split info for job:job_201002031354_0013
2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201002031354_0013_m_000000 has split on 
node:/default-rack/localhost
2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201002031354_0013_m_000001 has split on 
node:/default-rack/localhost
2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201002031354_0013_m_000002 has split on 
node:/default-rack/localhost
2010-02-03 20:02:45,278 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201002031354_0013_m_000003 has split on 
node:/default-rack/localhost



Thanks
Brian




Alex Kozlov wrote:
> Try
>
> $ bin/hadoop jar hadoop-*-examples.jar grep
> file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'
>
> file:/// is a magical prefix to force hadoop to look for the file in the
> local FS
>
> You can also force it to look into local FS by giving '-fs local' or '-fs
> file:///' option to the hadoop executable
>
> These options basically overwrite the *fs.default.name* configuration
> setting, which should be in your core-site.xml file
>
> You can also copy the content of the input directory to HDFS by executing
>
> $ bin/hadoop fs -mkdir input
> $ bin/hadoop fs -copyFromLocal input/* input
>
> Hope this helps
>
> Alex K
>
> On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com> wrote:
>
>   
>> Alex Kozlov wrote:
>>
>>     
>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :       0
>>>
>>> You datanode is dead.  Look at the logs in the $HADOOP_HOME/logs directory
>>> (or where your logs are) and check the errors.
>>>
>>> Alex K
>>>
>>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>>>
>>>
>>>
>>>       
>>
>> Thanks for your help, Alex,
>>
>> I managed to get past that problem, now I have this problem:
>>
>> However, when I try to run this example as stated on the quickstart
>> webpage:
>>
>>
>> bin/hadoop jar hadoop-*-examples.jar grep input  output 'dfs[a-z.]+'
>>
>> I get this error;
>> =============================================================
>> java.io.IOException:       Not a file:
>> hdfs://localhost:9000/user/brian/input/conf
>> =========================================================
>> so it seems to default to my home directory looking for "input" it
>> apparently  needs an absolute filepath, however, when I  run that way:
>>
>> $ bin/hadoop jar hadoop-*-examples.jar grep /usr/local/hadoop-0.19.2/input
>>  output 'dfs[a-z.]+'
>>
>> ==============================================================
>> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
>> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
>> ==============================================================
>> It still isn't happy although this part -> /usr/local/hadoop-0.19.2/input
>>  <-  does exist
>>
>>  Aaron,
>>     
>>>> Thanks or your help. I  carefully went through the steps again a couple
>>>> times , and ran
>>>>
>>>> after this
>>>> bin/hadoop namenode -format
>>>>
>>>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>>>
>>>>
>>>> then
>>>>
>>>>
>>>> bin/start-dfs.sh
>>>>
>>>> and
>>>>
>>>> bin/start-all.sh
>>>>
>>>>
>>>> and then
>>>> bin/hadoop fs -put conf input
>>>>
>>>> now the return for this seemed cryptic:
>>>>
>>>>
>>>> put: Target input/conf is a directory
>>>>
>>>> (??)
>>>>
>>>>  and when I tried
>>>>
>>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>>
>>>> It says something about 0 nodes
>>>>
>>>> (from log file)
>>>>
>>>> 2010-02-01 13:26:29,874 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>>>
>>>>  src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>>  dst=null    perm=brian:supergroup:rw-r--r--
>>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 3 on 9000, call
>>>>
>>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException:
>>>> File
>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>> could
>>>> only be replicated to 0 nodes, instead of 1
>>>> java.io.IOException: File
>>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>> could
>>>> only be replicated to 0 nodes, instead of 1
>>>>  at
>>>>
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>>  at
>>>>
>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>  at
>>>>
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>  at
>>>>
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>
>>>>
>>>>
>>>>
>>>> To maybe rule out something regarding ports or ssh , when I run netstat:
>>>>
>>>>  TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
>>>>  TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
>>>>
>>>>
>>>> and when I browse to http://localhost:50070/
>>>>
>>>>
>>>>    Cluster Summary
>>>>
>>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB
>>>> /
>>>> 992.31 MB (0%)
>>>> *
>>>> Configured Capacity     :       0 KB
>>>> DFS Used        :       0 KB
>>>> Non DFS Used    :       0 KB
>>>> DFS Remaining   :       0 KB
>>>> DFS Used%       :       100 %
>>>> DFS Remaining%  :       0 %
>>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :
>>>> 0
>>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes>     :
>>>> 0
>>>>
>>>>
>>>> so I'm a bit still in the dark, I guess.
>>>>
>>>> Thanks
>>>> Brian
>>>>
>>>>
>>>>
>>>>
>>>> Aaron Kimball wrote:
>>>>
>>>>
>>>>
>>>>         
>>>>> Brian, it looks like you missed a step in the instructions. You'll need
>>>>> to
>>>>> format the hdfs filesystem instance before starting the NameNode server:
>>>>>
>>>>> You need to run:
>>>>>
>>>>> $ bin/hadoop namenode -format
>>>>>
>>>>> .. then you can do bin/start-dfs.sh
>>>>> Hope this helps,
>>>>> - Aaron
>>>>>
>>>>>
>>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>           
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
>>>>>> hadoop "quickstart" web page.
>>>>>>
>>>>>> I know sshd is running and I can "ssh localhost" without a password.
>>>>>>
>>>>>> This is from my hadoop-site.xml
>>>>>>
>>>>>> <configuration>
>>>>>> <property>
>>>>>> <name>hadoop.tmp.dir</name>
>>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>fs.default.name</name>
>>>>>> <value>hdfs://localhost:9000</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>mapred.job.tracker</name>
>>>>>> <value>localhost:9001</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>> <value>-1</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>dfs.replication</name>
>>>>>> <value>1</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>dfs.permissions</name>
>>>>>> <value>false</value>
>>>>>> </property>
>>>>>> <property>
>>>>>> <name>webinterface.private.actions</name>
>>>>>> <value>true</value>
>>>>>> </property>
>>>>>> </configuration>
>>>>>>
>>>>>> These are errors from my log files:
>>>>>>
>>>>>>
>>>>>> 2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>>> 2010-01-30 00:03:33,121 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>>> localhost/
>>>>>> 127.0.0.1:9000
>>>>>> 2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>>>>> 2010-01-30 00:03:33,181 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>>> Initializing
>>>>>> NameNodeMeterics using context
>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>> fsOwner=brian,None,Administrators,Users
>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>> supergroup=supergroup
>>>>>> 2010-01-30 00:03:34,603 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>>> isPermissionEnabled=false
>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>>> Initializing FSNamesystemMetrics using context
>>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>>> 2010-01-30 00:03:34,653 INFO
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>>> FSNamesystemStatusMBean
>>>>>> 2010-01-30 00:03:34,803 INFO
>>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>>>>> initialization failed.
>>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
>>>>>> state:
>>>>>> storage directory does not exist or is not accessible.
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>>  at
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>>  at
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>>  at
>>>>>>
>>>>>>
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>>  at
>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
>>>>>> server
>>>>>> on 9000
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> =========================================================
>>>>>>
>>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
>>>>>> connect
>>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>>> problem cleaning system directory: null
>>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
>>>>>> connection exception: java.net.ConnectException: Connection refused: no
>>>>>> further information
>>>>>>  at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>>  at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>>  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>>  at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>>  at
>>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Brian
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>             
>>>>>
>>>>>           
>>>>         
>>>
>>>       
>>     
>
>

Re: hadoop under cygwin issue

Posted by Alex Kozlov <al...@cloudera.com>.

Try

$ bin/hadoop jar hadoop-*-examples.jar grep
file:///usr/local/hadoop-0.19.2/input output 'dfs[a-z.]+'

file:/// is a magical prefix to force hadoop to look for the file in the
local FS

You can also force it to look into local FS by giving '-fs local' or '-fs
file:///' option to the hadoop executable

These options basically overwrite the *fs.default.name* configuration
setting, which should be in your core-site.xml file

You can also copy the content of the input directory to HDFS by executing

$ bin/hadoop fs -mkdir input
$ bin/hadoop fs -copyFromLocal input/* input

Hope this helps

Alex K

On Wed, Feb 3, 2010 at 2:17 PM, Brian Wolf <br...@gmail.com> wrote:

> Alex Kozlov wrote:
>
>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :       0
>>
>> You datanode is dead.  Look at the logs in the $HADOOP_HOME/logs directory
>> (or where your logs are) and check the errors.
>>
>> Alex K
>>
>> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>>
>>
>>
>
>
>
> Thanks for your help, Alex,
>
> I managed to get past that problem, now I have this problem:
>
> However, when I try to run this example as stated on the quickstart
> webpage:
>
>
> bin/hadoop jar hadoop-*-examples.jar grep input  output 'dfs[a-z.]+'
>
> I get this error;
> =============================================================
> java.io.IOException:       Not a file:
> hdfs://localhost:9000/user/brian/input/conf
> =========================================================
> so it seems to default to my home directory looking for "input" it
> apparently  needs an absolute filepath, however, when I  run that way:
>
> $ bin/hadoop jar hadoop-*-examples.jar grep /usr/local/hadoop-0.19.2/input
>  output 'dfs[a-z.]+'
>
> ==============================================================
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
> hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
> ==============================================================
> It still isn't happy although this part -> /usr/local/hadoop-0.19.2/input
>  <-  does exist
>
>  Aaron,
>>>
>>> Thanks or your help. I  carefully went through the steps again a couple
>>> times , and ran
>>>
>>> after this
>>> bin/hadoop namenode -format
>>>
>>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>>
>>>
>>> then
>>>
>>>
>>> bin/start-dfs.sh
>>>
>>> and
>>>
>>> bin/start-all.sh
>>>
>>>
>>> and then
>>> bin/hadoop fs -put conf input
>>>
>>> now the return for this seemed cryptic:
>>>
>>>
>>> put: Target input/conf is a directory
>>>
>>> (??)
>>>
>>>  and when I tried
>>>
>>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>>
>>> It says something about 0 nodes
>>>
>>> (from log file)
>>>
>>> 2010-02-01 13:26:29,874 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>>
>>>  src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>>  dst=null    perm=brian:supergroup:rw-r--r--
>>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 3 on 9000, call
>>>
>>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException:
>>> File
>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>> could
>>> only be replicated to 0 nodes, instead of 1
>>> java.io.IOException: File
>>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>> could
>>> only be replicated to 0 nodes, instead of 1
>>>  at
>>>
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>>  at
>>>
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>  at
>>>
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>  at
>>>
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>
>>>
>>>
>>>
>>> To maybe rule out something regarding ports or ssh , when I run netstat:
>>>
>>>  TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
>>>  TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
>>>
>>>
>>> and when I browse to http://localhost:50070/
>>>
>>>
>>>    Cluster Summary
>>>
>>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB
>>> /
>>> 992.31 MB (0%)
>>> *
>>> Configured Capacity     :       0 KB
>>> DFS Used        :       0 KB
>>> Non DFS Used    :       0 KB
>>> DFS Remaining   :       0 KB
>>> DFS Used%       :       100 %
>>> DFS Remaining%  :       0 %
>>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :
>>> 0
>>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes>     :
>>> 0
>>>
>>>
>>> so I'm a bit still in the dark, I guess.
>>>
>>> Thanks
>>> Brian
>>>
>>>
>>>
>>>
>>> Aaron Kimball wrote:
>>>
>>>
>>>
>>>> Brian, it looks like you missed a step in the instructions. You'll need
>>>> to
>>>> format the hdfs filesystem instance before starting the NameNode server:
>>>>
>>>> You need to run:
>>>>
>>>> $ bin/hadoop namenode -format
>>>>
>>>> .. then you can do bin/start-dfs.sh
>>>> Hope this helps,
>>>> - Aaron
>>>>
>>>>
>>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
>>>>> hadoop "quickstart" web page.
>>>>>
>>>>> I know sshd is running and I can "ssh localhost" without a password.
>>>>>
>>>>> This is from my hadoop-site.xml
>>>>>
>>>>> <configuration>
>>>>> <property>
>>>>> <name>hadoop.tmp.dir</name>
>>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>fs.default.name</name>
>>>>> <value>hdfs://localhost:9000</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>mapred.job.tracker</name>
>>>>> <value>localhost:9001</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>> <value>-1</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>dfs.replication</name>
>>>>> <value>1</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>dfs.permissions</name>
>>>>> <value>false</value>
>>>>> </property>
>>>>> <property>
>>>>> <name>webinterface.private.actions</name>
>>>>> <value>true</value>
>>>>> </property>
>>>>> </configuration>
>>>>>
>>>>> These are errors from my log files:
>>>>>
>>>>>
>>>>> 2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>>> 2010-01-30 00:03:33,121 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>>> localhost/
>>>>> 127.0.0.1:9000
>>>>> 2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>>>> 2010-01-30 00:03:33,181 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>>> Initializing
>>>>> NameNodeMeterics using context
>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>> 2010-01-30 00:03:34,603 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>> fsOwner=brian,None,Administrators,Users
>>>>> 2010-01-30 00:03:34,603 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>> supergroup=supergroup
>>>>> 2010-01-30 00:03:34,603 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>>> isPermissionEnabled=false
>>>>> 2010-01-30 00:03:34,653 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>>> Initializing FSNamesystemMetrics using context
>>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>>> 2010-01-30 00:03:34,653 INFO
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>>> FSNamesystemStatusMBean
>>>>> 2010-01-30 00:03:34,803 INFO
>>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>>>> 2010-01-30 00:03:34,813 ERROR
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>>>> initialization failed.
>>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
>>>>> state:
>>>>> storage directory does not exist or is not accessible.
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>>  at
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>>  at
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>>  at
>>>>>
>>>>>
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>>  at
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
>>>>> server
>>>>> on 9000
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> =========================================================
>>>>>
>>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
>>>>> connect
>>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>>> problem cleaning system directory: null
>>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
>>>>> connection exception: java.net.ConnectException: Connection refused: no
>>>>> further information
>>>>>  at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>>  at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>>  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>>  at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>>  at
>>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>> Brian
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>

Re: hadoop under cygwin issue

Posted by Brian Wolf <br...@gmail.com>.

Alex Kozlov wrote:
> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :       0
>
> You datanode is dead.  Look at the logs in the $HADOOP_HOME/logs directory
> (or where your logs are) and check the errors.
>
> Alex K
>
> On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:
>
>   



Thanks for your help, Alex,

I managed to get past that problem, now I have this problem:

However, when I try to run this example as stated on the quickstart webpage:

bin/hadoop jar hadoop-*-examples.jar grep input  output 'dfs[a-z.]+'

I get this error;
=============================================================
java.io.IOException:       Not a file: 
hdfs://localhost:9000/user/brian/input/conf
=========================================================
so it seems to default to my home directory looking for "input" it 
apparently  needs an absolute filepath, however, when I  run that way:

$ bin/hadoop jar hadoop-*-examples.jar grep 
/usr/local/hadoop-0.19.2/input  output 'dfs[a-z.]+'

==============================================================
org.apache.hadoop.mapred.InvalidInputException: Input path does not 
exist: hdfs://localhost:9000/usr/local/hadoop-0.19.2/input
==============================================================
It still isn't happy although this part -> 
/usr/local/hadoop-0.19.2/input    <-  does exist
>> Aaron,
>>
>> Thanks or your help. I  carefully went through the steps again a couple
>> times , and ran
>>
>> after this
>> bin/hadoop namenode -format
>>
>> (by the way, it asks if I want to reformat, I've tried it both ways)
>>
>>
>> then
>>
>>
>> bin/start-dfs.sh
>>
>> and
>>
>> bin/start-all.sh
>>
>>
>> and then
>> bin/hadoop fs -put conf input
>>
>> now the return for this seemed cryptic:
>>
>>
>> put: Target input/conf is a directory
>>
>> (??)
>>
>>  and when I tried
>>
>> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>>
>> It says something about 0 nodes
>>
>> (from log file)
>>
>> 2010-02-01 13:26:29,874 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
>> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>>  src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>>  dst=null    perm=brian:supergroup:rw-r--r--
>> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 3 on 9000, call
>> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
>> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException: File
>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar could
>> only be replicated to 0 nodes, instead of 1
>> java.io.IOException: File
>> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar could
>> only be replicated to 0 nodes, instead of 1
>>   at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>>   at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>   at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>   at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>
>>
>>
>>
>> To maybe rule out something regarding ports or ssh , when I run netstat:
>>
>>  TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
>>  TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
>>
>>
>> and when I browse to http://localhost:50070/
>>
>>
>>     Cluster Summary
>>
>> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB /
>> 992.31 MB (0%)
>> *
>> Configured Capacity     :       0 KB
>> DFS Used        :       0 KB
>> Non DFS Used    :       0 KB
>> DFS Remaining   :       0 KB
>> DFS Used%       :       100 %
>> DFS Remaining%  :       0 %
>> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :       0
>> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes>     :       0
>>
>>
>> so I'm a bit still in the dark, I guess.
>>
>> Thanks
>> Brian
>>
>>
>>
>>
>> Aaron Kimball wrote:
>>
>>     
>>> Brian, it looks like you missed a step in the instructions. You'll need to
>>> format the hdfs filesystem instance before starting the NameNode server:
>>>
>>> You need to run:
>>>
>>> $ bin/hadoop namenode -format
>>>
>>> .. then you can do bin/start-dfs.sh
>>> Hope this helps,
>>> - Aaron
>>>
>>>
>>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com> wrote:
>>>
>>>
>>>
>>>       
>>>> Hi,
>>>>
>>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
>>>> hadoop "quickstart" web page.
>>>>
>>>> I know sshd is running and I can "ssh localhost" without a password.
>>>>
>>>> This is from my hadoop-site.xml
>>>>
>>>> <configuration>
>>>> <property>
>>>> <name>hadoop.tmp.dir</name>
>>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>>> </property>
>>>> <property>
>>>> <name>fs.default.name</name>
>>>> <value>hdfs://localhost:9000</value>
>>>> </property>
>>>> <property>
>>>> <name>mapred.job.tracker</name>
>>>> <value>localhost:9001</value>
>>>> </property>
>>>> <property>
>>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>>> <value>-1</value>
>>>> </property>
>>>> <property>
>>>> <name>dfs.replication</name>
>>>> <value>1</value>
>>>> </property>
>>>> <property>
>>>> <name>dfs.permissions</name>
>>>> <value>false</value>
>>>> </property>
>>>> <property>
>>>> <name>webinterface.private.actions</name>
>>>> <value>true</value>
>>>> </property>
>>>> </configuration>
>>>>
>>>> These are errors from my log files:
>>>>
>>>>
>>>> 2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>>> 2010-01-30 00:03:33,121 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>>> localhost/
>>>> 127.0.0.1:9000
>>>> 2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>>> 2010-01-30 00:03:33,181 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>>> Initializing
>>>> NameNodeMeterics using context
>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>> 2010-01-30 00:03:34,603 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>> fsOwner=brian,None,Administrators,Users
>>>> 2010-01-30 00:03:34,603 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>> supergroup=supergroup
>>>> 2010-01-30 00:03:34,603 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>>> isPermissionEnabled=false
>>>> 2010-01-30 00:03:34,653 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>>> Initializing FSNamesystemMetrics using context
>>>> object:org.apache.hadoop.metrics.spi.NullContext
>>>> 2010-01-30 00:03:34,653 INFO
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>>> FSNamesystemStatusMBean
>>>> 2010-01-30 00:03:34,803 INFO
>>>> org.apache.hadoop.hdfs.server.common.Storage:
>>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>>> 2010-01-30 00:03:34,813 ERROR
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>>> initialization failed.
>>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
>>>> state:
>>>> storage directory does not exist or is not accessible.
>>>>  at
>>>>
>>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>>  at
>>>>
>>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>>  at
>>>>
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>>  at
>>>>
>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>>  at
>>>>
>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>>  at
>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>>  at
>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>>  at
>>>>
>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>>  at
>>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
>>>> server
>>>> on 9000
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> =========================================================
>>>>
>>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect
>>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>>> problem cleaning system directory: null
>>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
>>>> connection exception: java.net.ConnectException: Connection refused: no
>>>> further information
>>>>  at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>>  at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>>  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>  at $Proxy4.getProtocolVersion(Unknown Source)
>>>>  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>>  at
>>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>>
>>>>
>>>>
>>>> Thanks
>>>> Brian
>>>>
>>>>
>>>>
>>>>
>>>>         
>>>
>>>       
>>     
>
>

Re: hadoop under cygwin issue

Posted by Alex Kozlov <al...@cloudera.com>.

Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :       0

You datanode is dead.  Look at the logs in the $HADOOP_HOME/logs directory
(or where your logs are) and check the errors.

Alex K

On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf <br...@gmail.com> wrote:

>
> Aaron,
>
> Thanks or your help. I  carefully went through the steps again a couple
> times , and ran
>
> after this
> bin/hadoop namenode -format
>
> (by the way, it asks if I want to reformat, I've tried it both ways)
>
>
> then
>
>
> bin/start-dfs.sh
>
> and
>
> bin/start-all.sh
>
>
> and then
> bin/hadoop fs -put conf input
>
> now the return for this seemed cryptic:
>
>
> put: Target input/conf is a directory
>
> (??)
>
>  and when I tried
>
> bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
>
> It says something about 0 nodes
>
> (from log file)
>
> 2010-02-01 13:26:29,874 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
> ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create
>  src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
>  dst=null    perm=brian:supergroup:rw-r--r--
> 2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 3 on 9000, call
> addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar,
> DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException: File
> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar could
> only be replicated to 0 nodes, instead of 1
> java.io.IOException: File
> /cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar could
> only be replicated to 0 nodes, instead of 1
>   at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
>   at
> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>
>
>
>
> To maybe rule out something regarding ports or ssh , when I run netstat:
>
>  TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
>  TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
>
>
> and when I browse to http://localhost:50070/
>
>
>     Cluster Summary
>
> * * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 MB /
> 992.31 MB (0%)
> *
> Configured Capacity     :       0 KB
> DFS Used        :       0 KB
> Non DFS Used    :       0 KB
> DFS Remaining   :       0 KB
> DFS Used%       :       100 %
> DFS Remaining%  :       0 %
> Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes>     :       0
> Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes>     :       0
>
>
> so I'm a bit still in the dark, I guess.
>
> Thanks
> Brian
>
>
>
>
> Aaron Kimball wrote:
>
>> Brian, it looks like you missed a step in the instructions. You'll need to
>> format the hdfs filesystem instance before starting the NameNode server:
>>
>> You need to run:
>>
>> $ bin/hadoop namenode -format
>>
>> .. then you can do bin/start-dfs.sh
>> Hope this helps,
>> - Aaron
>>
>>
>> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com> wrote:
>>
>>
>>
>>> Hi,
>>>
>>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
>>> hadoop "quickstart" web page.
>>>
>>> I know sshd is running and I can "ssh localhost" without a password.
>>>
>>> This is from my hadoop-site.xml
>>>
>>> <configuration>
>>> <property>
>>> <name>hadoop.tmp.dir</name>
>>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>>> </property>
>>> <property>
>>> <name>fs.default.name</name>
>>> <value>hdfs://localhost:9000</value>
>>> </property>
>>> <property>
>>> <name>mapred.job.tracker</name>
>>> <value>localhost:9001</value>
>>> </property>
>>> <property>
>>> <name>mapred.job.reuse.jvm.num.tasks</name>
>>> <value>-1</value>
>>> </property>
>>> <property>
>>> <name>dfs.replication</name>
>>> <value>1</value>
>>> </property>
>>> <property>
>>> <name>dfs.permissions</name>
>>> <value>false</value>
>>> </property>
>>> <property>
>>> <name>webinterface.private.actions</name>
>>> <value>true</value>
>>> </property>
>>> </configuration>
>>>
>>> These are errors from my log files:
>>>
>>>
>>> 2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>>> Initializing RPC Metrics with hostName=NameNode, port=9000
>>> 2010-01-30 00:03:33,121 INFO
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>> localhost/
>>> 127.0.0.1:9000
>>> 2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>> 2010-01-30 00:03:33,181 INFO
>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>> Initializing
>>> NameNodeMeterics using context
>>> object:org.apache.hadoop.metrics.spi.NullContext
>>> 2010-01-30 00:03:34,603 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> fsOwner=brian,None,Administrators,Users
>>> 2010-01-30 00:03:34,603 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> supergroup=supergroup
>>> 2010-01-30 00:03:34,603 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> isPermissionEnabled=false
>>> 2010-01-30 00:03:34,653 INFO
>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>> Initializing FSNamesystemMetrics using context
>>> object:org.apache.hadoop.metrics.spi.NullContext
>>> 2010-01-30 00:03:34,653 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>> FSNamesystemStatusMBean
>>> 2010-01-30 00:03:34,803 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage:
>>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>>> 2010-01-30 00:03:34,813 ERROR
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>> initialization failed.
>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent
>>> state:
>>> storage directory does not exist or is not accessible.
>>>  at
>>>
>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>>  at
>>>
>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>  at
>>>
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>>  at
>>>
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>>  at
>>>
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>  at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>  at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>  at
>>>
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>  at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping
>>> server
>>> on 9000
>>>
>>>
>>>
>>>
>>>
>>> =========================================================
>>>
>>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect
>>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>>> problem cleaning system directory: null
>>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
>>> connection exception: java.net.ConnectException: Connection refused: no
>>> further information
>>>  at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>>  at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>>  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>  at $Proxy4.getProtocolVersion(Unknown Source)
>>>  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>>  at
>>> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>>
>>>
>>>
>>> Thanks
>>> Brian
>>>
>>>
>>>
>>>
>>
>>
>>
>
>

Re: hadoop under cygwin issue

Posted by Brian Wolf <br...@gmail.com>.

Aaron,

Thanks or your help. I  carefully went through the steps again a couple 
times , and ran

after this 

bin/hadoop namenode -format

(by the way, it asks if I want to reformat, I've tried it both ways)


then


bin/start-dfs.sh

and

bin/start-all.sh


and then 

bin/hadoop fs -put conf input

now the return for this seemed cryptic:

 
put: Target input/conf is a directory

(??)

  and when I tried

bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'

It says something about 0 nodes

(from log file)

2010-02-01 13:26:29,874 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: 
ugi=brian,None,Administrators,Users    ip=/127.0.0.1    cmd=create    
src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar    
dst=null    perm=brian:supergroup:rw-r--r--
2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server 
handler 3 on 9000, call 
addBlock(/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar, 
DFSClient_725490811) from 127.0.0.1:3003: error: java.io.IOException: 
File 
/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar 
could only be replicated to 0 nodes, instead of 1
java.io.IOException: File 
/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar 
could only be replicated to 0 nodes, instead of 1
    at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1287)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)


 

To maybe rule out something regarding ports or ssh , when I run netstat:

  TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
  TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING


and when I browse to http://localhost:50070/


      Cluster Summary

* * * 21 files and directories, 0 blocks = 21 total. Heap Size is 8.01 
MB / 992.31 MB (0%)
*
Configured Capacity 	: 	0 KB
DFS Used 	: 	0 KB
Non DFS Used 	: 	0 KB
DFS Remaining 	: 	0 KB
DFS Used% 	: 	100 %
DFS Remaining% 	: 	0 %
Live Nodes <http://localhost:50070/dfshealth.jsp#LiveNodes> 	: 	0
Dead Nodes <http://localhost:50070/dfshealth.jsp#DeadNodes> 	: 	0


so I'm a bit still in the dark, I guess.

Thanks
Brian
 


Aaron Kimball wrote:
> Brian, it looks like you missed a step in the instructions. You'll need to
> format the hdfs filesystem instance before starting the NameNode server:
>
> You need to run:
>
> $ bin/hadoop namenode -format
>
> .. then you can do bin/start-dfs.sh
> Hope this helps,
> - Aaron
>
>
> On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com> wrote:
>
>   
>> Hi,
>>
>> I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
>> hadoop "quickstart" web page.
>>
>> I know sshd is running and I can "ssh localhost" without a password.
>>
>> This is from my hadoop-site.xml
>>
>> <configuration>
>> <property>
>> <name>hadoop.tmp.dir</name>
>> <value>/cygwin/tmp/hadoop-${user.name}</value>
>> </property>
>> <property>
>> <name>fs.default.name</name>
>> <value>hdfs://localhost:9000</value>
>> </property>
>> <property>
>> <name>mapred.job.tracker</name>
>> <value>localhost:9001</value>
>> </property>
>> <property>
>> <name>mapred.job.reuse.jvm.num.tasks</name>
>> <value>-1</value>
>> </property>
>> <property>
>> <name>dfs.replication</name>
>> <value>1</value>
>> </property>
>> <property>
>> <name>dfs.permissions</name>
>> <value>false</value>
>> </property>
>> <property>
>> <name>webinterface.private.actions</name>
>> <value>true</value>
>> </property>
>> </configuration>
>>
>> These are errors from my log files:
>>
>>
>> 2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>> Initializing RPC Metrics with hostName=NameNode, port=9000
>> 2010-01-30 00:03:33,121 INFO
>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/
>> 127.0.0.1:9000
>> 2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>> 2010-01-30 00:03:33,181 INFO
>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing
>> NameNodeMeterics using context
>> object:org.apache.hadoop.metrics.spi.NullContext
>> 2010-01-30 00:03:34,603 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>> fsOwner=brian,None,Administrators,Users
>> 2010-01-30 00:03:34,603 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
>> 2010-01-30 00:03:34,603 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>> isPermissionEnabled=false
>> 2010-01-30 00:03:34,653 INFO
>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>> Initializing FSNamesystemMetrics using context
>> object:org.apache.hadoop.metrics.spi.NullContext
>> 2010-01-30 00:03:34,653 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>> FSNamesystemStatusMBean
>> 2010-01-30 00:03:34,803 INFO org.apache.hadoop.hdfs.server.common.Storage:
>> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
>> 2010-01-30 00:03:34,813 ERROR
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>> initialization failed.
>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent state:
>> storage directory does not exist or is not accessible.
>>   at
>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>>   at
>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>   at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>>   at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>>   at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>   at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>   at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>   at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>   at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping server
>> on 9000
>>
>>
>>
>>
>>
>> =========================================================
>>
>> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying connect
>> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
>> problem cleaning system directory: null
>> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
>> connection exception: java.net.ConnectException: Connection refused: no
>> further information
>>   at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>>   at org.apache.hadoop.ipc.Client.call(Client.java:700)
>>   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>   at $Proxy4.getProtocolVersion(Unknown Source)
>>   at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>>   at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>>
>>
>>
>> Thanks
>> Brian
>>
>>
>>     
>
>

Re: hadoop under cygwin issue

Posted by Aaron Kimball <aa...@cloudera.com>.

Brian, it looks like you missed a step in the instructions. You'll need to
format the hdfs filesystem instance before starting the NameNode server:

You need to run:

$ bin/hadoop namenode -format

.. then you can do bin/start-dfs.sh
Hope this helps,
- Aaron


On Sat, Jan 30, 2010 at 12:27 AM, Brian Wolf <br...@gmail.com> wrote:

>
> Hi,
>
> I am trying to run Hadoop 0.19.2 under cygwin as per directions on the
> hadoop "quickstart" web page.
>
> I know sshd is running and I can "ssh localhost" without a password.
>
> This is from my hadoop-site.xml
>
> <configuration>
> <property>
> <name>hadoop.tmp.dir</name>
> <value>/cygwin/tmp/hadoop-${user.name}</value>
> </property>
> <property>
> <name>fs.default.name</name>
> <value>hdfs://localhost:9000</value>
> </property>
> <property>
> <name>mapred.job.tracker</name>
> <value>localhost:9001</value>
> </property>
> <property>
> <name>mapred.job.reuse.jvm.num.tasks</name>
> <value>-1</value>
> </property>
> <property>
> <name>dfs.replication</name>
> <value>1</value>
> </property>
> <property>
> <name>dfs.permissions</name>
> <value>false</value>
> </property>
> <property>
> <name>webinterface.private.actions</name>
> <value>true</value>
> </property>
> </configuration>
>
> These are errors from my log files:
>
>
> 2010-01-30 00:03:33,091 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
> Initializing RPC Metrics with hostName=NameNode, port=9000
> 2010-01-30 00:03:33,121 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/
> 127.0.0.1:9000
> 2010-01-30 00:03:33,161 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=NameNode, sessionId=null
> 2010-01-30 00:03:33,181 INFO
> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing
> NameNodeMeterics using context
> object:org.apache.hadoop.metrics.spi.NullContext
> 2010-01-30 00:03:34,603 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> fsOwner=brian,None,Administrators,Users
> 2010-01-30 00:03:34,603 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
> 2010-01-30 00:03:34,603 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> isPermissionEnabled=false
> 2010-01-30 00:03:34,653 INFO
> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
> Initializing FSNamesystemMetrics using context
> object:org.apache.hadoop.metrics.spi.NullContext
> 2010-01-30 00:03:34,653 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
> FSNamesystemStatusMBean
> 2010-01-30 00:03:34,803 INFO org.apache.hadoop.hdfs.server.common.Storage:
> Storage directory C:\cygwin\tmp\hadoop-brian\dfs\name does not exist.
> 2010-01-30 00:03:34,813 ERROR
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
> initialization failed.
> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
> Directory C:\cygwin\tmp\hadoop-brian\dfs\name is in an inconsistent state:
> storage directory does not exist or is not accessible.
>   at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>   at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>   at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
>   at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:288)
>   at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>   at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>   at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>   at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>   at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
> 2010-01-30 00:03:34,823 INFO org.apache.hadoop.ipc.Server: Stopping server
> on 9000
>
>
>
>
>
> =========================================================
>
> 2010-01-29 15:13:30,270 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
> problem cleaning system directory: null
> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
> connection exception: java.net.ConnectException: Connection refused: no
> further information
>   at org.apache.hadoop.ipc.Client.wrapException(Client.java:724)
>   at org.apache.hadoop.ipc.Client.call(Client.java:700)
>   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>   at $Proxy4.getProtocolVersion(Unknown Source)
>   at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
>   at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>
>
>
> Thanks
> Brian
>
>