Posted to common-user@hadoop.apache.org by Abhay Ratnaparkhi <ab...@gmail.com> on 2012/09/03 16:19:09 UTC

knowing the nodes on which reduce tasks will run

Hello,

How can one find out which nodes the reduce tasks will run on?

One of my jobs is running and has completed all of its map tasks.
The map tasks write a lot of intermediate data, and the intermediate
directory is filling up on all the nodes. If a reduce task is scheduled on
any of those nodes, it will try to copy the map output onto the same disk
and will eventually fail with disk-space exceptions.

I have added a few more tasktracker nodes to the cluster and now want to
run the reducers on the new nodes only. Is it possible to choose the node
on which a reducer will run? What algorithm does Hadoop use to pick a node
to run a reducer?
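
For reference, in Hadoop 1.x the location of the intermediate (map output)
data is controlled by the mapred.local.dir property, which takes a
comma-separated list of directories, so the shuffle data can at least be
spread over several disks. A minimal sketch for the <configuration>
section of conf/mapred-site.xml; the mount points /data1 and /data2 are
hypothetical:

  <property>
    <name>mapred.local.dir</name>
    <value>/data1/mapred/local,/data2/mapred/local</value>
  </property>

Note that this only spreads the load across disks; it does not pin
reducers to particular nodes.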

Thanks in advance.

Bye
Abhay

Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Udayini Pendyala <ud...@yahoo.com>.
Hi Hemanth,

That is a copy-paste issue, and I also have Cygwin installed. I will revisit my install on Ubuntu and ask for any help I may need there.

Thanks for taking the time to respond.
Udayini

--- On Tue, 9/4/12, Hemanth Yamijala <yh...@thoughtworks.com> wrote:

From: Hemanth Yamijala <yh...@thoughtworks.com>
Subject: Re: Exception while running a Hadoop example on a standalone install on Windows 7
To: user@hadoop.apache.org
Cc: "Udayini Pendyala" <ud...@yahoo.com>
Date: Tuesday, September 4, 2012, 9:29 PM

Though I agree with others that it would probably be easier to get Hadoop
up and running on Unix-based systems, I couldn't help noticing that this
path:

 \tmp \hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging

seems to have a space in the first component, i.e. '\tmp ' rather than
'\tmp'. Is that a copy-paste issue, or is the space really there? I'm not
sure it could cause the specific error you're seeing, but you could try
removing the space if it does exist. I'm also assuming that you have set
up Cygwin etc. if you still want to try this on Windows.
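
The staging path here comes from hadoop.tmp.dir, which defaults to
/tmp/hadoop-${user.name} in Hadoop 1.x. As a minimal sketch, it can be
pointed at a known-good, space-free location via conf/core-site.xml; the
directory below is hypothetical:

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/cygdrive/c/hadoop-tmp</value>
  </property>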

Thanks,
Hemanth
On Wed, Sep 5, 2012 at 12:12 AM, Marcos Ortiz <ml...@uci.cu> wrote:

On 09/04/2012 02:35 PM, Udayini Pendyala wrote:

    Hi Bejoy,

    Thanks for your response. I first started to install on Ubuntu Linux
    and ran into a bunch of problems. So, I wanted to back off a bit and
    try something simple first. Hence, my attempt to install on my
    Windows 7 laptop.

Well, if you tell us the problems that you ran into on Ubuntu, we can
give you a hand.

Michael Noll has great tutorials for this:

Running Hadoop on Ubuntu Linux (Single node cluster)
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

Running Hadoop on Ubuntu Linux (Multi node cluster)
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

    I am doing the "standalone" mode - as per the documentation (link in
    my original email), I don't need ssh unless I am doing the
    distributed mode. Is that not correct?

Yes, but I give you the same recommendation that Bejoy gave you: use a
Unix-based platform for Hadoop; it's better tested and performs better
than on Windows.

Best wishes

    Thanks again for responding
    Udayini

    --- On Tue, 9/4/12, Bejoy Ks <be...@gmail.com> wrote:

    From: Bejoy Ks <be...@gmail.com>
    Subject: Re: Exception while running a Hadoop example on a standalone install on Windows 7
    To: user@hadoop.apache.org
    Date: Tuesday, September 4, 2012, 11:11 AM

                

    Hi Udayani

    By default hadoop works well on Linux and Linux-based OSes. Since you
    are on Windows, you need to install and configure ssh using Cygwin
    before you start the hadoop daemons.
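
    A minimal sketch of the usual Cygwin ssh setup, assuming a Cygwin
    shell started with Administrator rights (exact prompts and service
    names can vary with the Cygwin version):

        $ ssh-host-config -y        # install and configure the sshd service
        $ net start sshd            # start the service
        $ ssh-keygen -t rsa -P ''   # passphrase-less key for the daemons
        $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
        $ ssh localhost             # should now log in without a password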

                    

                    
    On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala <ud...@yahoo.com> wrote:

        Hi,

        Following is a description of what I am trying to do and the
        steps I followed.

        GOAL:
        a). Install Hadoop 1.0.3
        b). Hadoop in a standalone (or local) mode
        c). OS: Windows 7

        STEPS FOLLOWED:
        1. I followed instructions from:
           http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html
           Listing the steps I did -
           a. I went to: http://hadoop.apache.org/core/releases.html.
           b. I installed hadoop-1.0.3 by downloading "hadoop-1.0.3.tar.gz"
              and unzipping/untarring the file.
           c. I installed JDK 1.6 and set up JAVA_HOME to point to it.
           d. I set up HADOOP_INSTALL to point to my Hadoop install
              location and updated my PATH variable to include
              $HADOOP_INSTALL/bin (a sketch of this setup appears after
              this list).
           e. After the above steps, I ran the command "hadoop version"
              and got the following information:

              $ hadoop version
              Hadoop 1.0.3
              Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192
              Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
              From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be

        2. The standalone mode was very easy to install as described
           above. Then I tried to run a sample command as given in:
           http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local
           Specifically, the steps followed were:
           a. cd $HADOOP_INSTALL
           b. mkdir input
           c. cp conf/*.xml input
           d. bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
           and got the following error:

              $ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
              12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
              12/09/03 15:41:57 ERROR security.UserGroupInformation: PriviledgedActionException as:upendyal cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
              java.io.IOException: Failed to set permissions of path: \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
                  at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
                  at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
                  at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
                  at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
                  at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
                  at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
                  at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
                  at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
                  at java.security.AccessController.doPrivileged(Native Method)
                  at javax.security.auth.Subject.doAs(Unknown Source)
                  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
                  at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
                  at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
                  at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
                  at org.apache.hadoop.examples.Grep.run(Grep.java:69)
                  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
                  at org.apache.hadoop.examples.Grep.main(Grep.java:93)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
                  at java.lang.reflect.Method.invoke(Unknown Source)
                  at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
                  at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
                  at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
                  at java.lang.reflect.Method.invoke(Unknown Source)
                  at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
        3. I googled the problem and found the following links, but none
           of the suggestions helped. Most people seem to get a resolution
           when they change the version of Hadoop.

           a. http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E
           b. http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837
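
        A minimal sketch of the environment setup from steps 1(c)-(d)
        above, e.g. in ~/.bashrc; the install paths are hypothetical:

            export JAVA_HOME=/cygdrive/c/java/jdk1.6.0
            export HADOOP_INSTALL=/cygdrive/c/hadoop-1.0.3
            export PATH=$PATH:$HADOOP_INSTALL/bin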

                                

                                  
        Is this a problem with the version of Hadoop I selected, or am I
        doing something wrong? I would appreciate any help with this.

        Thanks
        Udayini

-- 
Marcos Luis Ortíz Valmaseda
Data Engineer && Sr. System Administrator at UCI
about.me/marcosortiz
My Blog
@marcosluis2186

                                  
                                a.       http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E

                                b.      http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837

                                

                                  
                                Is this a
                                    problem in the version of Hadoop I
                                    selected OR am I doing something
                                    wrong? I would appreciate any help
                                    with
                                    this.
                                Thanks
                                
                                    Udayini
                                  
                            
                          
                        
                      
                    
                    

                  
                
              
            
          
        
      
    
    

    -- 

      
      
      
        
          Marcos Luis Ortíz Valmaseda

          Data Engineer && Sr. System Administrator at
            UCI

          about.me/marcosortiz

          My Blog

          @marcosluis2186
          

        
      
    
    

    

  















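The failing frame in the trace above is FileUtil.checkReturnValue, which on
Windows rejects the chmod that JobClient performs on its local staging
directory; this matches the Windows permission problem discussed upstream
in HADOOP-7682. The workaround that circulated for Hadoop 1.0.x was to
relax that check and rebuild hadoop-core. A minimal sketch of the patched
method, replacing the original inside org.apache.hadoop.fs.FileUtil in the
1.0.x sources; this is a community workaround, not an official fix, and it
weakens local permission enforcement:

  // In org.apache.hadoop.fs.FileUtil: log the failed chmod instead of
  // throwing, so job submission on Windows can proceed.
  private static void checkReturnValue(boolean rv, File p,
                                       FsPermission permission)
                                       throws IOException {
    if (!rv) {
      LOG.warn("Failed to set permissions of path: " + p + " to " +
               String.format("%04o", permission.toShort()));
    }
  }

After rebuilding, the patched hadoop-core jar replaces the stock one in the
installation directory. Moving to a release where this is fixed, as several
of the linked threads suggest, avoids carrying a local patch.
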
Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Hemanth Yamijala <yh...@thoughtworks.com>.
Though I agree with others that it would probably be easier to get Hadoop
up and running on Unix based systems, couldn't help notice that this path:

 \tmp \hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging

seems to have a space in the first component i.e '\tmp ' and not '\tmp'. Is
that a copy paste issue, or is it really the case. Again, not sure if it
could cause the specific error you're seeing, but could try removing the
space if it does exist. Also assuming that you've set up Cygwin etc. if you
still want to try out on Windows.

Thanks
hemanth
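
A quick way to test the stray-space theory is to print the value that the
staging path is derived from: in local mode it comes from hadoop.tmp.dir,
whose default is /tmp/hadoop-${user.name}, matching the path in the error.
A minimal sketch, printing the value between brackets so any embedded
whitespace is visible; the override value is only a placeholder for any
space-free local path:

  import org.apache.hadoop.conf.Configuration;

  public class TmpDirCheck {
    public static void main(String[] args) {
      Configuration conf = new Configuration();
      // Brackets make a leading or trailing space easy to spot.
      System.out.println("[" + conf.get("hadoop.tmp.dir") + "]");
      // Hypothetical space-free override; in practice this belongs in
      // conf/core-site.xml rather than in code.
      conf.set("hadoop.tmp.dir", "/cygdrive/c/hadoop-tmp");
      System.out.println("[" + conf.get("hadoop.tmp.dir") + "]");
    }
  }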

On Wed, Sep 5, 2012 at 12:12 AM, Marcos Ortiz <ml...@uci.cu> wrote:

>
> On 09/04/2012 02:35 PM, Udayini Pendyala wrote:
>
>   Hi Bejoy,
>
> Thanks for your response. I first started to install on Ubuntu Linux and
> ran into a bunch of problems. So, I wanted to back off a bit and try
> something simple first. Hence, my attempt to install on my Windows 7 Laptop.
>
> Well, if you tell to us the problems that you have in Ubuntu, we can give
> you a hand.
> Michael Noll have great tutorials for this:
>
> Running Hadoop on Ubuntu Linux (Single node cluster)
>
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
>
> Running Hadoop on Ubuntu Linux (Multi node cluster)
>
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
>
>
> I am doing the "standalone" mode - as per the documentation (link in my
> original email), I don't need ssh unless I am doing the distributed mode.
> Is that not correct?
>
> Yes, but I give you the same recommendation that Bejoy said to you: Use a
> Unix-based platform for Hadoop, it's more tested and have better
> performance than Windows.
>
> Best wishes
>
>
> Thanks again for responding
> Udayini
>
>
> --- On Tue, 9/4/12, Bejoy Ks <be...@gmail.com> wrote:
>
>
> From: Bejoy Ks <be...@gmail.com>
> Subject: Re: Exception while running a Hadoop example on a standalone
> install on Windows 7
> To: user@hadoop.apache.org
> Date: Tuesday, September 4, 2012, 11:11 AM
>
> Hi Udayani
>
>  By default hadoop works well for linux and linux based OS. Since you are
> on Windows you need to install and configure ssh using cygwin before you
> start hadoop daemons.
>
> On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala
> <udayini_pendyala@yahoo.com> wrote:
>
>   Hi,
>
>
>  Following is a description of what I am trying to do and the steps I
> followed.
>
>
>  GOAL:
>
> a). Install Hadoop 1.0.3
>
> b). Hadoop in a standalone (or local) mode
>
> c). OS: Windows 7
>
>
>  STEPS FOLLOWED:
>
> 1. I followed instructions from:
> http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html.
> Listing the steps I did -
>
> a. I went to: http://hadoop.apache.org/core/releases.html.
> b. I installed hadoop-1.0.3 by downloading “hadoop-1.0.3.tar.gz” and
> unzipping/untarring the file.
> c. I installed JDK 1.6 and set up JAVA_HOME to point to it.
> d. I set up HADOOP_INSTALL to point to my Hadoop install location. I
> updated my PATH variable to have $HADOOP_INSTALL/bin
> e. After the above steps, I ran the command: “hadoop version” and
> got the following information:
>
> $ hadoop version
>
> Hadoop 1.0.3
>
> Subversion
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r
> 1335192
>
> Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
>
> From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be
>
>
>
> 2. The standalone was very easy to install as described above.
> Then, I tried to run a sample command as given in:
>
> http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local
>
> Specifically, the steps followed were:
>
> a. cd $HADOOP_INSTALL
> b. mkdir input
> c. cp conf/*.xml input
> d. bin/hadoop jar hadoop-examples-1.0.3.jar grep input output
> ‘dfs[a-z.]+’
>
> and got the following error:
>
>
>
> $ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
>
> 12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 12/09/03 15:41:57 ERROR security.UserGroupInformation:
> PriviledgedActionException as:upendyal cause:java.io.IOException: Failed
> to set permissions of path: \tmp
> \hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
> java.io.IOException: Failed to set permissions of path:
> \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
> at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
> at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
> at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
> at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
> at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
> at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Unknown Source)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
> at org.apache.hadoop.examples.Grep.run(Grep.java:69)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.examples.Grep.main(Grep.java:93)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>
>
> 3. I googled the problem and found the following links but none
> of these suggestions helped. Most people seem to be getting a resolution
> when they change the version of Hadoop.
>
> a.
> http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E
>
> b.
> http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837
>
>
>  Is this a problem in the version of Hadoop I selected OR am I doing
> something wrong? I would appreciate any help with this.
>
> Thanks
>
> Udayini
>
>
>
> --
>
> Marcos Luis Ortíz Valmaseda
> Data Engineer && Sr. System Administrator at UCI
> about.me/marcosortiz
> My Blog <http://marcosluis2186.posterous.com>
> @marcosluis2186 <http://twitter.com/marcosluis2186>
>
>

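Independent of job submission, the same code path can be driven directly to
check whether a given install and filesystem reproduce the failure. A
minimal probe under those assumptions, with the path below just a
placeholder; on an unpatched Hadoop 1.0.x on Windows the mkdirs or
setPermission call is expected to throw the same "Failed to set
permissions" IOException:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.permission.FsPermission;

  public class PermissionProbe {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      // getLocal returns the local filesystem backed by
      // RawLocalFileSystem, the same classes in the stack trace above.
      FileSystem fs = FileSystem.getLocal(conf);
      Path p = new Path("/tmp/perm-probe");  // placeholder path
      fs.mkdirs(p);
      // The 0700 chmod that fails during job submission on Windows.
      fs.setPermission(p, new FsPermission((short) 0700));
      System.out.println("setPermission succeeded for " + p);
    }
  }
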
Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Udayini Pendyala <ud...@yahoo.com>.
Thanks. 

I will refer to my notes on the problems I was having and get back to the list. Thanks for the links below.

Regards
Udayini

--- On Tue, 9/4/12, Marcos Ortiz <ml...@uci.cu> wrote:

From: Marcos Ortiz <ml...@uci.cu>
Subject: Re: Exception while running a Hadoop example on a standalone install on Windows 7
To: user@hadoop.apache.org
Cc: "Udayini Pendyala" <ud...@yahoo.com>
Date: Tuesday, September 4, 2012, 11:42 AM

On 09/04/2012 02:35 PM, Udayini Pendyala wrote:

  Hi Bejoy,

  Thanks for your response. I first started to install on Ubuntu Linux and
  ran into a bunch of problems. So, I wanted to back off a bit and try
  something simple first. Hence, my attempt to install on my Windows 7
  Laptop.

Well, if you tell to us the problems that you have in Ubuntu, we can give
you a hand. Michael Noll have great tutorials for this:

Running Hadoop on Ubuntu Linux (Single node cluster)
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

Running Hadoop on Ubuntu Linux (Multi node cluster)
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

  I am doing the "standalone" mode - as per the documentation (link in my
  original email), I don't need ssh unless I am doing the distributed
  mode. Is that not correct?

Yes, but I give you the same recommendation that Bejoy said to you: Use a
Unix-based platform for Hadoop, it's more tested and have better
performance than Windows.

Best wishes

  Thanks again for responding
  Udayini

  --- On Tue, 9/4/12, Bejoy Ks <be...@gmail.com> wrote:

  From: Bejoy Ks <be...@gmail.com>
  Subject: Re: Exception while running a Hadoop example on a standalone
  install on Windows 7
  To: user@hadoop.apache.org
  Date: Tuesday, September 4, 2012, 11:11 AM

  Hi Udayani

  By default hadoop works well for linux and linux based OS. Since you are
  on Windows you need to install and configure ssh using cygwin before you
  start hadoop daemons.

  On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala
  <udayini_pendyala@yahoo.com> wrote:

  Hi,

  Following is a description of what I am trying to do and the steps I
  followed.

  GOAL:
  a). Install Hadoop 1.0.3
  b). Hadoop in a standalone (or local) mode
  c). OS: Windows 7

  STEPS FOLLOWED:

  1. I followed instructions from:
  http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html.
  Listing the steps I did -

  a. I went to: http://hadoop.apache.org/core/releases.html.
  b. I installed hadoop-1.0.3 by downloading “hadoop-1.0.3.tar.gz” and
  unzipping/untarring the file.
  c. I installed JDK 1.6 and set up JAVA_HOME to point to it.
  d. I set up HADOOP_INSTALL to point to my Hadoop install location. I
  updated my PATH variable to have $HADOOP_INSTALL/bin
  e. After the above steps, I ran the command: “hadoop version” and got
  the following information:

  $ hadoop version
  Hadoop 1.0.3
  Subversion
  https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192
  Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
  From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be

  2. The standalone was very easy to install as described above. Then, I
  tried to run a sample command as given in:
  http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local

  Specifically, the steps followed were:

  a. cd $HADOOP_INSTALL
  b. mkdir input
  c. cp conf/*.xml input
  d. bin/hadoop jar hadoop-examples-1.0.3.jar grep input output ‘dfs[a-z.]+’

  and got the following error:

  $ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
  12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load
  native-hadoop library for your platform... using builtin-java classes
  where applicable
  12/09/03 15:41:57 ERROR security.UserGroupInformation:
  PriviledgedActionException as:upendyal cause:java.io.IOException: Failed
  to set permissions of path: \tmp
  \hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
  java.io.IOException: Failed to set permissions of path:
  \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
  at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
  at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
  at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
  at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
  at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
  at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
  at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
  at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Unknown Source)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
  at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
  at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
  at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
  at org.apache.hadoop.examples.Grep.run(Grep.java:69)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
  at org.apache.hadoop.examples.Grep.main(Grep.java:93)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
  at java.lang.reflect.Method.invoke(Unknown Source)
  at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
  at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
  at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
  at java.lang.reflect.Method.invoke(Unknown Source)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

  3. I googled the problem and found the following links but none of these
  suggestions helped. Most people seem to be getting a resolution when
  they change the version of Hadoop.

  a. http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E
  b. http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837

  Is this a problem in the version of Hadoop I selected OR am I doing
  something wrong? I would appreciate any help with this.

  Thanks
  Udayini

-- 
Marcos Luis Ortíz Valmaseda
Data Engineer && Sr. System Administrator at UCI
about.me/marcosortiz
My Blog
@marcosluis2186
Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Udayini Pendyala <ud...@yahoo.com>.
Thanks. 

I will refer to my notes on the problems I was having and get back to the list. Thanks for the links below.

Regards
Udayini

--- On Tue, 9/4/12, Marcos Ortiz <ml...@uci.cu> wrote:

From: Marcos Ortiz <ml...@uci.cu>
Subject: Re: Exception while running a Hadoop example on a standalone install on Windows 7
To: user@hadoop.apache.org
Cc: "Udayini Pendyala" <ud...@yahoo.com>
Date: Tuesday, September 4, 2012, 11:42 AM


  
    
  
  
    

    On 09/04/2012 02:35 PM, Udayini
      Pendyala wrote:

    
    
      
        
          
            Hi Bejoy,

              

              Thanks for your response. I first started to install on
              Ubuntu Linux and ran into a bunch of problems. So, I
              wanted to back off a bit and try something simple first.
              Hence, my attempt to install on my Windows 7 Laptop.

            
          
        
      
    
    Well, if you tell to us the problems that you have in Ubuntu, we can
    give you a hand.

    Michael Noll have great tutorials for this:

    

    Running Hadoop on Ubuntu Linux (Single node cluster)

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

    

    Running Hadoop on Ubuntu Linux (Multi node cluster)

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

    
      
        
          
            

              I am doing the "standalone" mode - as per the
              documentation (link in my original email), I don't need
              ssh unless I am doing the distributed mode. Is that not
              correct?

            
          
        
      
    
    Yes, but I give you the same recommendation that Bejoy said to you:
    Use a Unix-based platform for Hadoop, it's more tested and have
    better performance than Windows.

    

    Best wishes

    
      
        
          
            

              Thanks again for responding

              Udayini

              

              

              --- On Tue, 9/4/12, Bejoy Ks <be...@gmail.com>
              wrote:

              

                From: Bejoy Ks <be...@gmail.com>

                Subject: Re: Exception while running a Hadoop example on
                a standalone install on Windows 7

                To: user@hadoop.apache.org

                Date: Tuesday, September 4, 2012, 11:11 AM

                

                Hi Udayani
                  

                  
                  By default hadoop works well for linux and linux
                    based OS. Since you are on Windows you need to
                    install and configure ssh using cygwin before you
                    start hadoop daemons.

                    

                    
                      On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala <ud...@yahoo.com>
                      wrote:

                      
                        
                          
                            
                              
Hi,

Following is a description of what I am trying to do and the steps I followed.

GOAL:
a). Install Hadoop 1.0.3
b). Hadoop in a standalone (or local) mode
c). OS: Windows 7

STEPS FOLLOWED:
1. I followed instructions from: http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html. Listing the steps I did -
a. I went to: http://hadoop.apache.org/core/releases.html.
b. I installed hadoop-1.0.3 by downloading “hadoop-1.0.3.tar.gz” and unzipping/untarring the file.
c. I installed JDK 1.6 and set up JAVA_HOME to point to it.
d. I set up HADOOP_INSTALL to point to my Hadoop install location. I updated my PATH variable to have $HADOOP_INSTALL/bin (see the sketch after this message).
e. After the above steps, I ran the command: “hadoop version” and got the following information:

$ hadoop version
Hadoop 1.0.3
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192
Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be

2. The standalone was very easy to install as described above. Then, I tried to run a sample command as given in:
http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local
Specifically, the steps followed were:
a. cd $HADOOP_INSTALL
b. mkdir input
c. cp conf/*.xml input
d. bin/hadoop jar hadoop-examples-1.0.3.jar grep input output ‘dfs[a-z.]+’
and got the following error:

$ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/09/03 15:41:57 ERROR security.UserGroupInformation: PriviledgedActionException as:upendyal cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
java.io.IOException: Failed to set permissions of path: \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
at org.apache.hadoop.examples.Grep.run(Grep.java:69)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.Grep.main(Grep.java:93)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

3. I googled the problem and found the following links but none of these suggestions helped. Most people seem to be getting a resolution when they change the version of Hadoop.
a. http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E
b. http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837
                                

                                  
Is this a problem in the version of Hadoop I selected OR am I doing something wrong? I would appreciate any help with this.
Thanks
Udayini
                                  
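
A minimal sketch of the environment setup described in steps c and d of the message above, assuming a Cygwin shell on Windows; the install paths are placeholders, not the poster's actual locations:

# Hypothetical install locations -- adjust to wherever the JDK and the
# unpacked hadoop-1.0.3 tarball actually live.
export JAVA_HOME=/cygdrive/c/java/jdk1.6.0
export HADOOP_INSTALL=/cygdrive/c/hadoop-1.0.3
export PATH=$PATH:$HADOOP_INSTALL/bin

# Quick checks that both variables point somewhere sensible.
"$JAVA_HOME"/bin/java -version
hadoop version    # should print "Hadoop 1.0.3" as in step e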
                            
                          
                        
                      
                    
                    

                  
                
              
            
          
        
      
    
    

-- 

Marcos Luis Ortíz Valmaseda
Data Engineer && Sr. System Administrator at UCI
about.me/marcosortiz
My Blog
@marcosluis2186
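
Since the trace above fails inside RawLocalFileSystem.mkdirs / FileUtil.setPermission on the staging directory, one way to narrow the problem down is to try the same two operations by hand from a Cygwin shell. This is only a diagnostic sketch; the path is copied from the error message:

# Create the staging directory and apply the 0700 mode the job client wants.
mkdir -p /tmp/hadoop-upendyal/mapred/staging
chmod 700 /tmp/hadoop-upendyal/mapred/staging
ls -ld /tmp/hadoop-upendyal/mapred/staging    # expect drwx------

If chmod cannot produce drwx------ on this filesystem, Hadoop 1.0.3's permission check will keep failing in the same way.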













Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Udayini Pendyala <ud...@yahoo.com>.
Thanks. 

I will refer to my notes on the problems I was having and get back to the list. Thanks for the links below.

Regards
Udayini

--- On Tue, 9/4/12, Marcos Ortiz <ml...@uci.cu> wrote:

From: Marcos Ortiz <ml...@uci.cu>
Subject: Re: Exception while running a Hadoop example on a standalone install on Windows 7
To: user@hadoop.apache.org
Cc: "Udayini Pendyala" <ud...@yahoo.com>
Date: Tuesday, September 4, 2012, 11:42 AM


  
    
  
  
    

On 09/04/2012 02:35 PM, Udayini Pendyala wrote:

Hi Bejoy,

Thanks for your response. I first started to install on Ubuntu Linux and ran into a bunch of problems. So, I wanted to back off a bit and try something simple first. Hence, my attempt to install on my Windows 7 Laptop.

Well, if you tell us the problems you ran into on Ubuntu, we can give you a hand.
Michael Noll has great tutorials for this:

Running Hadoop on Ubuntu Linux (Single node cluster)
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

Running Hadoop on Ubuntu Linux (Multi node cluster)
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

I am doing the "standalone" mode - as per the documentation (link in my original email), I don't need ssh unless I am doing the distributed mode. Is that not correct?

Yes, but I give you the same recommendation that Bejoy gave you: use a Unix-based platform for Hadoop; it's better tested and performs better than Windows.

Best wishes

Thanks again for responding
Udayini

--- On Tue, 9/4/12, Bejoy Ks <be...@gmail.com> wrote:

From: Bejoy Ks <be...@gmail.com>
Subject: Re: Exception while running a Hadoop example on a standalone install on Windows 7
To: user@hadoop.apache.org
Date: Tuesday, September 4, 2012, 11:11 AM

Hi Udayini

By default hadoop works well on Linux and Linux-based OSes. Since you are on Windows, you need to install and configure ssh using Cygwin before you start the hadoop daemons.

On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala <ud...@yahoo.com> wrote:

Hi,

Following is a description of what I am trying to do and the steps I followed.

GOAL:
a). Install Hadoop 1.0.3
b). Hadoop in a standalone (or local) mode
c). OS: Windows 7

STEPS FOLLOWED:
1. I followed instructions from: http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html. Listing the steps I did -
a. I went to: http://hadoop.apache.org/core/releases.html.
b. I installed hadoop-1.0.3 by downloading “hadoop-1.0.3.tar.gz” and unzipping/untarring the file.
c. I installed JDK 1.6 and set up JAVA_HOME to point to it.
d. I set up HADOOP_INSTALL to point to my Hadoop install location. I updated my PATH variable to have $HADOOP_INSTALL/bin
e. After the above steps, I ran the command: “hadoop version” and got the following information:

$ hadoop version
Hadoop 1.0.3
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192
Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be

2. The standalone was very easy to install as described above. Then, I tried to run a sample command as given in:
http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local
Specifically, the steps followed were:
a. cd $HADOOP_INSTALL
b. mkdir input
c. cp conf/*.xml input
d. bin/hadoop jar hadoop-examples-1.0.3.jar grep input output ‘dfs[a-z.]+’
and got the following error:

$ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/09/03 15:41:57 ERROR security.UserGroupInformation: PriviledgedActionException as:upendyal cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
java.io.IOException: Failed to set permissions of path: \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
at org.apache.hadoop.examples.Grep.run(Grep.java:69)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.Grep.main(Grep.java:93)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

3. I googled the problem and found the following links but none of these suggestions helped. Most people seem to be getting a resolution when they change the version of Hadoop.
a. http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E
b. http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837

Is this a problem in the version of Hadoop I selected OR am I doing something wrong? I would appreciate any help with this.
Thanks
Udayini
                                  
                            
                          
                        
                      
                    
                    

                  
                
              
            
          
        
      
    
    

-- 

Marcos Luis Ortíz Valmaseda
Data Engineer && Sr. System Administrator at UCI
about.me/marcosortiz
My Blog
@marcosluis2186
          

        
      
    
    

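
Bejoy's suggestion quoted above (install and configure ssh via Cygwin before starting the hadoop daemons) would look roughly like this. A sketch assuming Cygwin's openssh package is installed and the shell has administrator rights; the sshd service name can differ between Cygwin releases:

# Set up and start the Cygwin sshd service.
ssh-host-config -y
net start sshd

# Passwordless login to localhost, which the start-up scripts rely on.
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh localhost    # should log in without prompting for a password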
    

  













Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Marcos Ortiz <ml...@uci.cu>.
On 09/04/2012 02:35 PM, Udayini Pendyala wrote:
> Hi Bejoy,
>
> Thanks for your response. I first started to install on Ubuntu Linux 
> and ran into a bunch of problems. So, I wanted to back off a bit and 
> try something simple first. Hence, my attempt to install on my Windows 
> 7 Laptop.
>
Well, if you tell us the problems you ran into on Ubuntu, we can 
give you a hand.
Michael Noll has great tutorials for this:

Running Hadoop on Ubuntu Linux (Single node cluster)
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

Running Hadoop on Ubuntu Linux (Multi node cluster)
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
>
> I am doing the "standalone" mode - as per the documentation (link in 
> my original email), I don't need ssh unless I am doing the distributed 
> mode. Is that not correct?
>
Yes, but I give you the same recommendation that Bejoy gave you: use 
a Unix-based platform for Hadoop; it's better tested and performs better 
than Windows.

Best wishes
>
> Thanks again for responding
> Udayini
>
>
> --- On Tue, 9/4/12, Bejoy Ks <be...@gmail.com> wrote:
>
>
>     From: Bejoy Ks <be...@gmail.com>
>     Subject: Re: Exception while running a Hadoop example on a
>     standalone install on Windows 7
>     To: user@hadoop.apache.org
>     Date: Tuesday, September 4, 2012, 11:11 AM
>
>     Hi Udayini
>
>     By default hadoop works well on Linux and Linux-based OSes. Since
>     you are on Windows, you need to install and configure ssh using
>     Cygwin before you start the hadoop daemons.
>
>     On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala
>     <udayini_pendyala@yahoo.com> wrote:
>
>         Hi,
>
>
>         Following is a description of what I am trying to do and the
>         steps I followed.
>
>
>         GOAL:
>
>         a). Install Hadoop 1.0.3
>
>         b). Hadoop in a standalone (or local) mode
>
>         c). OS: Windows 7
>
>
>         STEPS FOLLOWED:
>
>         1. I followed instructions from:
>         http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html.
>         Listing the steps I did -
>
>         a. I went to: http://hadoop.apache.org/core/releases.html.
>
>         b. I installed hadoop-1.0.3 by downloading
>         “hadoop-1.0.3.tar.gz” and unzipping/untarring the file.
>
>         c. I installed JDK 1.6 and set up JAVA_HOME to point to it.
>
>         d. I set up HADOOP_INSTALL to point to my Hadoop install
>         location. I updated my PATH variable to have $HADOOP_INSTALL/bin
>
>         e. After the above steps, I ran the command: “hadoop version”
>         and got the following information:
>
>         $ hadoop version
>
>         Hadoop 1.0.3
>
>         Subversion
>         https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0
>         -r 1335192
>
>         Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
>
>         From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be
>
>         2. The standalone was very easy to install as
>         described above. Then, I tried to run a sample command as
>         given in:
>
>         http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local
>
>         Specifically, the steps followed were:
>
>         a. cd $HADOOP_INSTALL
>
>         b. mkdir input
>
>         c. cp conf/*.xml input
>
>         d. bin/hadoop jar hadoop-examples-1.0.3.jar grep input output
>         ‘dfs[a-z.]+’
>
>         and got the following error:
>
>         $ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output
>         'dfs[a-z.]+'
>
>         12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load
>         native-hadoop library for your platform... using builtin-java
>         classes where applicable
>
>         12/09/03 15:41:57 ERROR security.UserGroupInformation:
>         PriviledgedActionException as:upendyal
>         cause:java.io.IOException: Failed to set permissions of path:
>         \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging
>         to 0700
>
>         java.io.IOException: Failed to set permissions of path:
>         \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging
>         to 0700
>
>         at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
>         at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
>         at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
>         at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
>         at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
>         at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Unknown Source)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
>         at org.apache.hadoop.examples.Grep.run(Grep.java:69)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.hadoop.examples.Grep.main(Grep.java:93)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>         at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>         3. I googled the problem and found the following links
>         but none of these suggestions helped. Most people seem to be
>         getting a resolution when they change the version of Hadoop.
>
>         a. http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E
>
>         b. http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837
>
>
>         Is this a problem in the version of Hadoop I selected OR am I
>         doing something wrong? I would appreciate any help with this.
>
>         Thanks
>
>         Udayini
>
>

-- 

Marcos Luis Ortíz Valmaseda
*Data Engineer && Sr. System Administrator at UCI*
about.me/marcosortiz <http://about.me/marcosortiz>
My Blog <http://marcosluis2186.posterous.com>
@marcosluis2186 <http://twitter.com/marcosluis2186>





10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS...
CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION

http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci
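
For readers hitting the same trace: the ProgramDriver and ExampleDriver frames at the bottom belong to the examples launcher, and running the jar with no program name should simply print the list of bundled example names (grep, wordcount, and so on). That makes it a quick sanity check, independent of the permission problem, that the jar itself is intact:

# Expect "An example program must be given as the first argument"
# followed by the valid program names.
bin/hadoop jar hadoop-examples-1.0.3.jar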

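
And following Marcos' recommendation to move to a Unix-based platform, the quickstart sequence from the original message should run cleanly there; these are just the thread's own commands replayed, with the final cat taken from the quickstart page cited above:

cd $HADOOP_INSTALL
mkdir input
cp conf/*.xml input
bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
cat output/*    # prints the matched "dfs..." strings with their counts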
Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Marcos Ortiz <ml...@uci.cu>.
On 09/04/2012 02:35 PM, Udayini Pendyala wrote:
> Hi Bejoy,
>
> Thanks for your response. I first started to install on Ubuntu Linux 
> and ran into a bunch of problems. So, I wanted to back off a bit and 
> try something simple first. Hence, my attempt to install on my Windows 
> 7 Laptop.
>
Well, if you tell to us the problems that you have in Ubuntu, we can 
give you a hand.
Michael Noll have great tutorials for this:

Running Hadoop on Ubuntu Linux (Single node cluster)
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

Running Hadoop on Ubuntu Linux (Multi node cluster)
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
>
> I am doing the "standalone" mode - as per the documentation (link in 
> my original email), I don't need ssh unless I am doing the distributed 
> mode. Is that not correct?
>
Yes, but I give you the same recommendation that Bejoy said to you: Use 
a Unix-based platform for Hadoop, it's more tested and have better 
performance than Windows.

Best wishes
>
> Thanks again for responding
> Udayini
>
>
> --- On *Tue, 9/4/12, Bejoy Ks /<be...@gmail.com>/* wrote:
>
>
>     From: Bejoy Ks <be...@gmail.com>
>     Subject: Re: Exception while running a Hadoop example on a
>     standalone install on Windows 7
>     To: user@hadoop.apache.org
>     Date: Tuesday, September 4, 2012, 11:11 AM
>
>     Hi Udayani
>
>     By default hadoop works well for linux and linux based OS. Since
>     you are on Windows you need to install and configure ssh using
>     cygwin before you start hadoop daemons.
>
>     On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala
>     <udayini_pendyala@yahoo.com
>     </m...@yahoo.com>> wrote:
>
>         Hi,
>
>
>         Following is a description of what I am trying to do and the
>         steps I followed.
>
>
>         GOAL:
>
>         a). Install Hadoop 1.0.3
>
>         b). Hadoop in a standalone (or local) mode
>
>         c). OS: Windows 7
>
>
>         STEPS FOLLOWED:
>
>         1.    1. I followed instructions from:
>         http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html.
>         Listing the steps I did -
>
>         a.I went to: http://hadoop.apache.org/core/releases.html.
>
>         b.I installed hadoop-1.0.3 by downloading
>         “hadoop-1.0.3.tar.gz” and unzipping/untarring the file.
>
>         c.I installed JDK 1.6 and set up JAVA_HOME to point to it.
>
>         d.I set up HADOOP_INSTALL to point to my Hadoop install
>         location. I updated my PATH variable to have $HADOOP_INSTALL/bin
>
>         e.After the above steps, I ran the command: “hadoop version”
>         and got the following information:
>
>         $ hadoop version
>
>         Hadoop 1.0.3
>
>         Subversion
>         https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0
>         -r 1335192
>
>         Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
>
>         From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be
>
>         2.      2. The standalone was very easy to install as
>         described above. Then, I tried to run a sample command as
>         given in:
>
>         http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local
>
>         Specifically, the steps followed were:
>
>         a.cd $HADOOP_INSTALL
>
>         b.mkdir input
>
>         c.cp conf/*.xml input
>
>         d.bin/hadoop jar hadoop-examples-1.0.3.jar grep input output
>         ‘dfs[a-z.]+’
>
>         and got the following error:
>
>         $ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output
>         'dfs[a-z.]+'
>
>         12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load
>         native-hadoop libra ry for your platform... using builtin-java
>         classes where applicable
>
>         12/09/03 15:41:57 ERROR security.UserGroupInformation:
>         PriviledgedActionExceptio n as:upendyal
>         cause:java.io.IOException: Failed to set permissions of path:
>         \tmp
>         \hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging
>         to 0700
>
>         java.io <http://java.io.IO>.IOException: Failed to set
>         permissions of path: \tmp\hadoop-upendyal\map
>         red\staging\upendyal-1075683580\.staging to 0700
>
>         at
>         org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
>
>         at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
>
>         at
>         org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSys
>         tem.java:509)
>
>         at
>         org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.jav
>         a:344)
>
>         at
>         org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:18
>         9)
>
>         at
>         org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmi
>         ssionFiles.java:116)
>
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
>
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Unknown Source)
>
>         at
>         org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInforma
>         tion.java:1121)
>
>         at
>         org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:8
>         50)
>
>         at
>         org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
>
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
>
>         at org.apache.hadoop.examples.Grep.run(Grep.java:69)
>
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>
>         at org.apache.hadoop.examples.Grep.main(Grep.java:93)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
>         at java.lang.reflect.Method.invoke(Unknown Source)
>
>         at
>         org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(Progra
>         mDriver.java:68)
>
>         at
>         org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>
>         at
>         org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
>         at java.lang.reflect.Method.invoke(Unknown Source)
>
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>         3.    3. I googled the problem and found the following links
>         but none of these suggestions helped.Most people seem to be
>         getting a resolution when they change the version of Hadoop.
>
>         a.http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E
>
>         b.http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837
>
>
>         Is this a problem in the version of Hadoop I selected OR am I
>         doing something wrong? I would appreciate any help with this.
>
>         Thanks
>
>         Udayini
>
>

-- 

Marcos Luis Ortíz Valmaseda
*Data Engineer && Sr. System Administrator at UCI*
about.me/marcosortiz <http://about.me/marcosortiz>
My Blog <http://marcosluis2186.posterous.com>
@marcosluis2186 <http://twitter.com/marcosluis2186>





10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS...
CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION

http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci

Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Marcos Ortiz <ml...@uci.cu>.
On 09/04/2012 02:35 PM, Udayini Pendyala wrote:
> Hi Bejoy,
>
> Thanks for your response. I first started to install on Ubuntu Linux 
> and ran into a bunch of problems. So, I wanted to back off a bit and 
> try something simple first. Hence, my attempt to install on my Windows 
> 7 Laptop.
>
Well, if you tell to us the problems that you have in Ubuntu, we can 
give you a hand.
Michael Noll have great tutorials for this:

Running Hadoop on Ubuntu Linux (Single node cluster)
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

Running Hadoop on Ubuntu Linux (Multi node cluster)
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
>
> I am doing the "standalone" mode - as per the documentation (link in 
> my original email), I don't need ssh unless I am doing the distributed 
> mode. Is that not correct?
>
Yes, but I give you the same recommendation that Bejoy said to you: Use 
a Unix-based platform for Hadoop, it's more tested and have better 
performance than Windows.

Best wishes
>
> Thanks again for responding
> Udayini
>
>
> --- On *Tue, 9/4/12, Bejoy Ks /<be...@gmail.com>/* wrote:
>
>
>     From: Bejoy Ks <be...@gmail.com>
>     Subject: Re: Exception while running a Hadoop example on a
>     standalone install on Windows 7
>     To: user@hadoop.apache.org
>     Date: Tuesday, September 4, 2012, 11:11 AM
>
>     Hi Udayani
>
>     By default hadoop works well for linux and linux based OS. Since
>     you are on Windows you need to install and configure ssh using
>     cygwin before you start hadoop daemons.
>
>     On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala
>     <udayini_pendyala@yahoo.com
>     </m...@yahoo.com>> wrote:
>
>         Hi,
>
>
>         Following is a description of what I am trying to do and the
>         steps I followed.
>
>
>         GOAL:
>
>         a). Install Hadoop 1.0.3
>
>         b). Hadoop in a standalone (or local) mode
>
>         c). OS: Windows 7
>
>
>         STEPS FOLLOWED:
>
>         1.    1. I followed instructions from:
>         http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html.
>         Listing the steps I did -
>
>         a.I went to: http://hadoop.apache.org/core/releases.html.
>
>         b.I installed hadoop-1.0.3 by downloading
>         “hadoop-1.0.3.tar.gz” and unzipping/untarring the file.
>
>         c.I installed JDK 1.6 and set up JAVA_HOME to point to it.
>
>         d.I set up HADOOP_INSTALL to point to my Hadoop install
>         location. I updated my PATH variable to have $HADOOP_INSTALL/bin
>
>         e.After the above steps, I ran the command: “hadoop version”
>         and got the following information:
>
>         $ hadoop version
>
>         Hadoop 1.0.3
>
>         Subversion
>         https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0
>         -r 1335192
>
>         Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
>
>         From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be
>
>         2.      2. The standalone was very easy to install as
>         described above. Then, I tried to run a sample command as
>         given in:
>
>         http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local
>
>         Specifically, the steps followed were:
>
>         a.cd $HADOOP_INSTALL
>
>         b.mkdir input
>
>         c.cp conf/*.xml input
>
>         d.bin/hadoop jar hadoop-examples-1.0.3.jar grep input output
>         ‘dfs[a-z.]+’
>
>         and got the following error:
>
>         $ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
>
>         12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load
>         native-hadoop library for your platform... using builtin-java
>         classes where applicable
>
>         12/09/03 15:41:57 ERROR security.UserGroupInformation:
>         PriviledgedActionException as:upendyal
>         cause:java.io.IOException: Failed to set permissions of path:
>         \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging
>         to 0700
>
>         java.io.IOException: Failed to set permissions of path:
>         \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging
>         to 0700
>
>         at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
>
>         at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
>
>         at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
>
>         at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
>
>         at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
>
>         at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
>
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
>
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Unknown Source)
>
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
>
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
>
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
>
>         at org.apache.hadoop.examples.Grep.run(Grep.java:69)
>
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>
>         at org.apache.hadoop.examples.Grep.main(Grep.java:93)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
>         at java.lang.reflect.Method.invoke(Unknown Source)
>
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>
>         at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
>         at java.lang.reflect.Method.invoke(Unknown Source)
>
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>         3. I googled the problem and found the following links,
>         but none of these suggestions helped. Most people seem to be
>         getting a resolution when they change the version of Hadoop.
>
>         a. http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E
>
>         b. http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837
>
>
>         Is this a problem in the version of Hadoop I selected OR am I
>         doing something wrong? I would appreciate any help with this.
>
>         Thanks
>
>         Udayini
>
>

-- 

Marcos Luis Ortíz Valmaseda
Data Engineer && Sr. System Administrator at UCI
about.me/marcosortiz
My Blog: http://marcosluis2186.posterous.com
@marcosluis2186 (http://twitter.com/marcosluis2186)






Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Udayini Pendyala <ud...@yahoo.com>.
Hi Bejoy,

Thanks for your response. I first started to install on Ubuntu Linux and ran into a bunch of problems. So, I wanted to back off a bit and try something simple first. Hence, my attempt to install on my Windows 7 Laptop.

I am doing the "standalone" mode - as per the documentation (link in my original email), I don't need ssh unless I am doing the distributed mode. Is that not correct?
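
In other words, my understanding is that standalone mode runs everything in
a single JVM through the local job runner, with no daemons and no ssh. Since
the shipped examples go through ToolRunner, I believe local mode can even be
forced explicitly with the generic -fs/-jt options (just a sketch, in case my
reading of the docs is wrong):

$ bin/hadoop jar hadoop-examples-1.0.3.jar grep -fs local -jt local input output 'dfs[a-z.]+'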

Thanks again for responding
Udayini


--- On Tue, 9/4/12, Bejoy Ks <be...@gmail.com> wrote:

From: Bejoy Ks <be...@gmail.com>
Subject: Re: Exception while running a Hadoop example on a standalone install on Windows 7
To: user@hadoop.apache.org
Date: Tuesday, September 4, 2012, 11:11 AM

Hi Udayani
By default hadoop works well on Linux and Linux-based OSes. Since you are on Windows, you need to install and configure ssh using Cygwin before you start the hadoop daemons.


On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala <ud...@yahoo.com> wrote:



Hi,
Following is a description of what I am trying to do and the steps I followed. 




GOAL:

a). Install Hadoop 1.0.3

b). Hadoop in a standalone (or local) mode

c). OS: Windows 7


STEPS FOLLOWED:

1. I followed instructions from:
http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html.
Listing the steps I did -

a. I went to: http://hadoop.apache.org/core/releases.html.

b. I installed hadoop-1.0.3 by downloading “hadoop-1.0.3.tar.gz” and
unzipping/untarring the file.

c. I installed JDK 1.6 and set up JAVA_HOME to point to it.

d. I set up HADOOP_INSTALL to point to my Hadoop install location. I updated
my PATH variable to have $HADOOP_INSTALL/bin

e. After the above steps, I ran the command: “hadoop version” and got the
following information:

$ hadoop version

Hadoop 1.0.3

Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0
-r 1335192

Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012

From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be

2. The standalone was very easy to install as described above. Then, I tried
to run a sample command as given in:

http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local

Specifically, the steps followed were:

a. cd $HADOOP_INSTALL

b. mkdir input

c. cp conf/*.xml input

d. bin/hadoop jar hadoop-examples-1.0.3.jar grep input output ‘dfs[a-z.]+’

and got the following error:

$ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'

12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable

12/09/03 15:41:57 ERROR security.UserGroupInformation:
PriviledgedActionException as:upendyal cause:java.io.IOException: Failed to
set permissions of path:
\tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700

java.io.IOException: Failed to set permissions of path:
\tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700

at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
at org.apache.hadoop.examples.Grep.run(Grep.java:69)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.Grep.main(Grep.java:93)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

3. I googled the problem and found the following links but none of these
suggestions helped. Most people seem to be getting a resolution when they
change the version of Hadoop.

a. http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E

b. http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837

Is this a problem in the version of Hadoop I selected OR am I doing something
wrong? I would appreciate any help with this.

Thanks

Udayini




Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Bejoy Ks <be...@gmail.com>.
Hi Udayani

By default hadoop works well on Linux and Linux-based OSes. Since you are on
Windows, you need to install and configure ssh using Cygwin before you start
the hadoop daemons; a minimal sketch of that setup is below.
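
A minimal sketch, assuming the Cygwin openssh package is already installed
(service and file names can differ between Cygwin versions, so treat this as
a starting point rather than exact steps):

$ ssh-host-config -y                     # configure the sshd service
$ net start sshd                         # start it (service name may vary)
$ ssh-keygen -t rsa -P ""                # passphrase-less key for localhost
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ ssh localhost                          # should log in without a password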

On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala <udayini_pendyala@yahoo.com> wrote:

> Hi,
>
>
> Following is a description of what I am trying to do and the steps I
> followed.
>
>
> GOAL:
>
> a). Install Hadoop 1.0.3
>
> b). Hadoop in a standalone (or local) mode
>
> c). OS: Windows 7
>
>
> STEPS FOLLOWED:
>
> 1. I followed instructions from:
> http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html.
> Listing the steps I did -
>
> a.       I went to: http://hadoop.apache.org/core/releases.html.
>
> b.      I installed hadoop-1.0.3 by downloading “hadoop-1.0.3.tar.gz” and
> unzipping/untarring the file.
>
> c.       I installed JDK 1.6 and set up JAVA_HOME to point to it.
>
> d.      I set up HADOOP_INSTALL to point to my Hadoop install location. I
> updated my PATH variable to have $HADOOP_INSTALL/bin
>
> e.      After the above steps, I ran the command: “hadoop version” and
> got the following information:
>
> $ hadoop version
>
> Hadoop 1.0.3
>
> Subversion
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r
> 1335192
>
> Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
>
> From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be
>
>
>
> 2. The standalone was very easy to install as described above.
> Then, I tried to run a sample command as given in:
>
> http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local
>
> Specifically, the steps followed were:
>
> a.       cd $HADOOP_INSTALL
>
> b.      mkdir input
>
> c.       cp conf/*.xml input
>
> d.      bin/hadoop jar hadoop-examples-1.0.3.jar grep input output
> ‘dfs[a-z.]+’
>
> and got the following error:
>
>
>
> $ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
>
> 12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
> 12/09/03 15:41:57 ERROR security.UserGroupInformation:
> PriviledgedActionException as:upendyal cause:java.io.IOException: Failed
> to set permissions of path:
> \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
>
> java.io.IOException: Failed to set permissions of path:
> \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
>
> at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
>
> at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
>
> at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
>
> at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
>
> at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
>
> at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
>
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
>
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Unknown Source)
>
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
>
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
>
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
>
> at org.apache.hadoop.examples.Grep.run(Grep.java:69)
>
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>
> at org.apache.hadoop.examples.Grep.main(Grep.java:93)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
> at java.lang.reflect.Method.invoke(Unknown Source)
>
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>
> at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
> at java.lang.reflect.Method.invoke(Unknown Source)
>
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>
>
> 3. I googled the problem and found the following links but none
> of these suggestions helped. Most people seem to be getting a resolution
> when they change the version of Hadoop.
>
> a.
> http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E
>
> b.
> http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837
>
>
> Is this a problem in the version of Hadoop I selected OR am I doing
> something wrong? I would appreciate any help with this.
>
> Thanks
>
> Udayini
>

Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Visioner Sadak <vi...@gmail.com>.
Hadoop 1.0.3 will give you a lot of problems with Windows and Cygwin because
of the complexities of Cygwin configuration paths, so it is better to
downgrade to a lower version for development and testing on Windows (I
downgraded to 0.22.0) and use 1.0.3 in production on Linux servers. I will
be attaching a tutorial for Hadoop installation on Windows 7 with Cygwin
soon; a rough sketch of the downgrade is below.
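
Roughly, the downgrade looks like this (the mirror path and the examples jar
name are from memory, so verify them against the release you actually
download):

$ wget http://archive.apache.org/dist/hadoop/core/hadoop-0.22.0/hadoop-0.22.0.tar.gz
$ tar xzf hadoop-0.22.0.tar.gz
$ cd hadoop-0.22.0
$ bin/hadoop version
$ bin/hadoop jar hadoop-mapred-examples-0.22.0.jar grep input output 'dfs[a-z.]+'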

On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala <udayini_pendyala@yahoo.com> wrote:

>   Hi,
>
>
> Following is a description of what I am trying to do and the steps I
> followed.
>
>
> GOAL:
>
> a). Install Hadoop 1.0.3
>
> b). Hadoop in a standalone (or local) mode
>
> c). OS: Windows 7
>
>
> STEPS FOLLOWED:
>
> 1. I followed instructions from:
> http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html.
> Listing the steps I did -
>
> a.       I went to: http://hadoop.apache.org/core/releases.html.
>
> b.      I installed hadoop-1.0.3 by downloading “hadoop-1.0.3.tar.gz” and
> unzipping/untarring the file.
>
> c.       I installed JDK 1.6 and set up JAVA_HOME to point to it.
>
> d.      I set up HADOOP_INSTALL to point to my Hadoop install location. I
> updated my PATH variable to have $HADOOP_INSTALL/bin
>
> e.      After the above steps, I ran the command: “hadoop version” and
> got the following information:
>
> $ hadoop version
>
> Hadoop 1.0.3
>
> Subversion
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r
> 1335192
>
> Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
>
> From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be
>
>
>
> 2. The standalone was very easy to install as described above.
> Then, I tried to run a sample command as given in:
>
> http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local
>
> Specifically, the steps followed were:
>
> a.       cd $HADOOP_INSTALL
>
> b.      mkdir input
>
> c.       cp conf/*.xml input
>
> d.      bin/hadoop jar hadoop-examples-1.0.3.jar grep input output
> ‘dfs[a-z.]+’
>
> and got the following error:
>
>
>
> $ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
>
> 12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
> 12/09/03 15:41:57 ERROR security.UserGroupInformation:
> PriviledgedActionException as:upendyal cause:java.io.IOException: Failed
> to set permissions of path:
> \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
>
> java.io.IOException: Failed to set permissions of path:
> \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
>
> at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
>
> at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
>
> at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
>
> at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
>
> at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
>
> at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
>
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
>
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Unknown Source)
>
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
>
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
>
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
>
> at org.apache.hadoop.examples.Grep.run(Grep.java:69)
>
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>
> at org.apache.hadoop.examples.Grep.main(Grep.java:93)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
> at java.lang.reflect.Method.invoke(Unknown Source)
>
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>
> at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>
> at java.lang.reflect.Method.invoke(Unknown Source)
>
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>
>
> 3. I googled the problem and found the following links but none
> of these suggestions helped. Most people seem to be getting a resolution
> when they change the version of Hadoop.
>
> a.
> http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E
>
> b.
> http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837
>
>
> Is this a problem in the version of Hadoop I selected OR am I doing
> something wrong? I would appreciate any help with this.
>
> Thanks
>
> Udayini
>

Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Visioner Sadak <vi...@gmail.com>.
Hadoop 1.0.3 will give you lot of problems with windows and cygwin, becoz
of complexities of cygwin configuration paths,so better downgrade to lower
versions for development and testing purpose on windows(i downgraded to
0.22.0)  and you can  use 1.0.3 on production with linux servers...I will
be attaching a tutorial for hadoop installation on windows 7 with cygwin
soon

On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala <udayini_pendyala@yahoo.com
> wrote:

>   Hi,
>
>
> Following is a description of what I am trying to do and the steps I
> followed.
>
>
> GOAL:
>
> a). Install Hadoop 1.0.3
>
> b). Hadoop in a standalone (or local) mode
>
> c). OS: Windows 7
>
>
> STEPS FOLLOWED:
>
> 1.   I followed instructions from:
> http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html.
> Listing the steps I did -
>
> a.       I went to: http://hadoop.apache.org/core/releases.html.
>
> b.      I installed hadoop-1.0.3 by downloading “hadoop-1.0.3.tar.gz” and
> unzipping/untarring the file.
>
> c.       I installed JDK 1.6 and set up JAVA_HOME to point to it.
>
> d.      I set up HADOOP_INSTALL to point to my Hadoop install location. I
> updated my PATH variable to have $HADOOP_INSTALL/bin
>
> e.      After the above steps, I ran the command: “hadoop version” and
> got the following information:
>
> $ hadoop version
>
> Hadoop 1.0.3
>
> Subversion
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r
> 1335192
>
> Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
>
> From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be
>
>
>
> 2.   The standalone was very easy to install as described above.
> Then, I tried to run a sample command as given in:
>
> http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local
>
> Specifically, the steps followed were:
>
> a.       cd $HADOOP_INSTALL
>
> b.      mkdir input
>
> c.       cp conf/*.xml input
>
> d.      bin/hadoop jar hadoop-examples-1.0.3.jar grep input output
> ‘dfs[a-z.]+’
>
> and got the following error:
>
>
>
> $ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
>
> 12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
> 12/09/03 15:41:57 ERROR security.UserGroupInformation:
> PriviledgedActionException as:upendyal cause:java.io.IOException: Failed
> to set permissions of path:
> \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
>
> java.io.IOException: Failed to set permissions of path:
> \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
>
> at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
> at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
> at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
> at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
> at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
> at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Unknown Source)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
> at org.apache.hadoop.examples.Grep.run(Grep.java:69)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.examples.Grep.main(Grep.java:93)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>
>
> 3.   I googled the problem and found the following links but none
> of these suggestions helped. Most people seem to be getting a resolution
> when they change the version of Hadoop.
>
> a.
> http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E
>
> b.
> http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837
>
>
> Is this a problem in the version of Hadoop I selected OR am I doing
> something wrong? I would appreciate any help with this.
>
> Thanks
>
> Udayini
>
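
[Note: the IOException above is raised by
org.apache.hadoop.fs.FileUtil.checkReturnValue(), the first frame of the
stack trace, which treats any failed POSIX-style permission change as
fatal; on NTFS that change always fails. Besides downgrading, a workaround
that has circulated on this list is to rebuild hadoop-core with the check
relaxed. Below is a minimal sketch of the patched method, assuming the
1.0.3 source tree and FileUtil's existing LOG field. This is an unofficial
hack for local development, not a fix, and should never reach a real
cluster:

    // org.apache.hadoop.fs.FileUtil: relaxed check for local Windows
    // development only. The original throws an IOException when the
    // permission call returns false; this version just logs a warning,
    // which standalone mode can tolerate.
    private static void checkReturnValue(boolean rv, File p,
                                         FsPermission permission)
        throws IOException {
      if (!rv) {
        LOG.warn("Failed to set permissions of path: " + p + " to " +
                 String.format("%04o", permission.toShort()));
      }
    }
]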

Re: Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Bejoy Ks <be...@gmail.com>.
Hi Udayini

By default Hadoop works well on Linux and Linux-based OSes. Since you are on
Windows, you need to install and configure ssh using Cygwin before you start
the Hadoop daemons.

On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala <udayini_pendyala@yahoo.com
> wrote:

> Hi,
>
>
> Following is a description of what I am trying to do and the steps I
> followed.
>
>
> GOAL:
>
> a). Install Hadoop 1.0.3
>
> b). Hadoop in a standalone (or local) mode
>
> c). OS: Windows 7
>
>
> STEPS FOLLOWED:
>
> 1.   I followed instructions from:
> http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html.
> Listing the steps I did -
>
> a.       I went to: http://hadoop.apache.org/core/releases.html.
>
> b.      I installed hadoop-1.0.3 by downloading “hadoop-1.0.3.tar.gz” and
> unzipping/untarring the file.
>
> c.       I installed JDK 1.6 and set up JAVA_HOME to point to it.
>
> d.      I set up HADOOP_INSTALL to point to my Hadoop install location. I
> updated my PATH variable to have $HADOOP_INSTALL/bin
>
> e.      After the above steps, I ran the command: “hadoop version” and
> got the following information:
>
> $ hadoop version
>
> Hadoop 1.0.3
>
> Subversion
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r
> 1335192
>
> Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
>
> From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be
>
>
>
> 2.   The standalone was very easy to install as described above.
> Then, I tried to run a sample command as given in:
>
> http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local
>
> Specifically, the steps followed were:
>
> a.       cd $HADOOP_INSTALL
>
> b.      mkdir input
>
> c.       cp conf/*.xml input
>
> d.      bin/hadoop jar hadoop-examples-1.0.3.jar grep input output
> ‘dfs[a-z.]+’
>
> and got the following error:
>
>
>
> $ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'
>
> 12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
> 12/09/03 15:41:57 ERROR security.UserGroupInformation:
> PriviledgedActionException as:upendyal cause:java.io.IOException: Failed
> to set permissions of path:
> \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
>
> java.io.IOException: Failed to set permissions of path:
> \tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700
>
> at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
> at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
> at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
> at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
> at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
> at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Unknown Source)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
> at org.apache.hadoop.examples.Grep.run(Grep.java:69)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.examples.Grep.main(Grep.java:93)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>
>
> 3.   I googled the problem and found the following links but none
> of these suggestions helped. Most people seem to be getting a resolution
> when they change the version of Hadoop.
>
> a.
> http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E
>
> b.
> http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837
>
>
> Is this a problem in the version of Hadoop I selected OR am I doing
> something wrong? I would appreciate any help with this.
>
> Thanks
>
> Udayini
>
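
[Note: standalone (local) mode does not start any daemons, so ssh only
matters once you move on to pseudo-distributed mode. For reference, the
usual passwordless-ssh setup inside the Cygwin shell looks like the sketch
below, assuming the openssh package is installed and sshd is running:

    $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
    $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
    $ ssh localhost    # should now log in without prompting for a password
]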

Exception while running a Hadoop example on a standalone install on Windows 7

Posted by Udayini Pendyala <ud...@yahoo.com>.

Hi,

Following is a description of what I am trying to do and the steps I followed.


GOAL:

a). Install Hadoop 1.0.3

b). Hadoop in a standalone (or local) mode

c). OS: Windows 7


STEPS FOLLOWED:

1.   I followed instructions from:
http://www.oreillynet.com/pub/a/other-programming/excerpts/hadoop-tdg/installing-apache-hadoop.html.
Listing the steps I did -

a.       I went to: http://hadoop.apache.org/core/releases.html.

b.      I installed hadoop-1.0.3 by downloading “hadoop-1.0.3.tar.gz” and
unzipping/untarring the file.

c.       I installed JDK 1.6 and set up JAVA_HOME to point to it.

d.      I set up HADOOP_INSTALL to point to my Hadoop install location. I
updated my PATH variable to have $HADOOP_INSTALL/bin

e.      After the above steps, I ran the command: “hadoop version” and got
the following information:

$ hadoop version

Hadoop 1.0.3

Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192

Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012

From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be

2.   The standalone was very easy to install as described above. Then, I
tried to run a sample command as given in:

http://hadoop.apache.org/common/docs/r0.17.2/quickstart.html#Local

Specifically, the steps followed were:

a.       cd $HADOOP_INSTALL

b.      mkdir input

c.       cp conf/*.xml input

d.      bin/hadoop jar hadoop-examples-1.0.3.jar grep input output ‘dfs[a-z.]+’

and got the following error:

$ bin/hadoop jar hadoop-examples-1.0.3.jar grep input output 'dfs[a-z.]+'

12/09/03 15:41:57 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable

12/09/03 15:41:57 ERROR security.UserGroupInformation: PriviledgedActionException
as:upendyal cause:java.io.IOException: Failed to set permissions of path:
\tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700

java.io.IOException: Failed to set permissions of path:
\tmp\hadoop-upendyal\mapred\staging\upendyal-1075683580\.staging to 0700

at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
at org.apache.hadoop.examples.Grep.run(Grep.java:69)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.Grep.main(Grep.java:93)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

3.   I googled the problem and found the following links but none of these
suggestions helped. Most people seem to be getting a resolution when they
change the version of Hadoop.

a.       http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201105.mbox/%3CBANLkTin-8+z8uYBTdmaa4cvxz4JzM14VfA@mail.gmail.com%3E

b.      http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/25837


Is this a problem in the version of Hadoop I selected OR am I doing
something wrong? I would appreciate any help with this.

Thanks

Udayini
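
[Note: for comparison, on a working local-mode install the same grep job
exits cleanly and writes its matches into the output directory. With the
default conf/*.xml files used as input, inspecting the result usually
looks like this; the exact counts depend on the contents of the copied
conf files:

    $ cat output/*
    1       dfsadmin
]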


Re: knowing the nodes on which reduce tasks will run

Posted by Steve Loughran <st...@hortonworks.com>.
On 3 September 2012 15:19, Abhay Ratnaparkhi <ab...@gmail.com> wrote:

> Hello,
>
> How can one get to know the nodes on which reduce tasks will run?
>
> One of my job is running and it's completing all the map tasks.
> My map tasks write lots of intermediate data. The intermediate directory
> is getting full on all the nodes.
> If the reduce task take any node from cluster then It'll try to copy the
> data to same disk and it'll eventually fail due to Disk space related
> exceptions.
>
>
you could always set up specific partitions for intermediate data, though
you get better bandwidth by striping the data across all disks, and better
flexibility by sharing the same partition.

There's also a property that controls how much space is set aside for
non-DFS use: increase dfs.datanode.du.reserved and the datanodes will leave
more free space around.

see: http://wiki.apache.org/hadoop/DiskSetup
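
[Note: two properties are relevant here. The directories used for
intermediate map output are listed in mapred.local.dir in mapred-site.xml,
which is how you would point intermediate data at dedicated partitions.
The reservation Steve mentions goes in hdfs-site.xml on the datanodes and
takes a value in bytes per volume; a sketch reserving 10 GB for non-DFS
use (pick a figure that matches your disks, and restart the datanodes to
apply it):

    <property>
      <name>dfs.datanode.du.reserved</name>
      <value>10737418240</value>
    </property>
]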

Re: knowing the nodes on which reduce tasks will run

Posted by Bejoy Ks <be...@gmail.com>.
Hi Abhay

You need to change this value and restart the TaskTracker before you submit
your job. Modifying the value mid-run won't affect jobs that are already
running.

On Mon, Sep 3, 2012 at 9:06 PM, Abhay Ratnaparkhi <
abhay.ratnaparkhi@gmail.com> wrote:

> How can I set  'mapred.tasktracker.reduce.tasks.maximum'  to "0" in a
> running tasktracker?
> Seems that I need to restart the tasktracker and in that case I'll lose
> the output of map tasks run by that particular tasktracker.
>
> Can I change   'mapred.tasktracker.reduce.tasks.maximum'  to "0"  without
> restarting tasktracker?
>
> ~Abhay
>
>
> On Mon, Sep 3, 2012 at 8:53 PM, Bejoy Ks <be...@gmail.com> wrote:
>
>> HI Abhay
>>
>> The TaskTrackers on which the reduce tasks are triggered are chosen at
>> random based on reduce slot availability. So if you don't want the
>> reduce tasks to be scheduled on some particular nodes you need to set
>> 'mapred.tasktracker.reduce.tasks.maximum' on those nodes to 0. The
>> bottleneck here is that this property is not a job-level one; you need to
>> set it at the cluster level.
>>
>> A cleaner approach will be to configure each of your nodes with the right
>> number of map and reduce slots based on the resources available on each
>> machine.
>>
>>
>> On Mon, Sep 3, 2012 at 7:49 PM, Abhay Ratnaparkhi <
>> abhay.ratnaparkhi@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> How can one get to know the nodes on which reduce tasks will run?
>>>
>>> One of my job is running and it's completing all the map tasks.
>>> My map tasks write lots of intermediate data. The intermediate directory
>>> is getting full on all the nodes.
>>> If the reduce task take any node from cluster then It'll try to copy the
>>> data to same disk and it'll eventually fail due to Disk space related
>>> exceptions.
>>>
>>> I have added few more tasktracker nodes in the cluster and now want to
>>> run reducer on new nodes only.
>>> Is it possible to choose a node on which the reducer will run? What's
>>> the algorithm hadoop uses to get a new node to run reducer?
>>>
>>> Thanks in advance.
>>>
>>> Bye
>>> Abhay
>>>
>>
>>
>
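
[Note: the per-node setting Bejoy describes goes in conf/mapred-site.xml
on each TaskTracker that should not run reducers; a sketch, and as he says
above, the TaskTracker has to be restarted before it takes effect:

    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>0</value>
    </property>
]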

Re: knowing the nodes on which reduce tasks will run

Posted by Narasingu Ramesh <ra...@gmail.com>.
Hi Abhay,
            The NameNode holds the addresses of all the data nodes, and
MapReduce does the data processing. First the data set is put into the HDFS
filesystem, and then the hadoop jar file is run. The map tasks read the
input files, and their output is shuffled, sorted and grouped together.
Once the map tasks are complete, the reduce tasks start and merge the sorted
data, while the JobTracker and TaskTrackers coordinate each job's tasks.

Thanks & Regards,
Ramesh.Narasingu

On Mon, Sep 3, 2012 at 9:30 PM, Abhay Ratnaparkhi <
abhay.ratnaparkhi@gmail.com> wrote:

> All of my map tasks are about to complete and there is not much processing
> to be done in reducer.
> The job is running from a week so I don't want the job to fail. Any other
> suggestion to tackle this is welcome.
>
> ~Abhay
>
> On Mon, Sep 3, 2012 at 9:26 PM, Hemanth Yamijala <
> yhemanth@thoughtworks.com> wrote:
>
>> Hi,
>>
>> You are right that a change to mapred.tasktracker.reduce.tasks.maximum
>> will require a restart of the tasktrackers. AFAIK, there is no way of
>> modifying this property without restarting.
>>
>> On a different note, could you see if the amount of intermediate data can
>> be reduced using a combiner, or some other form of local aggregation ?
>>
>> Thanks
>> hemanth
>>
>>
>> On Mon, Sep 3, 2012 at 9:06 PM, Abhay Ratnaparkhi <
>> abhay.ratnaparkhi@gmail.com> wrote:
>>
>>> How can I set  'mapred.tasktracker.reduce.tasks.maximum'  to "0" in a
>>> running tasktracker?
>>> Seems that I need to restart the tasktracker and in that case I'll lose
>>> the output of map tasks run by that particular tasktracker.
>>>
>>> Can I change   'mapred.tasktracker.reduce.tasks.maximum'  to "0"
>>> without restarting tasktracker?
>>>
>>> ~Abhay
>>>
>>>
>>> On Mon, Sep 3, 2012 at 8:53 PM, Bejoy Ks <be...@gmail.com> wrote:
>>>
>>>> HI Abhay
>>>>
>>>> The TaskTrackers on which the reduce tasks are triggered are chosen at
>>>> random based on reduce slot availability. So if you don't want the
>>>> reduce tasks to be scheduled on some particular nodes you need to set
>>>> 'mapred.tasktracker.reduce.tasks.maximum' on those nodes to 0. The
>>>> bottleneck here is that this property is not a job-level one; you need to
>>>> set it at the cluster level.
>>>>
>>>> A cleaner approach will be to configure each of your nodes with the
>>>> right number of map and reduce slots based on the resources available on
>>>> each machine.
>>>>
>>>>
>>>> On Mon, Sep 3, 2012 at 7:49 PM, Abhay Ratnaparkhi <
>>>> abhay.ratnaparkhi@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> How can one get to know the nodes on which reduce tasks will run?
>>>>>
>>>>> One of my job is running and it's completing all the map tasks.
>>>>> My map tasks write lots of intermediate data. The intermediate
>>>>> directory is getting full on all the nodes.
>>>>> If the reduce task take any node from cluster then It'll try to copy
>>>>> the data to same disk and it'll eventually fail due to Disk space related
>>>>> exceptions.
>>>>>
>>>>> I have added few more tasktracker nodes in the cluster and now want to
>>>>> run reducer on new nodes only.
>>>>> Is it possible to choose a node on which the reducer will run? What's
>>>>> the algorithm hadoop uses to get a new node to run reducer?
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> Bye
>>>>> Abhay
>>>>>
>>>>
>>>>
>>>
>>
>

Re: knowing the nodes on which reduce tasks will run

Posted by Abhay Ratnaparkhi <ab...@gmail.com>.
All of my map tasks are about to complete and there is not much processing
to be done in the reducer.
The job has been running for a week so I don't want it to fail. Any other
suggestions to tackle this are welcome.

~Abhay

On Mon, Sep 3, 2012 at 9:26 PM, Hemanth Yamijala
<yh...@thoughtworks.com> wrote:

> Hi,
>
> You are right that a change to mapred.tasktracker.reduce.tasks.maximum
> will require a restart of the tasktrackers. AFAIK, there is no way of
> modifying this property without restarting.
>
> On a different note, could you see if the amount of intermediate data can
> be reduced using a combiner, or some other form of local aggregation ?
>
> Thanks
> hemanth
>
>
> On Mon, Sep 3, 2012 at 9:06 PM, Abhay Ratnaparkhi <
> abhay.ratnaparkhi@gmail.com> wrote:
>
>> How can I set  'mapred.tasktracker.reduce.tasks.maximum'  to "0" in a
>> running tasktracker?
>> Seems that I need to restart the tasktracker and in that case I'll loose
>> the output of map tasks by particular tasktracker.
>>
>> Can I change   'mapred.tasktracker.reduce.tasks.maximum'  to "0"  without
>> restarting tasktracker?
>>
>> ~Abhay
>>
>>
>> On Mon, Sep 3, 2012 at 8:53 PM, Bejoy Ks <be...@gmail.com> wrote:
>>
>>> HI Abhay
>>>
>>> The TaskTrackers on which the reduce tasks are triggered is chosen in
>>> random based on the reduce slot availability. So if you don't need the
>>> reduce tasks to be scheduled on some particular nodes you need to set
>>> 'mapred.tasktracker.reduce.tasks.maximum' on those nodes to 0. The
>>> bottleneck here is that this property is not a job level one you need to
>>> set it on a cluster level.
>>>
>>> A cleaner approach will be to configure each of your nodes with the
>>> right number of map and reduce slots based on the resources available on
>>> each machine.
>>>
>>>
>>> On Mon, Sep 3, 2012 at 7:49 PM, Abhay Ratnaparkhi <
>>> abhay.ratnaparkhi@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> How can one get to know the nodes on which reduce tasks will run?
>>>>
>>>> One of my job is running and it's completing all the map tasks.
>>>> My map tasks write lots of intermediate data. The intermediate
>>>> directory is getting full on all the nodes.
>>>> If the reduce task take any node from cluster then It'll try to copy
>>>> the data to same disk and it'll eventually fail due to Disk space related
>>>> exceptions.
>>>>
>>>> I have added few more tasktracker nodes in the cluster and now want to
>>>> run reducer on new nodes only.
>>>> Is it possible to choose a node on which the reducer will run? What's
>>>> the algorithm hadoop uses to get a new node to run reducer?
>>>>
>>>> Thanks in advance.
>>>>
>>>> Bye
>>>> Abhay
>>>>
>>>
>>>
>>
>

Re: knowing the nodes on which reduce tasks will run

Posted by Hemanth Yamijala <yh...@thoughtworks.com>.
Hi,

You are right that a change to mapred.tasktracker.reduce.tasks.maximum will
require a restart of the tasktrackers. AFAIK, there is no way of modifying
this property without restarting.

On a different note, could you see if the amount of intermediate data can
be reduced using a combiner, or some other form of local aggregation?
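
As a concrete sketch (the word-count logic below is only an illustration,
not your job): if the reduce function is associative and commutative, such
as a sum, the reducer class can simply be registered a second time as the
combiner, and each map task will then pre-aggregate its own output before
it is spilled to disk and shuffled.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountWithCombiner {

  public static class TokenMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // Emit (token, 1) for every whitespace-separated token.
      for (String token : value.toString().split("\\s+")) {
        if (token.isEmpty()) continue;
        word.set(token);
        context.write(word, ONE);
      }
    }
  }

  // Sums the counts for a key. Because summing is associative and
  // commutative, the same class is safe to reuse as the combiner.
  public static class SumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values,
        Context context) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "wordcount-with-combiner");
    job.setJarByClass(WordCountWithCombiner.class);
    job.setMapperClass(TokenMapper.class);
    job.setCombinerClass(SumReducer.class);  // the local aggregation step
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

With the combiner registered, each map task writes far fewer intermediate
records to local disk, which is exactly where your space pressure is.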

Thanks
hemanth

On Mon, Sep 3, 2012 at 9:06 PM, Abhay Ratnaparkhi <
abhay.ratnaparkhi@gmail.com> wrote:

> How can I set  'mapred.tasktracker.reduce.tasks.maximum'  to "0" in a
> running tasktracker?
> Seems that I need to restart the tasktracker and in that case I'll loose
> the output of map tasks by particular tasktracker.
>
> Can I change   'mapred.tasktracker.reduce.tasks.maximum'  to "0"  without
> restarting tasktracker?
>
> ~Abhay
>
>
> On Mon, Sep 3, 2012 at 8:53 PM, Bejoy Ks <be...@gmail.com> wrote:
>
>> HI Abhay
>>
>> The TaskTrackers on which the reduce tasks are triggered is chosen in
>> random based on the reduce slot availability. So if you don't need the
>> reduce tasks to be scheduled on some particular nodes you need to set
>> 'mapred.tasktracker.reduce.tasks.maximum' on those nodes to 0. The
>> bottleneck here is that this property is not a job level one you need to
>> set it on a cluster level.
>>
>> A cleaner approach will be to configure each of your nodes with the right
>> number of map and reduce slots based on the resources available on each
>> machine.
>>
>>
>> On Mon, Sep 3, 2012 at 7:49 PM, Abhay Ratnaparkhi <
>> abhay.ratnaparkhi@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> How can one get to know the nodes on which reduce tasks will run?
>>>
>>> One of my job is running and it's completing all the map tasks.
>>> My map tasks write lots of intermediate data. The intermediate directory
>>> is getting full on all the nodes.
>>> If the reduce task take any node from cluster then It'll try to copy the
>>> data to same disk and it'll eventually fail due to Disk space related
>>> exceptions.
>>>
>>> I have added few more tasktracker nodes in the cluster and now want to
>>> run reducer on new nodes only.
>>> Is it possible to choose a node on which the reducer will run? What's
>>> the algorithm hadoop uses to get a new node to run reducer?
>>>
>>> Thanks in advance.
>>>
>>> Bye
>>> Abhay
>>>
>>
>>
>

Re: knowing the nodes on which reduce tasks will run

Posted by Bejoy Ks <be...@gmail.com>.
Hi Abhay

You need to change this value and restart the TaskTrackers before you
submit your job. Modifying this value mid-flight won't affect jobs that are
already running.
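
If it helps, restarting the TaskTracker on one node looks something like
this with a standard tarball install (paths are illustrative; and as noted
below, the map outputs held on that node are lost across a restart):

$HADOOP_HOME/bin/hadoop-daemon.sh stop tasktracker
$HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker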

On Mon, Sep 3, 2012 at 9:06 PM, Abhay Ratnaparkhi <
abhay.ratnaparkhi@gmail.com> wrote:

> How can I set  'mapred.tasktracker.reduce.tasks.maximum'  to "0" in a
> running tasktracker?
> Seems that I need to restart the tasktracker and in that case I'll loose
> the output of map tasks by particular tasktracker.
>
> Can I change   'mapred.tasktracker.reduce.tasks.maximum'  to "0"  without
> restarting tasktracker?
>
> ~Abhay
>
>
> On Mon, Sep 3, 2012 at 8:53 PM, Bejoy Ks <be...@gmail.com> wrote:
>
>> HI Abhay
>>
>> The TaskTrackers on which the reduce tasks are triggered is chosen in
>> random based on the reduce slot availability. So if you don't need the
>> reduce tasks to be scheduled on some particular nodes you need to set
>> 'mapred.tasktracker.reduce.tasks.maximum' on those nodes to 0. The
>> bottleneck here is that this property is not a job level one you need to
>> set it on a cluster level.
>>
>> A cleaner approach will be to configure each of your nodes with the right
>> number of map and reduce slots based on the resources available on each
>> machine.
>>
>>
>> On Mon, Sep 3, 2012 at 7:49 PM, Abhay Ratnaparkhi <
>> abhay.ratnaparkhi@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> How can one get to know the nodes on which reduce tasks will run?
>>>
>>> One of my job is running and it's completing all the map tasks.
>>> My map tasks write lots of intermediate data. The intermediate directory
>>> is getting full on all the nodes.
>>> If the reduce task take any node from cluster then It'll try to copy the
>>> data to same disk and it'll eventually fail due to Disk space related
>>> exceptions.
>>>
>>> I have added few more tasktracker nodes in the cluster and now want to
>>> run reducer on new nodes only.
>>> Is it possible to choose a node on which the reducer will run? What's
>>> the algorithm hadoop uses to get a new node to run reducer?
>>>
>>> Thanks in advance.
>>>
>>> Bye
>>> Abhay
>>>
>>
>>
>

Re: knowing the nodes on which reduce tasks will run

Posted by Abhay Ratnaparkhi <ab...@gmail.com>.
How can I set 'mapred.tasktracker.reduce.tasks.maximum' to "0" in a
running tasktracker?
It seems that I need to restart the tasktracker, and in that case I'll lose
the map task output held by that particular tasktracker.

Can I change 'mapred.tasktracker.reduce.tasks.maximum' to "0" without
restarting the tasktracker?

~Abhay

On Mon, Sep 3, 2012 at 8:53 PM, Bejoy Ks <be...@gmail.com> wrote:

> HI Abhay
>
> The TaskTrackers on which the reduce tasks are triggered is chosen in
> random based on the reduce slot availability. So if you don't need the
> reduce tasks to be scheduled on some particular nodes you need to set
> 'mapred.tasktracker.reduce.tasks.maximum' on those nodes to 0. The
> bottleneck here is that this property is not a job level one you need to
> set it on a cluster level.
>
> A cleaner approach will be to configure each of your nodes with the right
> number of map and reduce slots based on the resources available on each
> machine.
>
>
> On Mon, Sep 3, 2012 at 7:49 PM, Abhay Ratnaparkhi <
> abhay.ratnaparkhi@gmail.com> wrote:
>
>> Hello,
>>
>> How can one get to know the nodes on which reduce tasks will run?
>>
>> One of my job is running and it's completing all the map tasks.
>> My map tasks write lots of intermediate data. The intermediate directory
>> is getting full on all the nodes.
>> If the reduce task take any node from cluster then It'll try to copy the
>> data to same disk and it'll eventually fail due to Disk space related
>> exceptions.
>>
>> I have added few more tasktracker nodes in the cluster and now want to
>> run reducer on new nodes only.
>> Is it possible to choose a node on which the reducer will run? What's the
>> algorithm hadoop uses to get a new node to run reducer?
>>
>> Thanks in advance.
>>
>> Bye
>> Abhay
>>
>
>

Re: knowing the nodes on which reduce tasks will run

Posted by Bejoy Ks <be...@gmail.com>.
Hi Abhay

The TaskTrackers on which the reduce tasks are triggered are chosen at
random based on reduce slot availability. So if you don't want the reduce
tasks to be scheduled on some particular nodes, you need to set
'mapred.tasktracker.reduce.tasks.maximum' to 0 on those nodes. The
bottleneck here is that this property is not a job-level one; you need to
set it at the cluster level.

A cleaner approach would be to configure each of your nodes with the right
number of map and reduce slots based on the resources available on each
machine.
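
Concretely, that means something like the following in mapred-site.xml on
each node that should not run reducers, followed by a TaskTracker restart
(a sketch; on nodes that are allowed to run reducers, size the value to
the hardware instead of zeroing it):

<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>0</value>
</property>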

On Mon, Sep 3, 2012 at 7:49 PM, Abhay Ratnaparkhi <
abhay.ratnaparkhi@gmail.com> wrote:

> Hello,
>
> How can one get to know the nodes on which reduce tasks will run?
>
> One of my job is running and it's completing all the map tasks.
> My map tasks write lots of intermediate data. The intermediate directory
> is getting full on all the nodes.
> If the reduce task take any node from cluster then It'll try to copy the
> data to same disk and it'll eventually fail due to Disk space related
> exceptions.
>
> I have added few more tasktracker nodes in the cluster and now want to run
> reducer on new nodes only.
> Is it possible to choose a node on which the reducer will run? What's the
> algorithm hadoop uses to get a new node to run reducer?
>
> Thanks in advance.
>
> Bye
> Abhay
>

Re: knowing the nodes on which reduce tasks will run

Posted by Steve Loughran <st...@hortonworks.com>.
On 3 September 2012 15:19, Abhay Ratnaparkhi <ab...@gmail.com> wrote:

> Hello,
>
> How can one get to know the nodes on which reduce tasks will run?
>
> One of my job is running and it's completing all the map tasks.
> My map tasks write lots of intermediate data. The intermediate directory
> is getting full on all the nodes.
> If the reduce task take any node from cluster then It'll try to copy the
> data to same disk and it'll eventually fail due to Disk space related
> exceptions.
>
>
You could always set up specific partitions for intermediate data, though
you get better bandwidth by striping the data across all disks, and better
flexibility by sharing the same partition.

There's also a property to control the amount of space allocated for DFS
storage; reduce that by changing dfs.datanode.du.reserved and the
datanodes will leave more free space around.

see: http://wiki.apache.org/hadoop/DiskSetup
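
As a sketch, those two knobs look like this (the values and paths below
are placeholders, not recommendations):

<!-- hdfs-site.xml: reserve ~10 GB per volume for non-DFS use,
     such as MapReduce intermediate data -->
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>10737418240</value>
</property>

<!-- mapred-site.xml: stripe intermediate map output across
     dedicated local directories -->
<property>
  <name>mapred.local.dir</name>
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>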

Re: knowing the nodes on which reduce tasks will run

Posted by Michael Segel <mi...@hotmail.com>.
The short answer is no.
The longer answer is that you can attempt to force data locality, but even
then, if an open slot becomes available, it's used regardless of what you
want to do...


On Sep 3, 2012, at 9:19 AM, Abhay Ratnaparkhi <ab...@gmail.com> wrote:

> Hello,
> 
> How can one get to know the nodes on which reduce tasks will run?
> 
> One of my job is running and it's completing all the map tasks.
> My map tasks write lots of intermediate data. The intermediate directory is getting full on all the nodes. 
> If the reduce task take any node from cluster then It'll try to copy the data to same disk and it'll eventually fail due to Disk space related exceptions.
> 
> I have added few more tasktracker nodes in the cluster and now want to run reducer on new nodes only.
> Is it possible to choose a node on which the reducer will run? What's the algorithm hadoop uses to get a new node to run reducer?
> 
> Thanks in advance.
> 
> Bye
> Abhay


Re: knowing the nodes on which reduce tasks will run

Posted by Bertrand Dechoux <de...@gmail.com>.
Hi,

The reducer is run wherever there is a slot available; the location is not
related to where the data is located, and it is not possible to choose
where the reducer will run (except by tweaking the tasktracker...).

Regards

Bertrand

On Mon, Sep 3, 2012 at 4:19 PM, Abhay Ratnaparkhi <
abhay.ratnaparkhi@gmail.com> wrote:

> Hello,
>
> How can one get to know the nodes on which reduce tasks will run?
>
> One of my job is running and it's completing all the map tasks.
> My map tasks write lots of intermediate data. The intermediate directory
> is getting full on all the nodes.
> If the reduce task take any node from cluster then It'll try to copy the
> data to same disk and it'll eventually fail due to Disk space related
> exceptions.
>
> I have added few more tasktracker nodes in the cluster and now want to run
> reducer on new nodes only.
> Is it possible to choose a node on which the reducer will run? What's the
> algorithm hadoop uses to get a new node to run reducer?
>
> Thanks in advance.
>
> Bye
> Abhay
>



-- 
Bertrand Dechoux
