You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hadoop.apache.org by Gopi Krishna M <mg...@gmail.com> on 2013/03/31 08:17:02 UTC

one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Hi..

I have installed hadoop 0.23.5 and is working fine on two of my
installations.  In a new installation on Windows Azure VMs, I am seeing an
inordinate delay (of 1 minute) when doing a simple dfs ls command.

This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre

command : bin/hadoop dfs -ls /

With log set to DEBUG, following is the output on the console.  The logs in
bold show the sleep/timeout/delay of *1 minute*.  Any pointers on what to
check for / what is the reason for this delay?

13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
successful kerberos logins
and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
failed kerberos logins and
latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group
related metrics
13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
*13/03/31 05:49:55 DEBUG security.Groups: Group mapping
impl=org.apache.hadoop.security.ShellBasedUni*
*xGroupsMapping; cacheTimeout=300000*
*13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
user:UnixPrincipal: hduser
13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI loginUser:hduser
(auth:SIMPLE)
13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol
ClientProtocol
13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000


Thanks
Gopi

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Gopi Krishna M <mg...@gmail.com>.
just tried the command.  returns instantaneously, no delay

~/installations/hadoop-0.23.5$ groups hduser
hduser : hduser

ps: -F option is not working.

On Thu, Apr 4, 2013 at 6:29 PM, Daryn Sharp <da...@yahoo-inc.com> wrote:

>  Hi Gopi,
>
>  Check if you can run "groups -F hduser".  I think it's causing the delay.
>
>  Daryn
>
>  On Mar 31, 2013, at 1:17 AM, Gopi Krishna M wrote:
>
>  Hi..
>
>  I have installed hadoop 0.23.5 and is working fine on two of my
> installations.  In a new installation on Windows Azure VMs, I am seeing an
> inordinate delay (of 1 minute) when doing a simple dfs ls command.
>
>  This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre
>
>  command : bin/hadoop dfs -ls /
>
>  With log set to DEBUG, following is the output on the console.  The logs
> in bold show the sleep/timeout/delay of *1 minute*.  Any pointers on what
> to check for / what is the reason for this delay?
>
>  13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> successful kerberos logins
> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> failed kerberos logins and
> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group
> related metrics
> 13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping
> impl=org.apache.hadoop.security.ShellBasedUni*
> *xGroupsMapping; cacheTimeout=300000*
> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
> user:UnixPrincipal: hduser
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI
> loginUser:hduser (auth:SIMPLE)
> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol
> ClientProtocol
> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000
>
>
>  Thanks
> Gopi
>
>
>

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Gopi Krishna M <mg...@gmail.com>.
just tried the command.  returns instantaneously, no delay

~/installations/hadoop-0.23.5$ groups hduser
hduser : hduser

ps: -F option is not working.

On Thu, Apr 4, 2013 at 6:29 PM, Daryn Sharp <da...@yahoo-inc.com> wrote:

>  Hi Gopi,
>
>  Check if you can run "groups -F hduser".  I think it's causing the delay.
>
>  Daryn
>
>  On Mar 31, 2013, at 1:17 AM, Gopi Krishna M wrote:
>
>  Hi..
>
>  I have installed hadoop 0.23.5 and is working fine on two of my
> installations.  In a new installation on Windows Azure VMs, I am seeing an
> inordinate delay (of 1 minute) when doing a simple dfs ls command.
>
>  This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre
>
>  command : bin/hadoop dfs -ls /
>
>  With log set to DEBUG, following is the output on the console.  The logs
> in bold show the sleep/timeout/delay of *1 minute*.  Any pointers on what
> to check for / what is the reason for this delay?
>
>  13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> successful kerberos logins
> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> failed kerberos logins and
> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group
> related metrics
> 13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping
> impl=org.apache.hadoop.security.ShellBasedUni*
> *xGroupsMapping; cacheTimeout=300000*
> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
> user:UnixPrincipal: hduser
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI
> loginUser:hduser (auth:SIMPLE)
> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol
> ClientProtocol
> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000
>
>
>  Thanks
> Gopi
>
>
>

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Gopi Krishna M <mg...@gmail.com>.
just tried the command.  returns instantaneously, no delay

~/installations/hadoop-0.23.5$ groups hduser
hduser : hduser

ps: -F option is not working.

On Thu, Apr 4, 2013 at 6:29 PM, Daryn Sharp <da...@yahoo-inc.com> wrote:

>  Hi Gopi,
>
>  Check if you can run "groups -F hduser".  I think it's causing the delay.
>
>  Daryn
>
>  On Mar 31, 2013, at 1:17 AM, Gopi Krishna M wrote:
>
>  Hi..
>
>  I have installed hadoop 0.23.5 and is working fine on two of my
> installations.  In a new installation on Windows Azure VMs, I am seeing an
> inordinate delay (of 1 minute) when doing a simple dfs ls command.
>
>  This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre
>
>  command : bin/hadoop dfs -ls /
>
>  With log set to DEBUG, following is the output on the console.  The logs
> in bold show the sleep/timeout/delay of *1 minute*.  Any pointers on what
> to check for / what is the reason for this delay?
>
>  13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> successful kerberos logins
> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> failed kerberos logins and
> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group
> related metrics
> 13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping
> impl=org.apache.hadoop.security.ShellBasedUni*
> *xGroupsMapping; cacheTimeout=300000*
> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
> user:UnixPrincipal: hduser
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI
> loginUser:hduser (auth:SIMPLE)
> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol
> ClientProtocol
> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000
>
>
>  Thanks
> Gopi
>
>
>

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Gopi Krishna M <mg...@gmail.com>.
just tried the command.  returns instantaneously, no delay

~/installations/hadoop-0.23.5$ groups hduser
hduser : hduser

ps: -F option is not working.

On Thu, Apr 4, 2013 at 6:29 PM, Daryn Sharp <da...@yahoo-inc.com> wrote:

>  Hi Gopi,
>
>  Check if you can run "groups -F hduser".  I think it's causing the delay.
>
>  Daryn
>
>  On Mar 31, 2013, at 1:17 AM, Gopi Krishna M wrote:
>
>  Hi..
>
>  I have installed hadoop 0.23.5 and is working fine on two of my
> installations.  In a new installation on Windows Azure VMs, I am seeing an
> inordinate delay (of 1 minute) when doing a simple dfs ls command.
>
>  This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre
>
>  command : bin/hadoop dfs -ls /
>
>  With log set to DEBUG, following is the output on the console.  The logs
> in bold show the sleep/timeout/delay of *1 minute*.  Any pointers on what
> to check for / what is the reason for this delay?
>
>  13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> successful kerberos logins
> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> failed kerberos logins and
> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group
> related metrics
> 13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping
> impl=org.apache.hadoop.security.ShellBasedUni*
> *xGroupsMapping; cacheTimeout=300000*
> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
> user:UnixPrincipal: hduser
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI
> loginUser:hduser (auth:SIMPLE)
> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol
> ClientProtocol
> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000
>
>
>  Thanks
> Gopi
>
>
>

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Daryn Sharp <da...@yahoo-inc.com>.
Hi Gopi,

Check if you can run "groups -F hduser".  I think it's causing the delay.

Daryn

On Mar 31, 2013, at 1:17 AM, Gopi Krishna M wrote:

Hi..

I have installed hadoop 0.23.5 and is working fine on two of my installations.  In a new installation on Windows Azure VMs, I am seeing an inordinate delay (of 1 minute) when doing a simple dfs ls command.

This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre

command : bin/hadoop dfs -ls /

With log set to DEBUG, following is the output on the console.  The logs in bold show the sleep/timeout/delay of 1 minute.  Any pointers on what to check for / what is the reason for this delay?

13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of successful kerberos logins
and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of failed kerberos logins and
latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
13/03/31 05:49:55 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUni
xGroupsMapping; cacheTimeout=300000
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hduser
13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI loginUser:hduser (auth:SIMPLE)
13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol ClientProtocol
13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000


Thanks
Gopi



Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Daryn Sharp <da...@yahoo-inc.com>.
Hi Gopi,

Check if you can run "groups -F hduser".  I think it's causing the delay.

Daryn

On Mar 31, 2013, at 1:17 AM, Gopi Krishna M wrote:

Hi..

I have installed hadoop 0.23.5 and is working fine on two of my installations.  In a new installation on Windows Azure VMs, I am seeing an inordinate delay (of 1 minute) when doing a simple dfs ls command.

This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre

command : bin/hadoop dfs -ls /

With log set to DEBUG, following is the output on the console.  The logs in bold show the sleep/timeout/delay of 1 minute.  Any pointers on what to check for / what is the reason for this delay?

13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of successful kerberos logins
and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of failed kerberos logins and
latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
13/03/31 05:49:55 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUni
xGroupsMapping; cacheTimeout=300000
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hduser
13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI loginUser:hduser (auth:SIMPLE)
13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol ClientProtocol
13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000


Thanks
Gopi



Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Gopi Krishna M <mg...@gmail.com>.
Hi Marcos,

executed it from shell.. returns immediately, no delay and the output is
just hduser..

hduser@hmaster-asia:~/installations/hadoop-0.23.5$ bash -c "id -Gn hduser"
hduser

one important thing I noticed on logs is (posted in my first mail), it
always get stuck *twice for exactly 60secs*.  This looks like some timeout.
 Went through settings which are close to 60secs, but no luck yet.
Output of strace shows futex as the last call after which it takes 2 mins
and then print the ls output.  Not clear if it is stuck on futex there as
that seems to be the last system call.  I tried with 0.23.6 but got the
same behavior.
*
*
*strace -t bin/hadoop dfs -ls / ; date*
.......
15:09:29 openat(AT_FDCWD,
"/home/hduser/installations/hadoop-0.23.5/share/hadoop/mapreduce/", O_RDON
LY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
15:09:29 getdents(3, /* 11 entries */, 32768) = 584
15:09:29 getdents(3, /* 0 entries */, 32768) = 0
15:09:29 close(3)                       = 0
15:09:29 mmap(NULL, 1052672, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7
ff716300000
15:09:29 mprotect(0x7ff716300000, 4096, PROT_NONE) = 0
15:09:29 clone(child_stack=0x7ff7163ffff0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_T
HREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7ff716400
9d0, tls=0x7ff716400700, child_tidptr=0x7ff7164009d0) = 42720
*15:09:29 futex*(0x7ff7164009d0, FUTEX_WAIT, 42720, NULLFound 5 items
drwxr-xr-x   - hduser supergroup          0 2013-03-26 12:28
/aea8d83f-c0a6-4646-a88d-f9a89be92486
drwxr-xr-x   - hduser supergroup          0 2013-03-06 12:43 /in
drwxr-xr-x   - hduser supergroup          0 2013-03-04 12:57 /out1
drwxrwx---   - hduser supergroup          0 2013-03-04 12:32 /tmp
drwx------   - hduser supergroup          0 2013-03-04 12:49 /user
 <unfinished ... exit status 0>
Thu Apr  4* 15:11:32 UTC 2013*
~/installations/hadoop-0.23.5$







On Thu, Apr 4, 2013 at 6:36 PM, MARCOS MEDRADO RUBINELLI <
marcosm@buscapecompany.com> wrote:

>  Gopi,
>
> The namenode is essentially running bash -c "id -Gn hduser" and waiting
> for the response. You could try executing from the shell it to see if it
> does take a long time in Azure, or if the output is too complex to parse.
>
> Regards,
> Marcos
>
>
> In 04-04-2013 09:33, Gopi Krishna M wrote:
>
> in case anybody has some idea on what I should look for in debugging
> this.. below is a strace of the simple bin/hadoop dfs -ls / command run on
> the machine running namenode
>
>  % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  49.56    0.000853          45        19        14 execve
>  36.20    0.000623           1       925           rt_sigprocmask
>   5.75    0.000099           2        62           brk
>   5.29    0.000091           9        10           getgid
>   3.20    0.000055           1        81           rt_sigaction
>   0.00    0.000000           0        71           read
>   0.00    0.000000           0         3           write
>   0.00    0.000000           0        49        11 open
>   0.00    0.000000           0        87           close
>   0.00    0.000000           0        71        31 stat
>   0.00    0.000000           0        36           fstat
>   0.00    0.000000           0        12           lseek
>   0.00    0.000000           0        72           mmap
>   0.00    0.000000           0        36           mprotect
>   0.00    0.000000           0         9           munmap
>   0.00    0.000000           0        19           rt_sigreturn
>   0.00    0.000000           0         2         2 ioctl
>   0.00    0.000000           0        40        31 access
>   0.00    0.000000           0        19           pipe
>   0.00    0.000000           0         8           dup2
>   0.00    0.000000           0         2           getpid
>   0.00    0.000000           0        20           clone
>   0.00    0.000000           0        38        19 wait4
>   0.00    0.000000           0         2           uname
>   0.00    0.000000           0        24         2 fcntl
>   0.00    0.000000           0        16           getdents
>   0.00    0.000000           0         2           readlink
>   0.00    0.000000           0         2           gettimeofday
>   0.00    0.000000           0         5           getrlimit
>   0.00    0.000000           0        10           getuid
>   0.00    0.000000           0        10           geteuid
>   0.00    0.000000           0        10           getegid
>   0.00    0.000000           0         2           getppid
>   0.00    0.000000           0         2           getpgrp
>   0.00    0.000000           0         5           arch_prctl
>   0.00    0.000000           0         2         1 futex
>   0.00    0.000000           0         1           set_tid_address
>   0.00    0.000000           0        10         2 openat
>   0.00    0.000000           0         1           set_robust_list
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.001721                  1795       113 total
>
>
> On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mg...@gmail.com> wrote:
>
>> Hi..
>>
>>  I have installed hadoop 0.23.5 and is working fine on two of my
>> installations.  In a new installation on Windows Azure VMs, I am seeing an
>> inordinate delay (of 1 minute) when doing a simple dfs ls command.
>>
>>  This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre
>>
>>  command : bin/hadoop dfs -ls /
>>
>>  With log set to DEBUG, following is the output on the console.  The
>> logs in bold show the sleep/timeout/delay of *1 minute*.  Any pointers
>> on what to check for / what is the reason for this delay?
>>
>>  13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
>> org.apache.hadoop.metrics2.lib.MutableRate
>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
>> with annotation @org.apache.
>> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
>> successful kerberos logins
>> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
>> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
>> org.apache.hadoop.metrics2.lib.MutableRate
>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
>> with annotation @org.apache.
>> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
>> failed kerberos logins and
>> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
>> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and
>> group related metrics
>> 13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
>> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping
>> impl=org.apache.hadoop.security.ShellBasedUni*
>> *xGroupsMapping; cacheTimeout=300000*
>> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
>> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
>> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
>> user:UnixPrincipal: hduser
>> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI
>> loginUser:hduser (auth:SIMPLE)
>> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
>> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
>> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for
>> protocol ClientProtocol
>> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000
>>
>>
>>  Thanks
>>  Gopi
>>
>>
>
>

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Gopi Krishna M <mg...@gmail.com>.
Hi Marcos,

executed it from shell.. returns immediately, no delay and the output is
just hduser..

hduser@hmaster-asia:~/installations/hadoop-0.23.5$ bash -c "id -Gn hduser"
hduser

one important thing I noticed on logs is (posted in my first mail), it
always get stuck *twice for exactly 60secs*.  This looks like some timeout.
 Went through settings which are close to 60secs, but no luck yet.
Output of strace shows futex as the last call after which it takes 2 mins
and then print the ls output.  Not clear if it is stuck on futex there as
that seems to be the last system call.  I tried with 0.23.6 but got the
same behavior.
*
*
*strace -t bin/hadoop dfs -ls / ; date*
.......
15:09:29 openat(AT_FDCWD,
"/home/hduser/installations/hadoop-0.23.5/share/hadoop/mapreduce/", O_RDON
LY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
15:09:29 getdents(3, /* 11 entries */, 32768) = 584
15:09:29 getdents(3, /* 0 entries */, 32768) = 0
15:09:29 close(3)                       = 0
15:09:29 mmap(NULL, 1052672, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7
ff716300000
15:09:29 mprotect(0x7ff716300000, 4096, PROT_NONE) = 0
15:09:29 clone(child_stack=0x7ff7163ffff0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_T
HREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7ff716400
9d0, tls=0x7ff716400700, child_tidptr=0x7ff7164009d0) = 42720
*15:09:29 futex*(0x7ff7164009d0, FUTEX_WAIT, 42720, NULLFound 5 items
drwxr-xr-x   - hduser supergroup          0 2013-03-26 12:28
/aea8d83f-c0a6-4646-a88d-f9a89be92486
drwxr-xr-x   - hduser supergroup          0 2013-03-06 12:43 /in
drwxr-xr-x   - hduser supergroup          0 2013-03-04 12:57 /out1
drwxrwx---   - hduser supergroup          0 2013-03-04 12:32 /tmp
drwx------   - hduser supergroup          0 2013-03-04 12:49 /user
 <unfinished ... exit status 0>
Thu Apr  4* 15:11:32 UTC 2013*
~/installations/hadoop-0.23.5$







On Thu, Apr 4, 2013 at 6:36 PM, MARCOS MEDRADO RUBINELLI <
marcosm@buscapecompany.com> wrote:

>  Gopi,
>
> The namenode is essentially running bash -c "id -Gn hduser" and waiting
> for the response. You could try executing from the shell it to see if it
> does take a long time in Azure, or if the output is too complex to parse.
>
> Regards,
> Marcos
>
>
> In 04-04-2013 09:33, Gopi Krishna M wrote:
>
> in case anybody has some idea on what I should look for in debugging
> this.. below is a strace of the simple bin/hadoop dfs -ls / command run on
> the machine running namenode
>
>  % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  49.56    0.000853          45        19        14 execve
>  36.20    0.000623           1       925           rt_sigprocmask
>   5.75    0.000099           2        62           brk
>   5.29    0.000091           9        10           getgid
>   3.20    0.000055           1        81           rt_sigaction
>   0.00    0.000000           0        71           read
>   0.00    0.000000           0         3           write
>   0.00    0.000000           0        49        11 open
>   0.00    0.000000           0        87           close
>   0.00    0.000000           0        71        31 stat
>   0.00    0.000000           0        36           fstat
>   0.00    0.000000           0        12           lseek
>   0.00    0.000000           0        72           mmap
>   0.00    0.000000           0        36           mprotect
>   0.00    0.000000           0         9           munmap
>   0.00    0.000000           0        19           rt_sigreturn
>   0.00    0.000000           0         2         2 ioctl
>   0.00    0.000000           0        40        31 access
>   0.00    0.000000           0        19           pipe
>   0.00    0.000000           0         8           dup2
>   0.00    0.000000           0         2           getpid
>   0.00    0.000000           0        20           clone
>   0.00    0.000000           0        38        19 wait4
>   0.00    0.000000           0         2           uname
>   0.00    0.000000           0        24         2 fcntl
>   0.00    0.000000           0        16           getdents
>   0.00    0.000000           0         2           readlink
>   0.00    0.000000           0         2           gettimeofday
>   0.00    0.000000           0         5           getrlimit
>   0.00    0.000000           0        10           getuid
>   0.00    0.000000           0        10           geteuid
>   0.00    0.000000           0        10           getegid
>   0.00    0.000000           0         2           getppid
>   0.00    0.000000           0         2           getpgrp
>   0.00    0.000000           0         5           arch_prctl
>   0.00    0.000000           0         2         1 futex
>   0.00    0.000000           0         1           set_tid_address
>   0.00    0.000000           0        10         2 openat
>   0.00    0.000000           0         1           set_robust_list
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.001721                  1795       113 total
>
>
> On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mg...@gmail.com> wrote:
>
>> Hi..
>>
>>  I have installed hadoop 0.23.5 and is working fine on two of my
>> installations.  In a new installation on Windows Azure VMs, I am seeing an
>> inordinate delay (of 1 minute) when doing a simple dfs ls command.
>>
>>  This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre
>>
>>  command : bin/hadoop dfs -ls /
>>
>>  With log set to DEBUG, following is the output on the console.  The
>> logs in bold show the sleep/timeout/delay of *1 minute*.  Any pointers
>> on what to check for / what is the reason for this delay?
>>
>>  13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
>> org.apache.hadoop.metrics2.lib.MutableRate
>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
>> with annotation @org.apache.
>> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
>> successful kerberos logins
>> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
>> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
>> org.apache.hadoop.metrics2.lib.MutableRate
>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
>> with annotation @org.apache.
>> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
>> failed kerberos logins and
>> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
>> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and
>> group related metrics
>> 13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
>> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping
>> impl=org.apache.hadoop.security.ShellBasedUni*
>> *xGroupsMapping; cacheTimeout=300000*
>> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
>> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
>> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
>> user:UnixPrincipal: hduser
>> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI
>> loginUser:hduser (auth:SIMPLE)
>> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
>> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
>> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for
>> protocol ClientProtocol
>> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000
>>
>>
>>  Thanks
>>  Gopi
>>
>>
>
>

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Gopi Krishna M <mg...@gmail.com>.
Hi Marcos,

executed it from shell.. returns immediately, no delay and the output is
just hduser..

hduser@hmaster-asia:~/installations/hadoop-0.23.5$ bash -c "id -Gn hduser"
hduser

one important thing I noticed on logs is (posted in my first mail), it
always get stuck *twice for exactly 60secs*.  This looks like some timeout.
 Went through settings which are close to 60secs, but no luck yet.
Output of strace shows futex as the last call after which it takes 2 mins
and then print the ls output.  Not clear if it is stuck on futex there as
that seems to be the last system call.  I tried with 0.23.6 but got the
same behavior.
*
*
*strace -t bin/hadoop dfs -ls / ; date*
.......
15:09:29 openat(AT_FDCWD,
"/home/hduser/installations/hadoop-0.23.5/share/hadoop/mapreduce/", O_RDON
LY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
15:09:29 getdents(3, /* 11 entries */, 32768) = 584
15:09:29 getdents(3, /* 0 entries */, 32768) = 0
15:09:29 close(3)                       = 0
15:09:29 mmap(NULL, 1052672, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7
ff716300000
15:09:29 mprotect(0x7ff716300000, 4096, PROT_NONE) = 0
15:09:29 clone(child_stack=0x7ff7163ffff0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_T
HREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7ff716400
9d0, tls=0x7ff716400700, child_tidptr=0x7ff7164009d0) = 42720
*15:09:29 futex*(0x7ff7164009d0, FUTEX_WAIT, 42720, NULLFound 5 items
drwxr-xr-x   - hduser supergroup          0 2013-03-26 12:28
/aea8d83f-c0a6-4646-a88d-f9a89be92486
drwxr-xr-x   - hduser supergroup          0 2013-03-06 12:43 /in
drwxr-xr-x   - hduser supergroup          0 2013-03-04 12:57 /out1
drwxrwx---   - hduser supergroup          0 2013-03-04 12:32 /tmp
drwx------   - hduser supergroup          0 2013-03-04 12:49 /user
 <unfinished ... exit status 0>
Thu Apr  4* 15:11:32 UTC 2013*
~/installations/hadoop-0.23.5$







On Thu, Apr 4, 2013 at 6:36 PM, MARCOS MEDRADO RUBINELLI <
marcosm@buscapecompany.com> wrote:

>  Gopi,
>
> The namenode is essentially running bash -c "id -Gn hduser" and waiting
> for the response. You could try executing from the shell it to see if it
> does take a long time in Azure, or if the output is too complex to parse.
>
> Regards,
> Marcos
>
>
> In 04-04-2013 09:33, Gopi Krishna M wrote:
>
> in case anybody has some idea on what I should look for in debugging
> this.. below is a strace of the simple bin/hadoop dfs -ls / command run on
> the machine running namenode
>
>  % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  49.56    0.000853          45        19        14 execve
>  36.20    0.000623           1       925           rt_sigprocmask
>   5.75    0.000099           2        62           brk
>   5.29    0.000091           9        10           getgid
>   3.20    0.000055           1        81           rt_sigaction
>   0.00    0.000000           0        71           read
>   0.00    0.000000           0         3           write
>   0.00    0.000000           0        49        11 open
>   0.00    0.000000           0        87           close
>   0.00    0.000000           0        71        31 stat
>   0.00    0.000000           0        36           fstat
>   0.00    0.000000           0        12           lseek
>   0.00    0.000000           0        72           mmap
>   0.00    0.000000           0        36           mprotect
>   0.00    0.000000           0         9           munmap
>   0.00    0.000000           0        19           rt_sigreturn
>   0.00    0.000000           0         2         2 ioctl
>   0.00    0.000000           0        40        31 access
>   0.00    0.000000           0        19           pipe
>   0.00    0.000000           0         8           dup2
>   0.00    0.000000           0         2           getpid
>   0.00    0.000000           0        20           clone
>   0.00    0.000000           0        38        19 wait4
>   0.00    0.000000           0         2           uname
>   0.00    0.000000           0        24         2 fcntl
>   0.00    0.000000           0        16           getdents
>   0.00    0.000000           0         2           readlink
>   0.00    0.000000           0         2           gettimeofday
>   0.00    0.000000           0         5           getrlimit
>   0.00    0.000000           0        10           getuid
>   0.00    0.000000           0        10           geteuid
>   0.00    0.000000           0        10           getegid
>   0.00    0.000000           0         2           getppid
>   0.00    0.000000           0         2           getpgrp
>   0.00    0.000000           0         5           arch_prctl
>   0.00    0.000000           0         2         1 futex
>   0.00    0.000000           0         1           set_tid_address
>   0.00    0.000000           0        10         2 openat
>   0.00    0.000000           0         1           set_robust_list
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.001721                  1795       113 total
>
>
> On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mg...@gmail.com> wrote:
>
>> Hi..
>>
>>  I have installed hadoop 0.23.5 and is working fine on two of my
>> installations.  In a new installation on Windows Azure VMs, I am seeing an
>> inordinate delay (of 1 minute) when doing a simple dfs ls command.
>>
>>  This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre
>>
>>  command : bin/hadoop dfs -ls /
>>
>>  With log set to DEBUG, following is the output on the console.  The
>> logs in bold show the sleep/timeout/delay of *1 minute*.  Any pointers
>> on what to check for / what is the reason for this delay?
>>
>>  13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
>> org.apache.hadoop.metrics2.lib.MutableRate
>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
>> with annotation @org.apache.
>> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
>> successful kerberos logins
>> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
>> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
>> org.apache.hadoop.metrics2.lib.MutableRate
>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
>> with annotation @org.apache.
>> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
>> failed kerberos logins and
>> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
>> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and
>> group related metrics
>> 13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
>> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping
>> impl=org.apache.hadoop.security.ShellBasedUni*
>> *xGroupsMapping; cacheTimeout=300000*
>> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
>> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
>> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
>> user:UnixPrincipal: hduser
>> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI
>> loginUser:hduser (auth:SIMPLE)
>> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
>> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
>> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for
>> protocol ClientProtocol
>> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000
>>
>>
>>  Thanks
>>  Gopi
>>
>>
>
>

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Gopi Krishna M <mg...@gmail.com>.
Hi Marcos,

executed it from shell.. returns immediately, no delay and the output is
just hduser..

hduser@hmaster-asia:~/installations/hadoop-0.23.5$ bash -c "id -Gn hduser"
hduser

one important thing I noticed on logs is (posted in my first mail), it
always get stuck *twice for exactly 60secs*.  This looks like some timeout.
 Went through settings which are close to 60secs, but no luck yet.
Output of strace shows futex as the last call after which it takes 2 mins
and then print the ls output.  Not clear if it is stuck on futex there as
that seems to be the last system call.  I tried with 0.23.6 but got the
same behavior.
*
*
*strace -t bin/hadoop dfs -ls / ; date*
.......
15:09:29 openat(AT_FDCWD,
"/home/hduser/installations/hadoop-0.23.5/share/hadoop/mapreduce/", O_RDON
LY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
15:09:29 getdents(3, /* 11 entries */, 32768) = 584
15:09:29 getdents(3, /* 0 entries */, 32768) = 0
15:09:29 close(3)                       = 0
15:09:29 mmap(NULL, 1052672, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7
ff716300000
15:09:29 mprotect(0x7ff716300000, 4096, PROT_NONE) = 0
15:09:29 clone(child_stack=0x7ff7163ffff0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_T
HREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7ff716400
9d0, tls=0x7ff716400700, child_tidptr=0x7ff7164009d0) = 42720
*15:09:29 futex*(0x7ff7164009d0, FUTEX_WAIT, 42720, NULLFound 5 items
drwxr-xr-x   - hduser supergroup          0 2013-03-26 12:28
/aea8d83f-c0a6-4646-a88d-f9a89be92486
drwxr-xr-x   - hduser supergroup          0 2013-03-06 12:43 /in
drwxr-xr-x   - hduser supergroup          0 2013-03-04 12:57 /out1
drwxrwx---   - hduser supergroup          0 2013-03-04 12:32 /tmp
drwx------   - hduser supergroup          0 2013-03-04 12:49 /user
 <unfinished ... exit status 0>
Thu Apr  4* 15:11:32 UTC 2013*
~/installations/hadoop-0.23.5$







On Thu, Apr 4, 2013 at 6:36 PM, MARCOS MEDRADO RUBINELLI <
marcosm@buscapecompany.com> wrote:

>  Gopi,
>
> The namenode is essentially running bash -c "id -Gn hduser" and waiting
> for the response. You could try executing from the shell it to see if it
> does take a long time in Azure, or if the output is too complex to parse.
>
> Regards,
> Marcos
>
>
> In 04-04-2013 09:33, Gopi Krishna M wrote:
>
> in case anybody has some idea on what I should look for in debugging
> this.. below is a strace of the simple bin/hadoop dfs -ls / command run on
> the machine running namenode
>
>  % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  49.56    0.000853          45        19        14 execve
>  36.20    0.000623           1       925           rt_sigprocmask
>   5.75    0.000099           2        62           brk
>   5.29    0.000091           9        10           getgid
>   3.20    0.000055           1        81           rt_sigaction
>   0.00    0.000000           0        71           read
>   0.00    0.000000           0         3           write
>   0.00    0.000000           0        49        11 open
>   0.00    0.000000           0        87           close
>   0.00    0.000000           0        71        31 stat
>   0.00    0.000000           0        36           fstat
>   0.00    0.000000           0        12           lseek
>   0.00    0.000000           0        72           mmap
>   0.00    0.000000           0        36           mprotect
>   0.00    0.000000           0         9           munmap
>   0.00    0.000000           0        19           rt_sigreturn
>   0.00    0.000000           0         2         2 ioctl
>   0.00    0.000000           0        40        31 access
>   0.00    0.000000           0        19           pipe
>   0.00    0.000000           0         8           dup2
>   0.00    0.000000           0         2           getpid
>   0.00    0.000000           0        20           clone
>   0.00    0.000000           0        38        19 wait4
>   0.00    0.000000           0         2           uname
>   0.00    0.000000           0        24         2 fcntl
>   0.00    0.000000           0        16           getdents
>   0.00    0.000000           0         2           readlink
>   0.00    0.000000           0         2           gettimeofday
>   0.00    0.000000           0         5           getrlimit
>   0.00    0.000000           0        10           getuid
>   0.00    0.000000           0        10           geteuid
>   0.00    0.000000           0        10           getegid
>   0.00    0.000000           0         2           getppid
>   0.00    0.000000           0         2           getpgrp
>   0.00    0.000000           0         5           arch_prctl
>   0.00    0.000000           0         2         1 futex
>   0.00    0.000000           0         1           set_tid_address
>   0.00    0.000000           0        10         2 openat
>   0.00    0.000000           0         1           set_robust_list
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.001721                  1795       113 total
>
>
> On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mg...@gmail.com> wrote:
>
>> Hi..
>>
>>  I have installed hadoop 0.23.5 and is working fine on two of my
>> installations.  In a new installation on Windows Azure VMs, I am seeing an
>> inordinate delay (of 1 minute) when doing a simple dfs ls command.
>>
>>  This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre
>>
>>  command : bin/hadoop dfs -ls /
>>
>>  With log set to DEBUG, following is the output on the console.  The
>> logs in bold show the sleep/timeout/delay of *1 minute*.  Any pointers
>> on what to check for / what is the reason for this delay?
>>
>>  13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
>> org.apache.hadoop.metrics2.lib.MutableRate
>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
>> with annotation @org.apache.
>> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
>> successful kerberos logins
>> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
>> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
>> org.apache.hadoop.metrics2.lib.MutableRate
>> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
>> with annotation @org.apache.
>> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
>> failed kerberos logins and
>> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
>> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and
>> group related metrics
>> 13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
>> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping
>> impl=org.apache.hadoop.security.ShellBasedUni*
>> *xGroupsMapping; cacheTimeout=300000*
>> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
>> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
>> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
>> user:UnixPrincipal: hduser
>> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI
>> loginUser:hduser (auth:SIMPLE)
>> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
>> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
>> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for
>> protocol ClientProtocol
>> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000
>>
>>
>>  Thanks
>>  Gopi
>>
>>
>
>

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by MARCOS MEDRADO RUBINELLI <ma...@buscapecompany.com>.
Gopi,

The namenode is essentially running bash -c "id -Gn hduser" and waiting for the response. You could try executing from the shell it to see if it does take a long time in Azure, or if the output is too complex to parse.

Regards,
Marcos

In 04-04-2013 09:33, Gopi Krishna M wrote:
in case anybody has some idea on what I should look for in debugging this.. below is a strace of the simple bin/hadoop dfs -ls / command run on the machine running namenode

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.56    0.000853          45        19        14 execve
 36.20    0.000623           1       925           rt_sigprocmask
  5.75    0.000099           2        62           brk
  5.29    0.000091           9        10           getgid
  3.20    0.000055           1        81           rt_sigaction
  0.00    0.000000           0        71           read
  0.00    0.000000           0         3           write
  0.00    0.000000           0        49        11 open
  0.00    0.000000           0        87           close
  0.00    0.000000           0        71        31 stat
  0.00    0.000000           0        36           fstat
  0.00    0.000000           0        12           lseek
  0.00    0.000000           0        72           mmap
  0.00    0.000000           0        36           mprotect
  0.00    0.000000           0         9           munmap
  0.00    0.000000           0        19           rt_sigreturn
  0.00    0.000000           0         2         2 ioctl
  0.00    0.000000           0        40        31 access
  0.00    0.000000           0        19           pipe
  0.00    0.000000           0         8           dup2
  0.00    0.000000           0         2           getpid
  0.00    0.000000           0        20           clone
  0.00    0.000000           0        38        19 wait4
  0.00    0.000000           0         2           uname
  0.00    0.000000           0        24         2 fcntl
  0.00    0.000000           0        16           getdents
  0.00    0.000000           0         2           readlink
  0.00    0.000000           0         2           gettimeofday
  0.00    0.000000           0         5           getrlimit
  0.00    0.000000           0        10           getuid
  0.00    0.000000           0        10           geteuid
  0.00    0.000000           0        10           getegid
  0.00    0.000000           0         2           getppid
  0.00    0.000000           0         2           getpgrp
  0.00    0.000000           0         5           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0        10         2 openat
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.001721                  1795       113 total


On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mg...@gmail.com>> wrote:
Hi..

I have installed hadoop 0.23.5 and is working fine on two of my installations.  In a new installation on Windows Azure VMs, I am seeing an inordinate delay (of 1 minute) when doing a simple dfs ls command.

This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre

command : bin/hadoop dfs -ls /

With log set to DEBUG, following is the output on the console.  The logs in bold show the sleep/timeout/delay of 1 minute.  Any pointers on what to check for / what is the reason for this delay?

13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of successful kerberos logins
and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of failed kerberos logins and
latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
13/03/31 05:49:55 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUni
xGroupsMapping; cacheTimeout=300000
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hduser
13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI loginUser:hduser (auth:SIMPLE)
13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol ClientProtocol
13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000


Thanks
Gopi




Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by MARCOS MEDRADO RUBINELLI <ma...@buscapecompany.com>.
Gopi,

The namenode is essentially running bash -c "id -Gn hduser" and waiting for the response. You could try executing from the shell it to see if it does take a long time in Azure, or if the output is too complex to parse.

Regards,
Marcos

In 04-04-2013 09:33, Gopi Krishna M wrote:
in case anybody has some idea on what I should look for in debugging this.. below is a strace of the simple bin/hadoop dfs -ls / command run on the machine running namenode

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.56    0.000853          45        19        14 execve
 36.20    0.000623           1       925           rt_sigprocmask
  5.75    0.000099           2        62           brk
  5.29    0.000091           9        10           getgid
  3.20    0.000055           1        81           rt_sigaction
  0.00    0.000000           0        71           read
  0.00    0.000000           0         3           write
  0.00    0.000000           0        49        11 open
  0.00    0.000000           0        87           close
  0.00    0.000000           0        71        31 stat
  0.00    0.000000           0        36           fstat
  0.00    0.000000           0        12           lseek
  0.00    0.000000           0        72           mmap
  0.00    0.000000           0        36           mprotect
  0.00    0.000000           0         9           munmap
  0.00    0.000000           0        19           rt_sigreturn
  0.00    0.000000           0         2         2 ioctl
  0.00    0.000000           0        40        31 access
  0.00    0.000000           0        19           pipe
  0.00    0.000000           0         8           dup2
  0.00    0.000000           0         2           getpid
  0.00    0.000000           0        20           clone
  0.00    0.000000           0        38        19 wait4
  0.00    0.000000           0         2           uname
  0.00    0.000000           0        24         2 fcntl
  0.00    0.000000           0        16           getdents
  0.00    0.000000           0         2           readlink
  0.00    0.000000           0         2           gettimeofday
  0.00    0.000000           0         5           getrlimit
  0.00    0.000000           0        10           getuid
  0.00    0.000000           0        10           geteuid
  0.00    0.000000           0        10           getegid
  0.00    0.000000           0         2           getppid
  0.00    0.000000           0         2           getpgrp
  0.00    0.000000           0         5           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0        10         2 openat
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.001721                  1795       113 total


On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mg...@gmail.com>> wrote:
Hi..

I have installed hadoop 0.23.5 and is working fine on two of my installations.  In a new installation on Windows Azure VMs, I am seeing an inordinate delay (of 1 minute) when doing a simple dfs ls command.

This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre

command : bin/hadoop dfs -ls /

With log set to DEBUG, following is the output on the console.  The logs in bold show the sleep/timeout/delay of 1 minute.  Any pointers on what to check for / what is the reason for this delay?

13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of successful kerberos logins
and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of failed kerberos logins and
latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
13/03/31 05:49:55 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUni
xGroupsMapping; cacheTimeout=300000
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hduser
13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI loginUser:hduser (auth:SIMPLE)
13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol ClientProtocol
13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000


Thanks
Gopi




Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by MARCOS MEDRADO RUBINELLI <ma...@buscapecompany.com>.
Gopi,

The namenode is essentially running bash -c "id -Gn hduser" and waiting for the response. You could try executing from the shell it to see if it does take a long time in Azure, or if the output is too complex to parse.

Regards,
Marcos

In 04-04-2013 09:33, Gopi Krishna M wrote:
in case anybody has some idea on what I should look for in debugging this.. below is a strace of the simple bin/hadoop dfs -ls / command run on the machine running namenode

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.56    0.000853          45        19        14 execve
 36.20    0.000623           1       925           rt_sigprocmask
  5.75    0.000099           2        62           brk
  5.29    0.000091           9        10           getgid
  3.20    0.000055           1        81           rt_sigaction
  0.00    0.000000           0        71           read
  0.00    0.000000           0         3           write
  0.00    0.000000           0        49        11 open
  0.00    0.000000           0        87           close
  0.00    0.000000           0        71        31 stat
  0.00    0.000000           0        36           fstat
  0.00    0.000000           0        12           lseek
  0.00    0.000000           0        72           mmap
  0.00    0.000000           0        36           mprotect
  0.00    0.000000           0         9           munmap
  0.00    0.000000           0        19           rt_sigreturn
  0.00    0.000000           0         2         2 ioctl
  0.00    0.000000           0        40        31 access
  0.00    0.000000           0        19           pipe
  0.00    0.000000           0         8           dup2
  0.00    0.000000           0         2           getpid
  0.00    0.000000           0        20           clone
  0.00    0.000000           0        38        19 wait4
  0.00    0.000000           0         2           uname
  0.00    0.000000           0        24         2 fcntl
  0.00    0.000000           0        16           getdents
  0.00    0.000000           0         2           readlink
  0.00    0.000000           0         2           gettimeofday
  0.00    0.000000           0         5           getrlimit
  0.00    0.000000           0        10           getuid
  0.00    0.000000           0        10           geteuid
  0.00    0.000000           0        10           getegid
  0.00    0.000000           0         2           getppid
  0.00    0.000000           0         2           getpgrp
  0.00    0.000000           0         5           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0        10         2 openat
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.001721                  1795       113 total


On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mg...@gmail.com>> wrote:
Hi..

I have installed hadoop 0.23.5 and is working fine on two of my installations.  In a new installation on Windows Azure VMs, I am seeing an inordinate delay (of 1 minute) when doing a simple dfs ls command.

This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre

command : bin/hadoop dfs -ls /

With log set to DEBUG, following is the output on the console.  The logs in bold show the sleep/timeout/delay of 1 minute.  Any pointers on what to check for / what is the reason for this delay?

13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of successful kerberos logins
and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of failed kerberos logins and
latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
13/03/31 05:49:55 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUni
xGroupsMapping; cacheTimeout=300000
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hduser
13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI loginUser:hduser (auth:SIMPLE)
13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol ClientProtocol
13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000


Thanks
Gopi




Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by MARCOS MEDRADO RUBINELLI <ma...@buscapecompany.com>.
Gopi,

The namenode is essentially running bash -c "id -Gn hduser" and waiting for the response. You could try executing from the shell it to see if it does take a long time in Azure, or if the output is too complex to parse.

Regards,
Marcos

In 04-04-2013 09:33, Gopi Krishna M wrote:
in case anybody has some idea on what I should look for in debugging this.. below is a strace of the simple bin/hadoop dfs -ls / command run on the machine running namenode

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.56    0.000853          45        19        14 execve
 36.20    0.000623           1       925           rt_sigprocmask
  5.75    0.000099           2        62           brk
  5.29    0.000091           9        10           getgid
  3.20    0.000055           1        81           rt_sigaction
  0.00    0.000000           0        71           read
  0.00    0.000000           0         3           write
  0.00    0.000000           0        49        11 open
  0.00    0.000000           0        87           close
  0.00    0.000000           0        71        31 stat
  0.00    0.000000           0        36           fstat
  0.00    0.000000           0        12           lseek
  0.00    0.000000           0        72           mmap
  0.00    0.000000           0        36           mprotect
  0.00    0.000000           0         9           munmap
  0.00    0.000000           0        19           rt_sigreturn
  0.00    0.000000           0         2         2 ioctl
  0.00    0.000000           0        40        31 access
  0.00    0.000000           0        19           pipe
  0.00    0.000000           0         8           dup2
  0.00    0.000000           0         2           getpid
  0.00    0.000000           0        20           clone
  0.00    0.000000           0        38        19 wait4
  0.00    0.000000           0         2           uname
  0.00    0.000000           0        24         2 fcntl
  0.00    0.000000           0        16           getdents
  0.00    0.000000           0         2           readlink
  0.00    0.000000           0         2           gettimeofday
  0.00    0.000000           0         5           getrlimit
  0.00    0.000000           0        10           getuid
  0.00    0.000000           0        10           geteuid
  0.00    0.000000           0        10           getegid
  0.00    0.000000           0         2           getppid
  0.00    0.000000           0         2           getpgrp
  0.00    0.000000           0         5           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0        10         2 openat
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.001721                  1795       113 total


On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mg...@gmail.com>> wrote:
Hi..

I have installed hadoop 0.23.5 and is working fine on two of my installations.  In a new installation on Windows Azure VMs, I am seeing an inordinate delay (of 1 minute) when doing a simple dfs ls command.

This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre

command : bin/hadoop dfs -ls /

With log set to DEBUG, following is the output on the console.  The logs in bold show the sleep/timeout/delay of 1 minute.  Any pointers on what to check for / what is the reason for this delay?

13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of successful kerberos logins
and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of failed kerberos logins and
latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
13/03/31 05:49:55 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUni
xGroupsMapping; cacheTimeout=300000
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hduser
13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI loginUser:hduser (auth:SIMPLE)
13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol ClientProtocol
13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000


Thanks
Gopi




Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Gopi Krishna M <mg...@gmail.com>.
in case anybody has some idea on what I should look for in debugging this..
below is a strace of the simple bin/hadoop dfs -ls / command run on the
machine running namenode

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.56    0.000853          45        19        14 execve
 36.20    0.000623           1       925           rt_sigprocmask
  5.75    0.000099           2        62           brk
  5.29    0.000091           9        10           getgid
  3.20    0.000055           1        81           rt_sigaction
  0.00    0.000000           0        71           read
  0.00    0.000000           0         3           write
  0.00    0.000000           0        49        11 open
  0.00    0.000000           0        87           close
  0.00    0.000000           0        71        31 stat
  0.00    0.000000           0        36           fstat
  0.00    0.000000           0        12           lseek
  0.00    0.000000           0        72           mmap
  0.00    0.000000           0        36           mprotect
  0.00    0.000000           0         9           munmap
  0.00    0.000000           0        19           rt_sigreturn
  0.00    0.000000           0         2         2 ioctl
  0.00    0.000000           0        40        31 access
  0.00    0.000000           0        19           pipe
  0.00    0.000000           0         8           dup2
  0.00    0.000000           0         2           getpid
  0.00    0.000000           0        20           clone
  0.00    0.000000           0        38        19 wait4
  0.00    0.000000           0         2           uname
  0.00    0.000000           0        24         2 fcntl
  0.00    0.000000           0        16           getdents
  0.00    0.000000           0         2           readlink
  0.00    0.000000           0         2           gettimeofday
  0.00    0.000000           0         5           getrlimit
  0.00    0.000000           0        10           getuid
  0.00    0.000000           0        10           geteuid
  0.00    0.000000           0        10           getegid
  0.00    0.000000           0         2           getppid
  0.00    0.000000           0         2           getpgrp
  0.00    0.000000           0         5           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0        10         2 openat
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.001721                  1795       113 total


On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mg...@gmail.com> wrote:

> Hi..
>
> I have installed hadoop 0.23.5 and is working fine on two of my
> installations.  In a new installation on Windows Azure VMs, I am seeing an
> inordinate delay (of 1 minute) when doing a simple dfs ls command.
>
> This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre
>
> command : bin/hadoop dfs -ls /
>
> With log set to DEBUG, following is the output on the console.  The logs
> in bold show the sleep/timeout/delay of *1 minute*.  Any pointers on what
> to check for / what is the reason for this delay?
>
> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> successful kerberos logins
> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> failed kerberos logins and
> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group
> related metrics
> 13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping
> impl=org.apache.hadoop.security.ShellBasedUni*
> *xGroupsMapping; cacheTimeout=300000*
> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
> user:UnixPrincipal: hduser
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI
> loginUser:hduser (auth:SIMPLE)
> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol
> ClientProtocol
> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000
>
>
> Thanks
> Gopi
>
>

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Daryn Sharp <da...@yahoo-inc.com>.
Hi Gopi,

Check if you can run "groups -F hduser".  I think it's causing the delay.

Daryn

On Mar 31, 2013, at 1:17 AM, Gopi Krishna M wrote:

Hi..

I have installed hadoop 0.23.5 and is working fine on two of my installations.  In a new installation on Windows Azure VMs, I am seeing an inordinate delay (of 1 minute) when doing a simple dfs ls command.

This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre

command : bin/hadoop dfs -ls /

With log set to DEBUG, following is the output on the console.  The logs in bold show the sleep/timeout/delay of 1 minute.  Any pointers on what to check for / what is the reason for this delay?

13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of successful kerberos logins
and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of failed kerberos logins and
latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
13/03/31 05:49:55 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUni
xGroupsMapping; cacheTimeout=300000
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hduser
13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI loginUser:hduser (auth:SIMPLE)
13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol ClientProtocol
13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000


Thanks
Gopi



Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Gopi Krishna M <mg...@gmail.com>.
in case anybody has some idea on what I should look for in debugging this..
below is a strace of the simple bin/hadoop dfs -ls / command run on the
machine running namenode

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.56    0.000853          45        19        14 execve
 36.20    0.000623           1       925           rt_sigprocmask
  5.75    0.000099           2        62           brk
  5.29    0.000091           9        10           getgid
  3.20    0.000055           1        81           rt_sigaction
  0.00    0.000000           0        71           read
  0.00    0.000000           0         3           write
  0.00    0.000000           0        49        11 open
  0.00    0.000000           0        87           close
  0.00    0.000000           0        71        31 stat
  0.00    0.000000           0        36           fstat
  0.00    0.000000           0        12           lseek
  0.00    0.000000           0        72           mmap
  0.00    0.000000           0        36           mprotect
  0.00    0.000000           0         9           munmap
  0.00    0.000000           0        19           rt_sigreturn
  0.00    0.000000           0         2         2 ioctl
  0.00    0.000000           0        40        31 access
  0.00    0.000000           0        19           pipe
  0.00    0.000000           0         8           dup2
  0.00    0.000000           0         2           getpid
  0.00    0.000000           0        20           clone
  0.00    0.000000           0        38        19 wait4
  0.00    0.000000           0         2           uname
  0.00    0.000000           0        24         2 fcntl
  0.00    0.000000           0        16           getdents
  0.00    0.000000           0         2           readlink
  0.00    0.000000           0         2           gettimeofday
  0.00    0.000000           0         5           getrlimit
  0.00    0.000000           0        10           getuid
  0.00    0.000000           0        10           geteuid
  0.00    0.000000           0        10           getegid
  0.00    0.000000           0         2           getppid
  0.00    0.000000           0         2           getpgrp
  0.00    0.000000           0         5           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0        10         2 openat
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.001721                  1795       113 total


On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mg...@gmail.com> wrote:

> Hi..
>
> I have installed hadoop 0.23.5 and is working fine on two of my
> installations.  In a new installation on Windows Azure VMs, I am seeing an
> inordinate delay (of 1 minute) when doing a simple dfs ls command.
>
> This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre
>
> command : bin/hadoop dfs -ls /
>
> With log set to DEBUG, following is the output on the console.  The logs
> in bold show the sleep/timeout/delay of *1 minute*.  Any pointers on what
> to check for / what is the reason for this delay?
>
> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> successful kerberos logins
> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> failed kerberos logins and
> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group
> related metrics
> 13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping
> impl=org.apache.hadoop.security.ShellBasedUni*
> *xGroupsMapping; cacheTimeout=300000*
> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
> user:UnixPrincipal: hduser
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI
> loginUser:hduser (auth:SIMPLE)
> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol
> ClientProtocol
> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000
>
>
> Thanks
> Gopi
>
>

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Daryn Sharp <da...@yahoo-inc.com>.
Hi Gopi,

Check if you can run "groups -F hduser".  I think it's causing the delay.

Daryn

On Mar 31, 2013, at 1:17 AM, Gopi Krishna M wrote:

Hi..

I have installed hadoop 0.23.5 and is working fine on two of my installations.  In a new installation on Windows Azure VMs, I am seeing an inordinate delay (of 1 minute) when doing a simple dfs ls command.

This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre

command : bin/hadoop dfs -ls /

With log set to DEBUG, following is the output on the console.  The logs in bold show the sleep/timeout/delay of 1 minute.  Any pointers on what to check for / what is the reason for this delay?

13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of successful kerberos logins
and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.
hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of failed kerberos logins and
latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
13/03/31 05:49:55 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUni
xGroupsMapping; cacheTimeout=300000
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login
13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hduser
13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI loginUser:hduser (auth:SIMPLE)
13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol ClientProtocol
13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000


Thanks
Gopi



Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Gopi Krishna M <mg...@gmail.com>.
in case anybody has some idea on what I should look for in debugging this..
below is a strace of the simple bin/hadoop dfs -ls / command run on the
machine running namenode

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.56    0.000853          45        19        14 execve
 36.20    0.000623           1       925           rt_sigprocmask
  5.75    0.000099           2        62           brk
  5.29    0.000091           9        10           getgid
  3.20    0.000055           1        81           rt_sigaction
  0.00    0.000000           0        71           read
  0.00    0.000000           0         3           write
  0.00    0.000000           0        49        11 open
  0.00    0.000000           0        87           close
  0.00    0.000000           0        71        31 stat
  0.00    0.000000           0        36           fstat
  0.00    0.000000           0        12           lseek
  0.00    0.000000           0        72           mmap
  0.00    0.000000           0        36           mprotect
  0.00    0.000000           0         9           munmap
  0.00    0.000000           0        19           rt_sigreturn
  0.00    0.000000           0         2         2 ioctl
  0.00    0.000000           0        40        31 access
  0.00    0.000000           0        19           pipe
  0.00    0.000000           0         8           dup2
  0.00    0.000000           0         2           getpid
  0.00    0.000000           0        20           clone
  0.00    0.000000           0        38        19 wait4
  0.00    0.000000           0         2           uname
  0.00    0.000000           0        24         2 fcntl
  0.00    0.000000           0        16           getdents
  0.00    0.000000           0         2           readlink
  0.00    0.000000           0         2           gettimeofday
  0.00    0.000000           0         5           getrlimit
  0.00    0.000000           0        10           getuid
  0.00    0.000000           0        10           geteuid
  0.00    0.000000           0        10           getegid
  0.00    0.000000           0         2           getppid
  0.00    0.000000           0         2           getpgrp
  0.00    0.000000           0         5           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0        10         2 openat
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.001721                  1795       113 total


On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mg...@gmail.com> wrote:

> Hi..
>
> I have installed hadoop 0.23.5 and is working fine on two of my
> installations.  In a new installation on Windows Azure VMs, I am seeing an
> inordinate delay (of 1 minute) when doing a simple dfs ls command.
>
> This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre
>
> command : bin/hadoop dfs -ls /
>
> With log set to DEBUG, following is the output on the console.  The logs
> in bold show the sleep/timeout/delay of *1 minute*.  Any pointers on what
> to check for / what is the reason for this delay?
>
> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> successful kerberos logins
> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> failed kerberos logins and
> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group
> related metrics
> 13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping
> impl=org.apache.hadoop.security.ShellBasedUni*
> *xGroupsMapping; cacheTimeout=300000*
> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
> user:UnixPrincipal: hduser
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI
> loginUser:hduser (auth:SIMPLE)
> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol
> ClientProtocol
> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000
>
>
> Thanks
> Gopi
>
>

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

Posted by Gopi Krishna M <mg...@gmail.com>.
in case anybody has some idea on what I should look for in debugging this..
below is a strace of the simple bin/hadoop dfs -ls / command run on the
machine running namenode

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.56    0.000853          45        19        14 execve
 36.20    0.000623           1       925           rt_sigprocmask
  5.75    0.000099           2        62           brk
  5.29    0.000091           9        10           getgid
  3.20    0.000055           1        81           rt_sigaction
  0.00    0.000000           0        71           read
  0.00    0.000000           0         3           write
  0.00    0.000000           0        49        11 open
  0.00    0.000000           0        87           close
  0.00    0.000000           0        71        31 stat
  0.00    0.000000           0        36           fstat
  0.00    0.000000           0        12           lseek
  0.00    0.000000           0        72           mmap
  0.00    0.000000           0        36           mprotect
  0.00    0.000000           0         9           munmap
  0.00    0.000000           0        19           rt_sigreturn
  0.00    0.000000           0         2         2 ioctl
  0.00    0.000000           0        40        31 access
  0.00    0.000000           0        19           pipe
  0.00    0.000000           0         8           dup2
  0.00    0.000000           0         2           getpid
  0.00    0.000000           0        20           clone
  0.00    0.000000           0        38        19 wait4
  0.00    0.000000           0         2           uname
  0.00    0.000000           0        24         2 fcntl
  0.00    0.000000           0        16           getdents
  0.00    0.000000           0         2           readlink
  0.00    0.000000           0         2           gettimeofday
  0.00    0.000000           0         5           getrlimit
  0.00    0.000000           0        10           getuid
  0.00    0.000000           0        10           geteuid
  0.00    0.000000           0        10           getegid
  0.00    0.000000           0         2           getppid
  0.00    0.000000           0         2           getpgrp
  0.00    0.000000           0         5           arch_prctl
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0        10         2 openat
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.001721                  1795       113 total


On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mg...@gmail.com> wrote:

> Hi..
>
> I have installed hadoop 0.23.5 and is working fine on two of my
> installations.  In a new installation on Windows Azure VMs, I am seeing an
> inordinate delay (of 1 minute) when doing a simple dfs ls command.
>
> This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre
>
> command : bin/hadoop dfs -ls /
>
> With log set to DEBUG, following is the output on the console.  The logs
> in bold show the sleep/timeout/delay of *1 minute*.  Any pointers on what
> to check for / what is the reason for this delay?
>
> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> successful kerberos logins
> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field
> org.apache.hadoop.metrics2.lib.MutableRate
> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
> with annotation @org.apache.
> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of
> failed kerberos logins and
> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group
> related metrics
> 13/03/31 05:49:55 DEBUG security.Groups:  Creating new Groups object
> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping
> impl=org.apache.hadoop.security.ShellBasedUni*
> *xGroupsMapping; cacheTimeout=300000*
> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login*
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local
> user:UnixPrincipal: hduser
> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI
> loginUser:hduser (auth:SIMPLE)
> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false
> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms.
> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol
> ClientProtocol
> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000
>
>
> Thanks
> Gopi
>
>