Posted to user@hadoop.apache.org by Amit Kabra <am...@gmail.com> on 2014/04/20 11:57:49 UTC

All datanodes are bad. Aborting ...

Hello,

While running a MapReduce job (terasort), I see a few tasks failing
with the error "All datanodes 10.230.229.76:50010 are bad."
The job still finishes successfully, since failed tasks are spawned
again, but I have seen these tasks fail multiple times.

Test Setup:
==========

Terasort for 300 GB; number of reducers: 100; number of maps: 2400.
Cluster is running fine (before and after the error).
12-datanode setup (only ZooKeeper / HDFS / MapReduce running).
Map/Reduce container memory: 5 GB.
Only a few full GCs (2-3), each with less than 0.02 s pause time.



Debugging:
========

Error on console:

14/04/19 10:17:09 [main] INFO  mapreduce.Job(1425): Task Id :
attempt_1397901041097_0001_r_000025_0, Status : FAILED

Error: java.io.IOException: All datanodes 10.230.229.76:50010 are bad. Aborting...

    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:960)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:780)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)


NodeManager / ResourceManager logs show no errors.


DataNode / NameNode logs show the following error.


Though the job finished, I can't access it: the job history gives
"Not Found: job_1397901041097_0001". On debugging, I found this could
be due to the following lines in the ResourceManager log, though I am
not sure why this is happening:


19-Apr-2014 10:26:12  [1086033982@qtp-802928030-12] INFO
org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet[327] - sfdc
is accessing unchecked
http://blitzhbase02-mnds1-1-crd.eng.sfdc.net:19888/jobhistory/job/job_1397901041097_0001/mapreduce/job/job_1397901041097_0001
which is the app master GUI of application_1397901041097_0001 owned by
sfdc

19-Apr-2014 10:32:18  [501790067@qtp-802928030-22] INFO
org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet[327] - sfdc
is accessing unchecked
http://blitzhbase02-mnds1-1-crd.eng.sfdc.net:19888/jobhistory/job/job_1397901041097_0001/mapreduce/job/job_1397901041097_0001
which is the app master GUI of application_1397901041097_0001 owned by
sfdc

19-Apr-2014 10:33:43  [Delegation Token Canceler] INFO
org.apache.hadoop.hdfs.DFSClient[898] - Cancelling
HDFS_DELEGATION_TOKEN token 235 for sfdc on
ha-hdfs:crd-dev-blitzhbase02



Has anyone seen this before? Any input would be helpful.

Note: sometimes I also see other errors, which seem to be a known
issue; I restarted the cluster for those.

Amit.

Re: All datanodes are bad. Aborting ...

Posted by Amit Kabra <am...@gmail.com>.
fsck showed the cluster to be healthy at that time.
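For reference, a quick way to run that health check (this is a sketch; it assumes the `hdfs` CLI is on the PATH and the cluster is reachable, and the summary appears in the last lines of output):

```shell
# Guarded HDFS health check: the filesystem summary (corrupt blocks,
# missing replicas, overall HEALTHY/CORRUPT status) is printed at the end.
if command -v hdfs >/dev/null 2>&1; then
  hdfs fsck / | tail -n 20
else
  echo "hdfs CLI not found on this host"
fi
```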

On Wed, Apr 23, 2014 at 8:25 AM, Shumin Guo <gs...@gmail.com> wrote:
> Did you do fsck? And what's the result?
>
>
> On Sun, Apr 20, 2014 at 12:14 PM, Amit Kabra <am...@gmail.com>
> wrote:
>>
>> 1) ulimit -a
>>
>> core file size          (blocks, -c) 0
>> data seg size           (kbytes, -d) unlimited
>> scheduling priority             (-e) 0
>> file size               (blocks, -f) unlimited
>> pending signals                 (-i) 513921
>> max locked memory       (kbytes, -l) 64
>> max memory size         (kbytes, -m) unlimited
>> open files                      (-n) 65536
>> pipe size            (512 bytes, -p) 8
>> POSIX message queues     (bytes, -q) 819200
>> real-time priority              (-r) 0
>> stack size              (kbytes, -s) 10240
>> cpu time               (seconds, -t) unlimited
>> max user processes              (-u) 32000
>> virtual memory          (kbytes, -v) unlimited
>> file locks                      (-x) unlimited
>>
>> 2) dfs.datanode.max.xcievers = 4096
>>
>> 3) dfs.datanode.max.transfer.threads = 4096
>>
>>
>>
>> On Sun, Apr 20, 2014 at 10:36 PM, sudhakara st <su...@gmail.com>
>> wrote:
>> > check with  open file descriptor limit in data nodes and namenode.
>> >
>> > $ ulimit -a
>> >
>> > and
>> > check with 'dfs.datanode.max.xcievers or
>> > dfs.datanode.max.transfer.threads'
>> > property in hdfs-site.xml
>> >
>> >
>> >
>> >
>> > On Sun, Apr 20, 2014 at 9:40 PM, Amit Kabra <am...@gmail.com>
>> > wrote:
>> >>
>> >> Yes, error logs here : http://pastebin.com/RBdN5Euf
>> >>
>> >> On Sun, Apr 20, 2014 at 8:14 PM, Serge Blazhievsky
>> >> <ha...@gmail.com>
>> >> wrote:
>> >> > Do you see any errors in datanodes logs?
>> >> >
>> >> > Sent from my iPhone
>> >> >
>> >> >> On Apr 20, 2014, at 2:57, Amit Kabra <am...@gmail.com>
>> >> >> wrote:
>> >> >>
>> >> >> number
>> >
>> >
>> >
>> >
>> > --
>> >
>> > Regards,
>> > ...sudhakara
>> >
>
>

Re: All datanodes are bad. Aborting ...

Posted by Shumin Guo <gs...@gmail.com>.
Did you do fsck? And what's the result?


On Sun, Apr 20, 2014 at 12:14 PM, Amit Kabra <am...@gmail.com> wrote:

> 1) ulimit -a
>
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 513921
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 65536
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 10240
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 32000
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>
> 2) dfs.datanode.max.xcievers = 4096
>
> 3) dfs.datanode.max.transfer.threads = 4096
>
>
>
> On Sun, Apr 20, 2014 at 10:36 PM, sudhakara st <su...@gmail.com>
> wrote:
> > check with  open file descriptor limit in data nodes and namenode.
> >
> > $ ulimit -a
> >
> > and
> > check with 'dfs.datanode.max.xcievers or
> dfs.datanode.max.transfer.threads'
> > property in hdfs-site.xml
> >
> >
> >
> >
> > On Sun, Apr 20, 2014 at 9:40 PM, Amit Kabra <am...@gmail.com>
> wrote:
> >>
> >> Yes, error logs here : http://pastebin.com/RBdN5Euf
> >>
> >> On Sun, Apr 20, 2014 at 8:14 PM, Serge Blazhievsky <hadoop.ca@gmail.com
> >
> >> wrote:
> >> > Do you see any errors in datanodes logs?
> >> >
> >> > Sent from my iPhone
> >> >
> >> >> On Apr 20, 2014, at 2:57, Amit Kabra <am...@gmail.com>
> wrote:
> >> >>
> >> >> number
> >
> >
> >
> >
> > --
> >
> > Regards,
> > ...sudhakara
> >
>

Re: All datanodes are bad. Aborting ...

Posted by Amit Kabra <am...@gmail.com>.
1) ulimit -a

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 513921
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 32000
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

2) dfs.datanode.max.xcievers = 4096

3) dfs.datanode.max.transfer.threads = 4096
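For what it's worth, the in-use descriptor count can be compared against that limit on a datanode host; this is a rough sketch assuming a Linux /proc filesystem and that the DataNode JVM is findable by its standard main-class name:

```shell
# Compare the per-process fd limit with what the DataNode actually holds.
soft_limit=$(ulimit -n)
echo "per-process open-file limit: $soft_limit"

# The pgrep pattern is the standard HDFS DataNode main class.
dn_pid=$(pgrep -f 'org.apache.hadoop.hdfs.server.datanode.DataNode' | head -n1)
if [ -n "$dn_pid" ]; then
  in_use=$(ls "/proc/$dn_pid/fd" | wc -l)
  echo "DataNode pid $dn_pid holds $in_use open file descriptors"
else
  echo "no DataNode process found on this host"
fi
```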



On Sun, Apr 20, 2014 at 10:36 PM, sudhakara st <su...@gmail.com> wrote:
> check with  open file descriptor limit in data nodes and namenode.
>
> $ ulimit -a
>
> and
> check with 'dfs.datanode.max.xcievers or dfs.datanode.max.transfer.threads'
> property in hdfs-site.xml
>
>
>
>
> On Sun, Apr 20, 2014 at 9:40 PM, Amit Kabra <am...@gmail.com> wrote:
>>
>> Yes, error logs here : http://pastebin.com/RBdN5Euf
>>
>> On Sun, Apr 20, 2014 at 8:14 PM, Serge Blazhievsky <ha...@gmail.com>
>> wrote:
>> > Do you see any errors in datanodes logs?
>> >
>> > Sent from my iPhone
>> >
>> >> On Apr 20, 2014, at 2:57, Amit Kabra <am...@gmail.com> wrote:
>> >>
>> >> number
>
>
>
>
> --
>
> Regards,
> ...sudhakara
>

Re: All datanodes are bad. Aborting ...

Posted by sudhakara st <su...@gmail.com>.
Check the open file descriptor limit on the datanodes and the namenode:

$ ulimit -a

and check the 'dfs.datanode.max.xcievers' or
'dfs.datanode.max.transfer.threads' property in hdfs-site.xml.
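As an illustration, setting the transfer-thread cap would look like this in hdfs-site.xml ('dfs.datanode.max.xcievers' is the older, deprecated name for the same setting; the value below is illustrative, not a recommendation):

```xml
<!-- hdfs-site.xml: cap on concurrent block-transfer threads per datanode.
     4096 is an example value; tune to your workload. -->
<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>4096</value>
</property>
```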



On Sun, Apr 20, 2014 at 9:40 PM, Amit Kabra <am...@gmail.com> wrote:

> Yes, error logs here : http://pastebin.com/RBdN5Euf
>
> On Sun, Apr 20, 2014 at 8:14 PM, Serge Blazhievsky <ha...@gmail.com>
> wrote:
> > Do you see any errors in datanodes logs?
> >
> > Sent from my iPhone
> >
> >> On Apr 20, 2014, at 2:57, Amit Kabra <am...@gmail.com> wrote:
> >>
> >> number
>



-- 

Regards,
...sudhakara

Re: All datanodes are bad. Aborting ...

Posted by Amit Kabra <am...@gmail.com>.
Yes, error logs here : http://pastebin.com/RBdN5Euf

On Sun, Apr 20, 2014 at 8:14 PM, Serge Blazhievsky <ha...@gmail.com> wrote:
> Do you see any errors in datanodes logs?
>
> Sent from my iPhone
>
>> On Apr 20, 2014, at 2:57, Amit Kabra <am...@gmail.com> wrote:
>>
>> number

Re: All datanodes are bad. Aborting ...

Posted by Serge Blazhievsky <ha...@gmail.com>.
Do you see any errors in the datanode logs?

Sent from my iPhone

> On Apr 20, 2014, at 2:57, Amit Kabra <am...@gmail.com> wrote:
> 
> number
