Posted to hdfs-dev@hadoop.apache.org by Divya R <av...@gmail.com> on 2013/11/07 09:34:45 UTC

Re: Exception in mid of reading files.

Hi Chris,

  Thanks a lot for the help. But after a lot of investigation, I found that
the issue was with cached socket connections, which was raised as a bug by
Nicholas. The bug details are as follows:

HDFS-3373 <https://issues.apache.org/jira/browse/HDFS-3373> FileContext
HDFS implementation can leak socket caches

When I executed the command "netstat -a | grep 50010", the count was
approximately 52,000. This issue is fixed in
0.20.3<https://issues.apache.org/jira/browse/HDFS/fixforversion/12314814> and
0.20.205.0<https://issues.apache.org/jira/browse/HDFS/fixforversion/12316392>,
but the fix is not present in hadoop-1.2.X. Could you please guide me as to
what I could do?
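(For anyone reproducing this check: the pipeline below is a self-contained
sketch of the same count, run against a small hypothetical excerpt of netstat
output rather than a live cluster, so the filenames and addresses here are
made up for illustration.)

```shell
# Hypothetical three-line excerpt of `netstat -an` output, saved to a file
# so the counting pipeline can be shown end to end:
printf '%s\n' \
  'tcp  0  0 10.0.0.1:50010  10.0.0.2:41000  TIME_WAIT' \
  'tcp  0  0 10.0.0.1:22     10.0.0.2:41001  ESTABLISHED' \
  'tcp  0  0 10.0.0.1:50010  10.0.0.2:41002  TIME_WAIT' > /tmp/netstat_dump.txt

# Count only the lines involving the datanode transfer port (default 50010):
grep 50010 /tmp/netstat_dump.txt | wc -l
```

On a leaking client this number climbs steadily into the tens of thousands,
as seen above.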

-Divya


On Sat, Oct 26, 2013 at 12:38 AM, Chris Nauroth <cn...@hortonworks.com> wrote:

> Hi Divya,
>
> The exceptions indicate that the HDFS client failed to establish a network
> connection to a datanode hosting a block that the client is trying to read.
>  After too many of these failures (default 3, but configurable), the HDFS
> client aborts the read and this bubbles up to the caller with the "could
> not obtain block" error.
>
> I recommend troubleshooting this as a network connectivity issue.  This
> wiki page includes a few tips as a starting point:
>
> http://wiki.apache.org/hadoop/TroubleShooting
>
> Hope this helps,
>
> Chris Nauroth
> Hortonworks
> http://hortonworks.com/
>
>
>
> On Fri, Oct 25, 2013 at 4:53 AM, Divya R <av...@gmail.com> wrote:
>
> > Hi Guys,
> >
> >    I'm indexing data (~50-100 GB per day) from Hadoop, which is running
> > in cluster mode (currently with 2 datanodes). Every two or three hours I
> > get this exception, even though both datanodes are up and running. Can
> > anyone please guide me on what I should do, or tell me if I'm doing
> > something wrong?
> >
> > Code Snippet:
> >
> > public InitHadoop() {
> >     configuration = new Configuration();
> >     configuration.set("fs.default.name",
> >             "hdfs://<<namenode IP>>:54310"); // Is it right to specify the namenode IP here?
> >     configuration.set("mapred.job.tracker",
> >             "<<namenode IP>>:54311"); // job tracker address is host:port, no hdfs:// scheme
> >     try {
> >         fileSystem = FileSystem.get(configuration);
> >     } catch (IOException e) {
> >         e.printStackTrace();
> >     }
> > }
> >
> > private void indexDocument(FSDataInputStream file) {
> >     Scanner scanner = new Scanner(file);
> >     while (scanner.hasNext()) { // hasNext() returns boolean, not a reference
> >         //   Indexing code
> >     }
> >     scanner.close(); // closing the Scanner also closes the underlying stream
> > }
> >
> > Logs:
> >
> > 2013-10-25 09:37:57 WARN  DFSClient:2266 - Failed to connect to
> > /<<IP>>:50010, add to deadNodes and continuejava.net.BindException:
> Cannot
> > assign requested address
> > 2013-10-25 09:37:57 WARN  DFSClient:2266 - Failed to connect to
> > /<<IP>>:50010, add to deadNodes and continuejava.net.BindException:
> Cannot
> > assign requested address
> > 2013-10-25 09:37:57 INFO  DFSClient:2432 - Could not obtain block
> > blk_-8795538519317154213_432897 from any node: java.io.IOException: No
> live
> > nodes contain current block. Will get new block locations from namenode
> and
> > retry...
> > 2013-10-25 09:37:58 WARN  DFSClient:2266 - Failed to connect to
> > /<<IP>>:50010, add to deadNodes and continuejava.net.BindException:
> Cannot
> > assign requested address
> > 2013-10-25 09:37:58 WARN  DFSClient:2266 - Failed to connect to
> > /<<IP>>:50010, add to deadNodes and continuejava.net.BindException:
> Cannot
> > assign requested address
> > 2013-10-25 09:37:58 INFO  DFSClient:2432 - Could not obtain block
> > blk_-5974673190155585497_432671 from any node: java.io.IOException: No
> live
> > nodes contain current block. Will get new block locations from namenode
> and
> > retry...
> > 2013-10-25 09:37:59 WARN  DFSClient:2266 - Failed to connect to
> > /<<IP>>:50010, add to deadNodes and continuejava.net.BindException:
> Cannot
> > assign requested address
> > 2013-10-25 09:37:59 WARN  DFSClient:2266 - Failed to connect to
> > /<<IP>>:50010, add to deadNodes and continuejava.net.BindException:
> Cannot
> > assign requested address
> > 2013-10-25 09:37:59 INFO  DFSClient:2432 - Could not obtain block
> > blk_-1662761320365439855_431653 from any node: java.io.IOException: No
> live
> > nodes contain current block. Will get new block locations from namenode
> and
> > retry...
> > 2013-10-25 09:37:59 WARN  DFSClient:2266 - Failed to connect to
> > /<<IP>>:50010, add to deadNodes and continuejava.net.BindException:
> Cannot
> > assign requested address
> > 2013-10-25 09:37:59 WARN  DFSClient:2266 - Failed to connect to
> > /<<IP>>:50010, add to deadNodes and continuejava.net.BindException:
> Cannot
> > assign requested address
> > 2013-10-25 09:37:59 WARN  DFSClient:2400 - DFS Read: java.io.IOException:
> > Could not obtain block: blk_8826777676488299245_432528
> > file=/flume/<<File.Name>>.1382639351042
> >     at
> >
> >
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:2426)
> >     at
> >
> >
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2218)
> >     at
> > org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2381)
> >     at java.io.DataInputStream.read(DataInputStream.java:149)
> >     at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
> >     at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
> >     at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
> >     at java.io.InputStreamReader.read(InputStreamReader.java:184)
> >
> > Regards,
> > -Divya
> >
>
> --
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to
> which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
>
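The retry limit Chris mentions above ("default 3, but configurable")
corresponds, in branch-1, to the dfs.client.max.block.acquire.failures client
setting; a hedged hdfs-site.xml sketch (the property name is taken from the
branch-1 DFSClient source, so verify it against your build before relying on
it):

```xml
<!-- hdfs-site.xml on the client: raise the number of "could not obtain
     block" retries while the underlying socket issue is investigated. -->
<property>
  <name>dfs.client.max.block.acquire.failures</name>
  <value>10</value>
</property>
```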

Re: Exception in mid of reading files.

Posted by Chris Nauroth <cn...@hortonworks.com>.
Divya, thank you for reporting back on this.  Nicholas and I had an offline
conversation and came to the conclusion that this is likely to be a
different problem from HDFS-3373.  Although the symptoms look similar, the
socket caching code mentioned in HDFS-3373 is not present in branch-1.

I filed a new issue for your bug report: HDFS-5493.  Nicholas pointed out a
spot in the DFSClient code where we may have a socket leak.

https://issues.apache.org/jira/browse/HDFS-5493
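Until that fix lands, the usual client-side mitigation is to close every read
stream deterministically so sockets are not left dangling. A minimal,
library-free sketch of the pattern (a plain ByteArrayInputStream stands in
for FSDataInputStream here, and TrackingStream/indexDocument are illustrative
names, not Hadoop API):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Scanner;

public class CloseStreamsSketch {
    // Records whether close() was actually called, to make the point visible.
    public static class TrackingStream extends ByteArrayInputStream {
        public boolean closed = false;
        public TrackingStream(byte[] data) { super(data); }
        @Override public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    public static int indexDocument(InputStream file) {
        int tokens = 0;
        // try-with-resources guarantees the Scanner (and the stream it wraps)
        // is closed even if indexing throws -- the leak in the original snippet.
        try (Scanner scanner = new Scanner(file)) {
            while (scanner.hasNext()) {
                scanner.next();  // indexing code would go here
                tokens++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        TrackingStream in = new TrackingStream("alpha beta gamma".getBytes());
        int tokens = indexDocument(in);
        System.out.println(tokens + " " + in.closed);  // prints: 3 true
    }
}
```

The same shape applies to the real client: wrap each stream returned by
fileSystem.open(path) so it is closed on every code path, not only the happy
one.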

Chris Nauroth
Hortonworks
http://hortonworks.com/



