Posted to common-user@hadoop.apache.org by Stas Oskin <st...@gmail.com> on 2009/05/25 16:37:46 UTC

LeaseExpiredException Exception

Hi.

I have a process that writes to a file on DFS from time to time, using an
OutputStream.
After some time of writing, I start getting the exception below, and the
write fails. The DFSClient retries several times, and then fails.

Copying the file from local disk to DFS via CopyLocalFile() works fine.
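
For reference, the failing write path looks roughly like the sketch below
(simplified; the write interval and the records are placeholders, only the
target path matches the one in the error):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PeriodicDfsWriter {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // One stream is opened once and kept open between writes.
            FSDataOutputStream out = fs.create(new Path("/test/test.bin"));
            try {
                for (int i = 0; i < 100; i++) {
                    out.write(("record " + i + "\n").getBytes("UTF-8"));
                    Thread.sleep(60 * 1000L); // placeholder interval between writes
                }
            } finally {
                out.close();
            }
        }
    }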

Can anyone advise on the matter?

I'm using Hadoop 0.18.3.

Thanks in advance.


09/05/25 15:35:35 INFO dfs.DFSClient: org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on /test/test.bin File
does not exist. Holder DFSClient_-951664265 does not have any open files.
            at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1172)
            at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1103)
            at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
            at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            at java.lang.reflect.Method.invoke(Method.java:597)
            at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
            at org.apache.hadoop.ipc.Server$Handler.run(Server.java:890)

            at org.apache.hadoop.ipc.Client.call(Client.java:716)
            at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
            at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            at java.lang.reflect.Method.invoke(Method.java:597)
            at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
            at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
            at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
            at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2450)
            at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2333)
            at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1745)
            at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1922)

Re: LeaseExpiredException Exception

Posted by Mehul Sutariya <me...@gmail.com>.
Hey Jason,

I use Hadoop 0.20.1, and I have seen the LeaseExpiredException when the
RecordWriter was closed manually; in my case I had a customized OutputFormat.
After I closed the writer, the framework tried to close it as well and failed.
My best guess here is that somewhere in your job you are closing the writer
yourself rather than allowing the framework to do so.
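
In other words, the failure mode is roughly the sketch below. The class names
are made up, but the point is that the writer's underlying stream gets closed
once by user code and a second time by the framework when the task finishes:

    import java.io.IOException;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.RecordWriter;
    import org.apache.hadoop.mapred.Reporter;

    // Hypothetical RecordWriter from a customized OutputFormat (old mapred API).
    class MyRecordWriter implements RecordWriter<Text, Text> {
        private final FSDataOutputStream out;

        MyRecordWriter(FSDataOutputStream out) {
            this.out = out;
        }

        public void write(Text key, Text value) throws IOException {
            out.write((key + "\t" + value + "\n").getBytes("UTF-8"));
        }

        public void close(Reporter reporter) throws IOException {
            // In the buggy job this ends up being called twice: once by user
            // code that kept a reference to the writer, and once by the
            // framework at the end of the task. The second close finds the
            // underlying HDFS stream already gone and fails.
            out.close();
        }
    }

If the stream is only closed in close(Reporter) and the framework is left to
drive it, the file is finalized once and the lease is released normally.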

Mehul.

On Tue, Dec 8, 2009 at 11:43 AM, Ken Krugler <kk...@transpac.com>wrote:

> Hi Jason,
>
> Thanks for the info - it's good to hear from somebody else who's run into
> this :)
>
> I tried again with a bigger box for the master, and wound up with the same
> results.
>
> I guess the framework could be killing it - but no idea why. This is during
> a very simple "write out the results" phase, so very high I/O but not much
> computation, and nothing should be hung.
>
> Any particular configuration values you had to tweak? I'm running this in
> Elastic MapReduce (EMR) so most settings are whatever they provide by
> default. I override a few things in my JobConf, but (for example) anything
> related to HDFS/MR framework will be locked & loaded by the time my job is
> executing.
>
> Thanks!
>
> -- Ken
>
>
> On Dec 8, 2009, at 9:34am, Jason Venner wrote:
>
>  Is it possible that this is occurring in a task that is being killed by
>> the
>> framework.
>> Sometimes there is a little lag, between the time the tracker 'kills a
>> task'
>> and the task fully dies, you could be getting into a situation like that
>> where the task is in the process of dying but the last write is still in
>> progress.
>> I see this situation happen when the task tracker machine is heavily
>> loaded.
>> In once case there was a 15 minute lag between the timestamp in the
>> tracker
>> for killing task XYZ, and the task actually going away.
>>
>> It took me a while to work this out as I had to merge the tracker and task
>> logs by time to actually see the pattern.
>> The host machines where under very heavy io pressure, and may have been
>> paging also. The code and configuration issues that triggered this have
>> been
>> resolved, so I don't see it anymore.
>>
>> On Tue, Dec 8, 2009 at 8:32 AM, Ken Krugler <kkrugler_lists@transpac.com
>> >wrote:
>>
>>  Hi all,
>>>
>>> In searching the mail/web archives, I see occasionally questions from
>>> people (like me) who run into the LeaseExpiredException (in my case, on
>>> 0.18.3 while running a 50 server cluster in EMR).
>>>
>>> Unfortunately I don't see any responses, other than Dennis Kubes saying
>>> that he thought some work had been done in this area of Hadoop "a while
>>> back". And this was in 2007, so it hopefully doesn't apply to my
>>> situation.
>>>
>>> java.io.IOException: Stream closed.
>>>      at
>>>
>>> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.isClosed(DFSClient.java:2245)
>>>      at
>>>
>>> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2481)
>>>      at
>>>
>>> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:155)
>>>      at
>>> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
>>>      at
>>> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
>>>      at
>>> org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
>>>      at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
>>>      at
>>>
>>> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
>>>      at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>>      at
>>>
>>> org.apache.hadoop.io.SequenceFile$BlockCompressWriter.writeBuffer(SequenceFile.java:1260)
>>>      at
>>>
>>> org.apache.hadoop.io.SequenceFile$BlockCompressWriter.sync(SequenceFile.java:1277)
>>>      at
>>>
>>> org.apache.hadoop.io.SequenceFile$BlockCompressWriter.close(SequenceFile.java:1295)
>>>      at
>>>
>>> org.apache.hadoop.mapred.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:73)
>>>      at
>>>
>>> org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:276)
>>>      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:238)
>>>      at
>>> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2216)
>>>
>>> This issue seemed related, but would have been fixed in the 0.18.3
>>> release.
>>>
>>> http://issues.apache.org/jira/browse/HADOOP-3760
>>>
>>> I saw a similar HBase issue -
>>> https://issues.apache.org/jira/browse/HBASE-529 - but they "fixed" it by
>>> retrying a failure case.
>>>
>>> These exceptions occur during "write storms", where lots of files are
>>> being
>>> written out. Though "lots" is relative, e.g. 10-20M.
>>>
>>> It's repeatable, in that it fails on the same step of a series of chained
>>> MR jobs.
>>>
>>> Is it possible I need to be running a bigger box for my namenode server?
>>> Any other ideas?
>>>
>>> Thanks,
>>>
>>> -- Ken
>>>
>>>
>>> On May 25, 2009, at 7:37am, Stas Oskin wrote:
>>>
>>> Hi.
>>>
>>>>
>>>> I have a process that writes to file on DFS from time to time, using
>>>> OutputStream.
>>>> After some time of writing, I'm starting getting the exception below,
>>>> and
>>>> the write fails. The DFSClient retries several times, and then fails.
>>>>
>>>> Copying the file from local disk to DFS via CopyLocalFile() works fine.
>>>>
>>>> Can anyone advice on the matter?
>>>>
>>>> I'm using Hadoop 0.18.3.
>>>>
>>>> Thanks in advance.
>>>>
>>>>
>>>> 09/05/25 15:35:35 INFO dfs.DFSClient:
>>>> org.apache.hadoop.ipc.RemoteException:
>>>> org.apache.hadoop.dfs.LeaseExpiredException: No lease on /test/test.bin
>>>> File
>>>> does not exist. Holder DFSClient_-951664265 does not have any open
>>>> files.
>>>>
>>>>         at
>>>> org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1172)
>>>>
>>>>         at
>>>>
>>>>
>>>> org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1103
>>>> )
>>>>
>>>>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>>>>
>>>>         at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>>>>
>>>>         at
>>>>
>>>>
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25
>>>> )
>>>>
>>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>>
>>>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
>>>>
>>>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:890)
>>>>
>>>>
>>>>
>>>>         at org.apache.hadoop.ipc.Client.call(Client.java:716)
>>>>
>>>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>>
>>>>         at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
>>>>
>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>
>>>>         at
>>>>
>>>>
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39
>>>> )
>>>>
>>>>         at
>>>>
>>>>
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25
>>>> )
>>>>
>>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>>
>>>>         at
>>>>
>>>>
>>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82
>>>> )
>>>>
>>>>         at
>>>>
>>>>
>>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59
>>>> )
>>>>
>>>>         at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
>>>>
>>>>         at
>>>>
>>>>
>>>> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2450
>>>> )
>>>>
>>>>         at
>>>>
>>>>
>>>> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2333
>>>> )
>>>>
>>>>         at
>>>>
>>>>
>>>> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1745
>>>> )
>>>>
>>>>         at
>>>>
>>>>
>>>> org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1922
>>>> )
>>>>
>>>>
>>>>  --------------------------------------------
>>> Ken Krugler
>>> +1 530-210-6378
>>> http://bixolabs.com
>>> e l a s t i c   w e b   m i n i n g
>>>
>>>
>>>
>>>
>>>
>>>
>>> --------------------------------------------
>>> Ken Krugler
>>> +1 530-210-6378
>>> http://bixolabs.com
>>> e l a s t i c   w e b   m i n i n g
>>>
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> Pro Hadoop, a book to guide you from beginner to hadoop mastery,
>> http://www.amazon.com/dp/1430219424?tag=jewlerymall
>> www.prohadoopbook.com a community for Hadoop Professionals
>>
>
> --------------------------------------------
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> e l a s t i c   w e b   m i n i n g
>
>
>
>
>

Re: LeaseExpiredException Exception

Posted by Ken Krugler <kk...@transpac.com>.
Hi Jason,

Thanks for the info - it's good to hear from somebody else who's run  
into this :)

I tried again with a bigger box for the master, and wound up with the  
same results.

I guess the framework could be killing it - but no idea why. This is  
during a very simple "write out the results" phase, so very high I/O  
but not much computation, and nothing should be hung.

Any particular configuration values you had to tweak? I'm running this  
in Elastic MapReduce (EMR) so most settings are whatever they provide  
by default. I override a few things in my JobConf, but (for example)  
anything related to HDFS/MR framework will be locked & loaded by the  
time my job is executing.

Thanks!

-- Ken

On Dec 8, 2009, at 9:34am, Jason Venner wrote:

> Is it possible that this is occurring in a task that is being killed  
> by the
> framework.
> Sometimes there is a little lag, between the time the tracker 'kills  
> a task'
> and the task fully dies, you could be getting into a situation like  
> that
> where the task is in the process of dying but the last write is  
> still in
> progress.
> I see this situation happen when the task tracker machine is heavily  
> loaded.
> In once case there was a 15 minute lag between the timestamp in the  
> tracker
> for killing task XYZ, and the task actually going away.
>
> It took me a while to work this out as I had to merge the tracker  
> and task
> logs by time to actually see the pattern.
> The host machines where under very heavy io pressure, and may have  
> been
> paging also. The code and configuration issues that triggered this  
> have been
> resolved, so I don't see it anymore.
>
> On Tue, Dec 8, 2009 at 8:32 AM, Ken Krugler <kkrugler_lists@transpac.com 
> >wrote:
>
>> Hi all,
>>
>> In searching the mail/web archives, I see occasionally questions from
>> people (like me) who run into the LeaseExpiredException (in my  
>> case, on
>> 0.18.3 while running a 50 server cluster in EMR).
>>
>> Unfortunately I don't see any responses, other than Dennis Kubes  
>> saying
>> that he thought some work had been done in this area of Hadoop "a  
>> while
>> back". And this was in 2007, so it hopefully doesn't apply to my  
>> situation.
>>
>> I see these LeaseExpiredException errors showing up in the logs  
>> around the
>> same time as IOException errors, eg:
>>
>> java.io.IOException: Stream closed.
>>       at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.isClosed(DFSClient.java:2245)
>>       at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2481)
>>       at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:155)
>>       at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
>>       at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
>>       at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
>>       at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
>>       at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
>>       at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>       at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.writeBuffer(SequenceFile.java:1260)
>>       at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.sync(SequenceFile.java:1277)
>>       at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.close(SequenceFile.java:1295)
>>       at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:73)
>>       at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:276)
>>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:238)
>>       at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2216)
>>
>> This issue seemed related, but would have been fixed in the 0.18.3  
>> release.
>>
>> http://issues.apache.org/jira/browse/HADOOP-3760
>>
>> I saw a similar HBase issue -
>> https://issues.apache.org/jira/browse/HBASE-529 - but they "fixed"  
>> it by
>> retrying a failure case.
>>
>> These exceptions occur during "write storms", where lots of files  
>> are being
>> written out. Though "lots" is relative, e.g. 10-20M.
>>
>> It's repeatable, in that it fails on the same step of a series of  
>> chained
>> MR jobs.
>>
>> Is it possible I need to be running a bigger box for my namenode  
>> server?
>> Any other ideas?
>>
>> Thanks,
>>
>> -- Ken
>>
>>
>> On May 25, 2009, at 7:37am, Stas Oskin wrote:
>>
>> Hi.
>>>
>>> I have a process that writes to file on DFS from time to time, using
>>> OutputStream.
>>> After some time of writing, I'm starting getting the exception  
>>> below, and
>>> the write fails. The DFSClient retries several times, and then  
>>> fails.
>>>
>>> Copying the file from local disk to DFS via CopyLocalFile() works  
>>> fine.
>>>
>>> Can anyone advice on the matter?
>>>
>>> I'm using Hadoop 0.18.3.
>>>
>>> Thanks in advance.
>>>
>>>
>>> 09/05/25 15:35:35 INFO dfs.DFSClient: org.apache.hadoop.ipc.RemoteException:
>>> org.apache.hadoop.dfs.LeaseExpiredException: No lease on /test/test.bin File
>>> does not exist. Holder DFSClient_-951664265 does not have any open files.
>>>          at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1172)
>>>          at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1103)
>>>          at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>>>          at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>>>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>          at java.lang.reflect.Method.invoke(Method.java:597)
>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
>>>          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:890)
>>>
>>>          at org.apache.hadoop.ipc.Client.call(Client.java:716)
>>>          at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>>          at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
>>>          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>          at java.lang.reflect.Method.invoke(Method.java:597)
>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>>>          at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
>>>          at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2450)
>>>          at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2333)
>>>          at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1745)
>>>          at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1922)
>>>
>>>
>> --------------------------------------------
>> Ken Krugler
>> +1 530-210-6378
>> http://bixolabs.com
>> e l a s t i c   w e b   m i n i n g
>>
>>
>>
>>
>>
>>
>> --------------------------------------------
>> Ken Krugler
>> +1 530-210-6378
>> http://bixolabs.com
>> e l a s t i c   w e b   m i n i n g
>>
>>
>>
>>
>>
>
>
> -- 
> Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> http://www.amazon.com/dp/1430219424?tag=jewlerymall
> www.prohadoopbook.com a community for Hadoop Professionals

--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g





Re: LeaseExpiredException Exception

Posted by Jason Venner <ja...@gmail.com>.
Is it possible that this is occurring in a task that is being killed by the
framework?
Sometimes there is a little lag between the time the tracker 'kills a task'
and the time the task fully dies; you could be getting into a situation like
that, where the task is in the process of dying but the last write is still in
progress.
I see this situation happen when the task tracker machine is heavily loaded.
In one case there was a 15 minute lag between the timestamp in the tracker
for killing task XYZ and the task actually going away.

It took me a while to work this out, as I had to merge the tracker and task
logs by time to actually see the pattern.
The host machines were under very heavy I/O pressure, and may have been
paging also. The code and configuration issues that triggered this have been
resolved, so I don't see it anymore.
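
For what it's worth, the log merge can be as simple as the sketch below (a
hypothetical helper, not part of Hadoop; it assumes every log line starts with
the default log4j timestamp "yyyy-MM-dd HH:mm:ss,SSS", so a plain lexicographic
sort is also a chronological sort):

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    // Merge tracker and task logs into one chronologically ordered stream,
    // e.g. java MergeLogsByTime tasktracker.log task_attempt.log > merged.log
    public class MergeLogsByTime {
        public static void main(String[] args) throws IOException {
            List<String> merged = new ArrayList<String>();
            for (String file : args) {
                merged.addAll(Files.readAllLines(Paths.get(file), StandardCharsets.UTF_8));
            }
            // Timestamp-prefixed lines sort chronologically across both files.
            Collections.sort(merged);
            for (String line : merged) {
                System.out.println(line);
            }
        }
    }

Continuation lines without a timestamp (stack traces) will sort out of place,
but it is enough to see the gap between the 'killing task' entry and the
task's last write.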

On Tue, Dec 8, 2009 at 8:32 AM, Ken Krugler <kk...@transpac.com>wrote:

> Hi all,
>
> In searching the mail/web archives, I see occasionally questions from
> people (like me) who run into the LeaseExpiredException (in my case, on
> 0.18.3 while running a 50 server cluster in EMR).
>
> Unfortunately I don't see any responses, other than Dennis Kubes saying
> that he thought some work had been done in this area of Hadoop "a while
> back". And this was in 2007, so it hopefully doesn't apply to my situation.
>
> I see these LeaseExpiredException errors showing up in the logs around the
> same time as IOException errors, eg:
>
> java.io.IOException: Stream closed.
>        at
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.isClosed(DFSClient.java:2245)
>        at
> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2481)
>        at
> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:155)
>        at
> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
>        at
> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
>        at
> org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
>        at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
>        at
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
>        at java.io.DataOutputStream.write(DataOutputStream.java:90)
>        at
> org.apache.hadoop.io.SequenceFile$BlockCompressWriter.writeBuffer(SequenceFile.java:1260)
>        at
> org.apache.hadoop.io.SequenceFile$BlockCompressWriter.sync(SequenceFile.java:1277)
>        at
> org.apache.hadoop.io.SequenceFile$BlockCompressWriter.close(SequenceFile.java:1295)
>        at
> org.apache.hadoop.mapred.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:73)
>        at
> org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:276)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:238)
>        at
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2216)
>
> This issue seemed related, but would have been fixed in the 0.18.3 release.
>
> http://issues.apache.org/jira/browse/HADOOP-3760
>
> I saw a similar HBase issue -
> https://issues.apache.org/jira/browse/HBASE-529 - but they "fixed" it by
> retrying a failure case.
>
> These exceptions occur during "write storms", where lots of files are being
> written out. Though "lots" is relative, e.g. 10-20M.
>
> It's repeatable, in that it fails on the same step of a series of chained
> MR jobs.
>
> Is it possible I need to be running a bigger box for my namenode server?
> Any other ideas?
>
> Thanks,
>
> -- Ken
>
>
> On May 25, 2009, at 7:37am, Stas Oskin wrote:
>
>  Hi.
>>
>> I have a process that writes to file on DFS from time to time, using
>> OutputStream.
>> After some time of writing, I'm starting getting the exception below, and
>> the write fails. The DFSClient retries several times, and then fails.
>>
>> Copying the file from local disk to DFS via CopyLocalFile() works fine.
>>
>> Can anyone advice on the matter?
>>
>> I'm using Hadoop 0.18.3.
>>
>> Thanks in advance.
>>
>>
>> 09/05/25 15:35:35 INFO dfs.DFSClient:
>> org.apache.hadoop.ipc.RemoteException:
>> org.apache.hadoop.dfs.LeaseExpiredException: No lease on /test/test.bin
>> File
>> does not exist. Holder DFSClient_-951664265 does not have any open files.
>>
>>           at
>> org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1172)
>>
>>           at
>>
>> org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1103
>> )
>>
>>           at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>>
>>           at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>>
>>           at
>>
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25
>> )
>>
>>           at java.lang.reflect.Method.invoke(Method.java:597)
>>
>>           at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
>>
>>           at org.apache.hadoop.ipc.Server$Handler.run(Server.java:890)
>>
>>
>>
>>           at org.apache.hadoop.ipc.Client.call(Client.java:716)
>>
>>           at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>
>>           at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
>>
>>           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>
>>           at
>>
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39
>> )
>>
>>           at
>>
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25
>> )
>>
>>           at java.lang.reflect.Method.invoke(Method.java:597)
>>
>>           at
>>
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82
>> )
>>
>>           at
>>
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59
>> )
>>
>>           at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
>>
>>           at
>>
>> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2450
>> )
>>
>>           at
>>
>> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2333
>> )
>>
>>           at
>>
>> org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1745
>> )
>>
>>           at
>>
>> org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1922
>> )
>>
>>
> --------------------------------------------
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> e l a s t i c   w e b   m i n i n g
>
>
>
>
>
>
> --------------------------------------------
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> e l a s t i c   w e b   m i n i n g
>
>
>
>
>


-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals

Re: LeaseExpiredException Exception

Posted by Ken Krugler <kk...@transpac.com>.
Hi all,

In searching the mail/web archives, I occasionally see questions from
people (like me) who run into the LeaseExpiredException (in my case,
on 0.18.3 while running a 50-server cluster in EMR).

Unfortunately I don't see any responses, other than Dennis Kubes
saying that he thought some work had been done in this area of Hadoop
"a while back". And that was in 2007, so it hopefully doesn't apply to
my situation.

I see these LeaseExpiredException errors showing up in the logs around
the same time as IOException errors, e.g.:

java.io.IOException: Stream closed.
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.isClosed(DFSClient.java:2245)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2481)
        at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:155)
        at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
        at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
        at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
        at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.writeBuffer(SequenceFile.java:1260)
        at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.sync(SequenceFile.java:1277)
        at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.close(SequenceFile.java:1295)
        at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:73)
        at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:276)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:238)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2216)

This issue seemed related, but would have been fixed in the 0.18.3  
release.

http://issues.apache.org/jira/browse/HADOOP-3760

I saw a similar HBase issue - https://issues.apache.org/jira/browse/HBASE-529 
  - but they "fixed" it by retrying a failure case.

These exceptions occur during "write storms", where lots of files are  
being written out. Though "lots" is relative, e.g. 10-20M.

It's repeatable, in that it fails on the same step of a series of  
chained MR jobs.

Is it possible I need to be running a bigger box for my namenode  
server? Any other ideas?

Thanks,

-- Ken

On May 25, 2009, at 7:37am, Stas Oskin wrote:

> Hi.
>
> I have a process that writes to file on DFS from time to time, using
> OutputStream.
> After some time of writing, I'm starting getting the exception  
> below, and
> the write fails. The DFSClient retries several times, and then fails.
>
> Copying the file from local disk to DFS via CopyLocalFile() works  
> fine.
>
> Can anyone advice on the matter?
>
> I'm using Hadoop 0.18.3.
>
> Thanks in advance.
>
>
> 09/05/25 15:35:35 INFO dfs.DFSClient: org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.dfs.LeaseExpiredException: No lease on /test/test.bin File
> does not exist. Holder DFSClient_-951664265 does not have any open files.
>            at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1172)
>            at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1103)
>            at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>            at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>            at java.lang.reflect.Method.invoke(Method.java:597)
>            at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
>            at org.apache.hadoop.ipc.Server$Handler.run(Server.java:890)
>
>            at org.apache.hadoop.ipc.Client.call(Client.java:716)
>            at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>            at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
>            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>            at java.lang.reflect.Method.invoke(Method.java:597)
>            at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>            at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>            at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
>            at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2450)
>            at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2333)
>            at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1745)
>            at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1922)
>

--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g





