Posted to general@hadoop.apache.org by Ted Yu <yu...@gmail.com> on 2010/05/17 18:57:43 UTC

datanode goes down, maybe due to Unexpected problem in creating temporary file

Hi,
We use CDH2 hadoop-0.20.2+228; the datanode on smsrv10.ciq.com crashed.

I found this in the datanode log:

2010-05-15 07:37:35,955 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_926027507678171558_3620 src: /10.32.56.170:53378 dest: /10.32.56.171:50010
2010-05-15 07:37:35,956 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_926027507678171558_3620 received exception java.io.IOException: Unexpected problem in creating temporary file for blk_926027507678171558_3620.  File /home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558 should not be present, but is.
2010-05-15 07:37:35,956 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.32.56.171:50010, storageID=DS-1723593983-10.32.56.171-50010-1273792791835, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Unexpected problem in creating temporary file for blk_926027507678171558_3620.  File /home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558 should not be present, but is.
        at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:398)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:376)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.createTmpFile(FSDataset.java:1133)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.writeToBlock(FSDataset.java:1022)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:98)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:259)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
        at java.lang.Thread.run(Thread.java:619)

Can someone provide a clue?

Thanks
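
For context on what this exception means: before accepting an incoming replica, the datanode creates a zero-length temporary file for the block under dfs/data/tmp and refuses outright if a file for that block is already there. A paraphrased sketch of the guard the stack trace points at (0.20's FSDataset.createTmpFile; close to, but not, the exact source):

    import java.io.File;
    import java.io.IOException;

    class TmpFileGuard {
      // Refuse to start a new replica if a temp file for the block already exists,
      // then create the zero-length temp file the block data will be written into.
      static File createTmpFile(String blockName, File tmpDir) throws IOException {
        File f = new File(tmpDir, blockName);
        if (f.exists()) {
          throw new IOException("Unexpected problem in creating temporary file for "
              + blockName + ".  File " + f + " should not be present, but is.");
        }
        if (!f.createNewFile()) {
          throw new IOException("Could not create temporary file for " + blockName);
        }
        return f;
      }
    }

The practical upshot: a stale blk_926027507678171558 left in dfs/data/tmp by an earlier failed write will make every retry of that block fail the same way. Stock 0.20 clears dfs/data/tmp on datanode restart; builds with append enabled treat those files differently, which is what Todd's append question below is getting at.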

Re: datanode goes down, maybe due to Unexpected problem in creating temporary file

Posted by Todd Lipcon <to...@cloudera.com>.
Have you disabled the statechange log on the NN? This block has to be in
there.

Also, are you by any chance running with append enabled on unpatched 0.20?

-Todd
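
For reference, both questions map to concrete knobs. The NameNode's per-block state-change messages go through the log4j logger org.apache.hadoop.hdfs.StateChange, and append on 0.20-based builds is gated by dfs.support.append; the names below are the stock ones and assumed here, so verify them against your own configs:

    # log4j.properties on the NameNode: if this logger was raised above INFO,
    # block allocations and additions stop appearing in the NN log
    log4j.logger.org.apache.hadoop.hdfs.StateChange=INFO

    <!-- hdfs-site.xml: append is known to be unsafe on unpatched 0.20 -->
    <property>
      <name>dfs.support.append</name>
      <value>false</value>
    </property>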

On Mon, May 17, 2010 at 12:40 PM, Ted Yu <yu...@gmail.com> wrote:

> [quoted thread trimmed; Ted's reply appears in full below]



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: datanode goes down, maybe due to Unexpected problem in creating temporary file

Posted by Ted Yu <yu...@gmail.com>.
[sjc1-qa-certiq1.sjc1:hadoop 25763]df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                     303346048 281730720   6206176  98% /

In the hadoop-hadoop-jobtracker- log, the following repeated for over 50 minutes:
2010-05-18 00:57:38,451 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
2010-05-18 00:57:38,452 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: hdfs://sjc1-qa-certiq1.sjc1.carrieriq.com:9000/tmp/hadoop/mapred/system
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /tmp/hadoop/mapred/system. Name node is in safe mode.
The reported blocks 793 needs additional 2 blocks to reach the threshold 0.9990 of total blocks 796. Safe mode will be turned off automatically.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1711)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1691)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:565)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)

    at org.apache.hadoop.ipc.Client.call(Client.java:740)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at $Proxy4.delete(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy4.delete(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:586)
    at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:227)
    at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1710)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:188)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:180)
    at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3735)
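
The arithmetic in that safe-mode message is worth spelling out: with a truncating int conversion, the threshold is (int)(0.9990 * 796) = 795 blocks, and 795 - 793 = 2 more are needed. A minimal sketch of the check (variable names illustrative, mirroring what 0.20's FSNamesystem safe-mode logic computes, not its exact fields):

    int totalBlocks = 796;
    double threshold = 0.9990;
    int blockThreshold = (int) (threshold * totalBlocks); // (int) 795.204 = 795
    int blockSafe = 793;                                  // blocks reported by datanodes so far
    int needed = blockThreshold - blockSafe;              // 795 - 793 = 2, matching the log

Until those two blocks are reported, the NameNode stays in safe mode and the JobTracker's delete keeps failing. hadoop dfsadmin -safemode get shows the current state; hadoop dfsadmin -safemode leave forces the NameNode out manually, though the two unreported blocks remain unavailable until replicas for them show up.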

In the datanode log, two entries were written after the block report of 793 blocks was processed:
2010-05-17 16:56:22,135 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_5132109619501575576_2882
2010-05-17 17:06:22,245 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-4733675789688469676_1685
2010-05-17 17:16:22,383 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_5502091787302369348_2833
2010-05-17 17:26:22,507 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-6935269688635441525_1778
2010-05-17 17:36:22,709 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_8901766802925964345_1804
2010-05-17 17:41:43,929 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: *BlockReport of 793 blocks got processed in 107 msecs*
2010-05-17 17:46:23,065 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-8536166246203803199_2384
2010-05-17 17:56:22,445 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_3069643184898592926_1667

I wonder what prevented the last two blocks from being recognized by the NameNode?

Here is the log from the other datanode:

2010-05-17 14:38:10,838 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.32.56.170:50010, storageID=DS-1307210060-10.32.56.170-50010-1274132290796, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/current'}
2010-05-17 14:38:10,839 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
2010-05-17 14:38:10,887 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks got processed in 5 msecs
2010-05-17 14:38:10,888 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block scanner.
2010-05-17 15:15:51,037 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks got processed in 3 msecs

Currently our QA testing is blocked.

Can someone help?

Thanks

On Mon, May 17, 2010 at 12:40 PM, Ted Yu <yu...@gmail.com> wrote:

> [quoted thread trimmed; Ted's reply and Todd's request appear in full below]

Re: datanode goes down, maybe due to Unexpected problem in creating temporary file

Posted by Ted Yu <yu...@gmail.com>.
That blk doesn't appear in the NameNode log.

For the datanode:
2010-05-15 00:09:31,023 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_926027507678171558_3620 src: /10.32.56.170:49172 dest: /10.32.56.171:50010
2010-05-15 00:09:31,024 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_926027507678171558_3620 received exception java.io.IOException: Unexpected problem in creating temporary file for blk_926027507678171558_3620.  File /home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558 should not be present, but is.
2010-05-15 00:09:31,024 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-5814095875968936685_2910 received exception java.io.IOException: Unexpected problem in creating temporary file for blk_-5814095875968936685_2910.  File /home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_-5814095875968936685 should not be present, but is.
2010-05-15 00:09:31,025 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.32.56.171:50010, storageID=DS-1723593983-10.32.56.171-50010-1273792791835, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Unexpected problem in creating temporary file for blk_926027507678171558_3620.  File /home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558 should not be present, but is.
        at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:398)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:376)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.createTmpFile(FSDataset.java:1133)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.writeToBlock(FSDataset.java:1022)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:98)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:259)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
        at java.lang.Thread.run(Thread.java:619)
2010-05-15 00:09:31,025 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.32.56.171:50010, storageID=DS-1723593983-10.32.56.171-50010-1273792791835, infoPort=50075, ipcPort=50020):DataXceiver

2010-05-15 00:19:28,334 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_926027507678171558_3620 src: /10.32.56.170:36887 dest: /10.32.56.171:50010
2010-05-15 00:19:28,334 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_926027507678171558_3620 received exception java.io.IOException: Unexpected problem in creating temporary file for blk_926027507678171558_3620.  File /home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558 should not be present, but is.
2010-05-15 00:19:28,334 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.32.56.171:50010, storageID=DS-1723593983-10.32.56.171-50010-1273792791835, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Unexpected problem in creating temporary file for blk_926027507678171558_3620.  File /home/hadoop/m2m_3.0.x/3.0.trunk.39-270238/data/hadoop-data/dfs/data/tmp/blk_926027507678171558 should not be present, but is.
        at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:398)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.createTmpFile(FSDataset.java:376)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.createTmpFile(FSDataset.java:1133)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.writeToBlock(FSDataset.java:1022)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:98)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:259)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
        at java.lang.Thread.run(Thread.java:619)
2010-05-15 00:29:25,635 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_926027507678171558_3620 src: /10.32.56.170:34823 dest: /10.32.56.171:50010

On Mon, May 17, 2010 at 11:43 AM, Todd Lipcon <to...@cloudera.com> wrote:

> [quoted thread trimmed; Todd's request appears in full below]

Re: datanode goes down, maybe due to Unexpected problem in creating temporary file

Posted by Todd Lipcon <to...@cloudera.com>.
Hi Ted,

Can you please grep your NN and DN logs for blk_926027507678171558 and
pastebin the results?

-Todd

On Mon, May 17, 2010 at 9:57 AM, Ted Yu <yu...@gmail.com> wrote:

> [quoted original message trimmed; see the top of this thread]



-- 
Todd Lipcon
Software Engineer, Cloudera