Posted to issues@flume.apache.org by "Anes Mukhametov (Jira)" <ji...@apache.org> on 2020/05/11 14:36:00 UTC

[jira] [Updated] (FLUME-3366) hdfs sink can't close (and rename?) file after hdfs.DFSClient timeout

     [ https://issues.apache.org/jira/browse/FLUME-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anes Mukhametov updated FLUME-3366:
-----------------------------------
    Summary: hdfs sink can't close (and rename?) file after hdfs.DFSClient timeout  (was: hdfs sink can't rename file after hdfs.DFSClient timeout)

> hdfs sink can't close (and rename?) file after hdfs.DFSClient timeout
> ---------------------------------------------------------------------
>
>                 Key: FLUME-3366
>                 URL: https://issues.apache.org/jira/browse/FLUME-3366
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.6.0
>         Environment: cdh 5.16.2
> Sink configuration
> 2hdfs.sinks.scribe2hdfsprd.type         = hdfs
> 2hdfs.sinks.scribe2hdfsprd.channel      = kafka
> 2hdfs.sinks.scribe2hdfsprd.hdfs.path = hdfs://clname/data/logs/
> 2hdfs.sinks.scribe2hdfsprd.hdfs.filePrefix = logone-%Y-%m-%d-%H
> 2hdfs.sinks.scribe2hdfsprd.hdfs.rollSize = 268435456
> 2hdfs.sinks.scribe2hdfsprd.hdfs.rollInterval = 3600
> 2hdfs.sinks.scribe2hdfsprd.hdfs.rollCount = 0
> 2hdfs.sinks.scribe2hdfsprd.hdfs.idleTimeout = 300
> 2hdfs.sinks.scribe2hdfsprd.hdfs.batchSize = 1000
> 2hdfs.sinks.scribe2hdfsprd.hdfs.minBlockReplicas = 3
> 2hdfs.sinks.scribe2hdfsprd.hdfs.threadsPoolSize = 600
> 2hdfs.sinks.scribe2hdfsprd.hdfs.rollTimerPoolSize = 10
> 2hdfs.sinks.scribe2hdfsprd.hdfs.callTimeout = 300000
> 2hdfs.sinks.scribe2hdfsprd.hdfs.fileType = DataStream
> 2hdfs.sinks.scribe2hdfsprd.hdfs.writeFormat = Text
> 2hdfs.sinks.scribe2hdfsprd.hdfs.proxyUser = stat
> 2hdfs.sinks.scribe2hdfsprd.sink.serializer = text
> 2hdfs.sinks.scribe2hdfsprd.sink.serializer.appendNewline = false
>            Reporter: Anes Mukhametov
>            Priority: Major
>
> There seems to be a problem related to HDFS client timeouts. Consider the following log excerpt:
> {{
> 2020-05-03 20:50:39,819 INFO org.apache.flume.sink.hdfs.BucketWriter: Creating hdfs://clname/data/logs/logone-2020-05-03-20.1588528239807.tmp
> 2020-05-03 21:01:52,208 INFO org.apache.flume.sink.hdfs.BucketWriter: Closing idle bucketWriter hdfs://clname/data/logs/logone-2020-05-03-20.1588528239807.tmp at 1588528912208
> 2020-05-03 21:01:52,208 INFO org.apache.flume.sink.hdfs.HDFSEventSink: Writer callback called.
> 2020-05-03 21:01:52,208 INFO org.apache.flume.sink.hdfs.BucketWriter: Closing hdfs://clname/data/logs/logone-2020-05-03-20.1588528239807.tmp
> 2020-05-03 21:03:07,282 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for block BP-1441209260-10.30.0.19-1387400921010:blk_5841169679_1121097069614
> java.net.SocketTimeoutException: 75000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.30.3.187:40062 remote=/10.30.3.187:50010]
>         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
>         at java.io.FilterInputStream.read(FilterInputStream.java:83)
>         at java.io.FilterInputStream.read(FilterInputStream.java:83)
>         at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2354)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:235)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:1093)
> 2020-05-03 21:03:07,283 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block BP-1441209260-10.30.0.19-1387400921010:blk_5841169679_1121097069614 in pipeline DatanodeInfoWithStorage[10.30.3.187:50010,DS-306b4cb3-ba0d-4970-80cd-894083759467,DISK], DatanodeInfoWithStorage[10.30.3.196:50010,DS-bdc91c20-e8d0-4d43-a1b0-41cabb8306b3,DISK], DatanodeInfoWithStorage[10.30.3.10:50010,DS-5592761f-80c3-473c-b34b-536d52f3908e,DISK]: bad datanode DatanodeInfoWithStorage[10.30.3.187:50010,DS-306b4cb3-ba0d-4970-80cd-894083759467,DISK]
> 2020-05-03 21:05:22,463 WARN org.apache.flume.sink.hdfs.BucketWriter: Closing file: hdfs://clname/data/logs/logone-2020-05-03-20.1588528239807.tmp failed. Will retry again in 180 seconds.
> java.io.IOException: All datanodes DatanodeInfoWithStorage[10.30.3.10:50010,DS-5592761f-80c3-473c-b34b-536d52f3908e,DISK] are bad. Aborting...
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1512)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1254)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:739)
> 2020-05-03 21:05:22,474 INFO org.apache.flume.sink.hdfs.BucketWriter: Renaming hdfs://clname/data/logs/logone-2020-05-03-20.1588528239807.tmp to hdfs://clname/data/logs/logone-2020-05-03-20.1588528239807
> 2020-05-03 21:08:22,466 WARN org.apache.flume.sink.hdfs.BucketWriter: Closing file: hdfs://clname/data/logs/logone-2020-05-03-20.1588528239807.tmp failed. Will retry again in 180 seconds.
> java.io.IOException: All datanodes DatanodeInfoWithStorage[10.30.3.10:50010,DS-5592761f-80c3-473c-b34b-536d52f3908e,DISK] are bad. Aborting...
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1512)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1254)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:739)
> 2020-05-03 21:11:22,467 WARN org.apache.flume.sink.hdfs.BucketWriter: Closing file: hdfs://clname/data/logs/logone-2020-05-03-20.1588528239807.tmp failed. Will retry again in 180 seconds.
> java.io.IOException: All datanodes DatanodeInfoWithStorage[10.30.3.10:50010,DS-5592761f-80c3-473c-b34b-536d52f3908e,DISK] are bad. Aborting...
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1512)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1254)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:739)
> ...
> 2020-05-10 13:14:08,742 WARN org.apache.flume.sink.hdfs.BucketWriter: Closing file: hdfs://clname/data/logs/logone-2020-05-03-20.1588528239807.tmp failed. Will retry again in 180 seconds.
> java.io.IOException: All datanodes DatanodeInfoWithStorage[10.30.3.10:50010,DS-5592761f-80c3-473c-b34b-536d52f3908e,DISK] are bad. Aborting...
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1512)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1254)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:739)
> }}
> Flume retries closing the file indefinitely.
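> The 180-second cadence matches Flume's hdfs.retryInterval default, and hdfs.closeTries defaults to 0, i.e. unlimited close attempts, which would explain the endless loop. A possible mitigation sketch, assuming these options behave as documented in the Flume user guide (untested on this setup):
> {{
> # Hedged sketch: bound the close retries instead of looping forever.
> # hdfs.closeTries = 0 means unlimited; a positive value gives up after N attempts.
> 2hdfs.sinks.scribe2hdfsprd.hdfs.closeTries = 5
> 2hdfs.sinks.scribe2hdfsprd.hdfs.retryInterval = 180
> }}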
> First of all, the file has already been renamed from .tmp to its final name, yet the close retries still log it with the .tmp suffix.
> {{
> -rw-r--r--   3 flume flume    5054814 2020-05-03 20:50 hdfs://clname/data/logs/logone-2020-05-03-20.1588528239807
> }}
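> If I read the log right, the close failure and the rename are racing each other: the close retry is scheduled against the .tmp path, but the rename to the final name goes ahead anyway, so the retries can never succeed and the lease is never released. A minimal, self-contained Java sketch of that suspected ordering (hypothetical names throughout; this is not the actual BucketWriter code):
> {{
> import java.io.IOException;
> import java.util.concurrent.Executors;
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.TimeUnit;
>
> // Hypothetical illustration of the suspected ordering -- NOT the real
> // org.apache.flume.sink.hdfs.BucketWriter code; all names are made up.
> public class CloseRenameRace {
>
>     private static final ScheduledExecutorService TIMER =
>             Executors.newSingleThreadScheduledExecutor();
>
>     public static void main(String[] args) {
>         String tmp = "/data/logs/logone-2020-05-03-20.1588528239807.tmp";
>         try {
>             closeStream(tmp);              // fails: "All datanodes ... are bad"
>         } catch (IOException e) {
>             scheduleCloseRetry(tmp, 180);  // retry is pinned to the .tmp path
>         }
>         // The rename to the final name happens anyway, so every retry above
>         // now points at a path that no longer exists, while the HDFS lease
>         // on the renamed file is never released.
>         System.out.println("Renaming " + tmp + " to " + tmp.replace(".tmp", ""));
>     }
>
>     private static void closeStream(String path) throws IOException {
>         throw new IOException("All datanodes ... are bad. Aborting...");
>     }
>
>     private static void scheduleCloseRetry(String path, long delaySeconds) {
>         TIMER.schedule(() -> {
>             try {
>                 closeStream(path);
>             } catch (IOException e) {
>                 scheduleCloseRetry(path, delaySeconds); // loops forever by default
>             }
>         }, delaySeconds, TimeUnit.SECONDS);
>     }
> }
> }}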
> The file is also still in an open state:
> {{
> $ hdfs dfsadmin -listOpenFiles|grep 5-03-20.1588528239807
> 10.30.3.187            DFSClient_NONMAPREDUCE_833895359_50     /data/logs/logone-2020-05-03-20.1588528239807
> }}
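> The stuck lease can usually be released from the client side. A hedged recovery sketch; hdfs debug recoverLease was added upstream in Hadoop 2.7, so its availability on this CDH 5.16 build is an assumption on my part:
> {{
> # Ask the NameNode to recover the lease and close the file
> # (availability of 'hdfs debug recoverLease' on CDH 5.16 is assumed)
> $ hdfs debug recoverLease -path /data/logs/logone-2020-05-03-20.1588528239807 -retries 3
> }}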
> Running hdfs fsck -files -blocks -locations on this file completes without any error, but no blocks are listed at all.
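> Worth noting: if I understand fsck's handling of open files correctly, it skips files that are open for write unless told otherwise, which would explain the empty block list. The -openforwrite option should make the under-construction block visible:
> {{
> # Include files open for write so the unfinalized last block shows up
> $ hdfs fsck /data/logs/logone-2020-05-03-20.1588528239807 -files -blocks -locations -openforwrite
> }}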
> Could this be a problem?


