Posted to user@flume.apache.org by no jihun <je...@gmail.com> on 2016/07/20 07:03:25 UTC

File left as OPEN_FOR_WRITE state.

Hi.

I found some files on HDFS left in the OPEN_FOR_WRITE state.

*This is Flume's log for the file.*


18 7 2016 16:12:02,765 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:234) - Creating 1468825922758.avro.tmp
18 7 2016 16:22:39,812 INFO  [hdfs-hdfs2-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter$5.call:429) - Closing idle bucketWriter 1468825922758.avro.tmp at 1468826559812
18 7 2016 16:22:39,812 INFO  [hdfs-hdfs2-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter.close:363) - Closing 1468825922758.avro.tmp
18 7 2016 16:22:49,813 WARN  [hdfs-hdfs2-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter.close:370) - failed to close() HDFSWriter for file (1468825922758.avro.tmp). Exception follows.
java.io.IOException: Callable timed out after 10000 ms on file: 1468825922758.avro.tmp
18 7 2016 16:22:49,816 INFO  [hdfs-hdfs2-call-runner-7] (org.apache.flume.sink.hdfs.BucketWriter$8.call:629) - Renaming 1468825922758.avro.tmp to 1468825922758.avro


- it seems the close was never retried
- Flume just renamed the file while it was still open.


*Two days later I found that file with this command*

hdfs fsck /data/flume -openforwrite | grep "OPENFORWRITE" | grep "2016/07/18" \
  | sed 's/\/data\/flume\//\n\/data\/flume\//g' | grep -v ".avro.tmp" \
  | sed -n 's/.*\(\/data\/flume\/.*avro\).*/\1/p'



*So, I ran recoverLease on it*

hdfs debug recoverLease -path 1468825922758.avro -retries 3
recoverLease returned false.
Retrying in 5000 ms...
Retry #1
recoverLease SUCCEEDED on 1468825922758.avro
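
To clean up every file stuck like this in one go, something like the following
should work (an untested sketch; the path and suffix just match my layout
above):

# untested sketch: recover the lease on every renamed .avro file that fsck
# still reports as open for write
hdfs fsck /data/flume -openforwrite 2>/dev/null \
  | grep "OPENFORWRITE" \
  | grep -v '\.avro\.tmp' \
  | grep -o '/data/flume/[^ ]*\.avro' \
  | while read -r f; do
      hdfs debug recoverLease -path "$f" -retries 3
    done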



*My hdfs sink configuration*

hadoop2.sinks.hdfs2.type = hdfs
hadoop2.sinks.hdfs2.channel = fileCh1
hadoop2.sinks.hdfs2.hdfs.fileType = DataStream
hadoop2.sinks.hdfs2.serializer = ....
hadoop2.sinks.hdfs2.serializer.compressionCodec = snappy
hadoop2.sinks.hdfs2.hdfs.filePrefix = %{type}_%Y-%m-%d_%{host}
hadoop2.sinks.hdfs2.hdfs.fileSuffix = .avro
hadoop2.sinks.hdfs2.hdfs.rollInterval = 3700
#hadoop2.sinks.hdfs2.hdfs.rollSize = 67000000
hadoop2.sinks.hdfs2.hdfs.rollSize = 800000000
hadoop2.sinks.hdfs2.hdfs.rollCount = 0
hadoop2.sinks.hdfs2.hdfs.batchSize = 10000
hadoop2.sinks.hdfs2.hdfs.idleTimeout = 300


hdfs.closeTries and hdfs.retryInterval are both not set.
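
If I were to set them explicitly, I believe it would look roughly like this
(illustrative values only, not my current settings; 10000 ms is the default
callTimeout and 180 s the default retryInterval according to the user guide):

# illustrative values only; callTimeout is in milliseconds, retryInterval in
# seconds, and closeTries = 0 is supposed to mean "keep retrying the close"
hadoop2.sinks.hdfs2.hdfs.callTimeout = 30000
hadoop2.sinks.hdfs2.hdfs.closeTries = 0
hadoop2.sinks.hdfs2.hdfs.retryInterval = 180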


*My question is*
Why was '1468825922758.avro' left in OPEN_FOR_WRITE even though it was renamed
to .avro successfully?
Is this expected behavior? If so, what should I do to eliminate these anomalous
OPENFORWRITE files?

Regards,
Jihun.

Re: File left as OPEN_FOR_WRITE state.

Posted by Mike Percy <mp...@apache.org>.
I believe retrying the close is supported as of Flume 1.5.0:
http://flume.apache.org/FlumeUserGuide.html#hdfs-sink

See hdfs.retryInterval

If you think there is a problem with that behavior, please file a bug.

Regards,
Mike


Re: File left as OPEN_FOR_WRITE state.

Posted by no jihun <je...@gmail.com>.
>
> In fact, looking at your error, the timeout looks like the hdfs.callTimeout,
> so that's where I'd focus. Is your HDFS cluster particularly unperformant?
> 10s to respond to a call is pretty slow.

You are right.

At that time the HDFS disks were fully utilized by Map/Reduce jobs.
I expected that even if Flume failed to close files, a while later, once the
disks were less utilized, the close would be retried by Flume and the files
closed successfully.




-- 
----------------------------------------------
Jihun No ( 노지훈 )
----------------------------------------------
Twitter          : @nozisim
Facebook       : nozisim
Website         : http://jeesim2.godohosting.com
---------------------------------------------------------------------------------
Market Apps   : android market products.
<https://market.android.com/developer?pub=%EB%85%B8%EC%A7%80%ED%9B%88>

Re: File left as OPEN_FOR_WRITE state.

Posted by no jihun <je...@gmail.com>.
>
> In fact, looking at your error, the timeout looks like the hdfs.callTimeout,
> so that's where I'd focus. Is your HDFS cluster particularly unperformant?
> 10s to respond to a call is pretty slow.

You are right.

At that time the HDFS disks were fully utilized by Map/Reduce jobs.
I expected that even if Flume failed to close the file once, a while later,
when the disks were less utilized, the close would be retried by Flume and the
file closed successfully.




-- 
----------------------------------------------
Jihun No ( 노지훈 )
----------------------------------------------
Twitter          : @nozisim
Facebook       : nozisim
Website         : http://jeesim2.godohosting.com
---------------------------------------------------------------------------------
Market Apps   : android market products.
<https://market.android.com/developer?pub=%EB%85%B8%EC%A7%80%ED%9B%88>

Re: File left as OPEN_FOR_WRITE state.

Posted by no jihun <je...@gmail.com>.
I know about idleTimeout, rollSize, and rollCount (which control rolling over
the file being written).

I didn't set callTimeout, so the default 10s will apply.
closeTries and retryInterval haven't been set either.

So I think that even if a close fails once, the close should be retried after
180s (the default retryInterval).
But as you can see in the logs above, the close retry never happens.

Am I wrong?



-- 
----------------------------------------------
Jihun No ( 노지훈 )
----------------------------------------------
Twitter          : @nozisim
Facebook       : nozisim
Website         : http://jeesim2.godohosting.com
---------------------------------------------------------------------------------
Market Apps   : android market products.
<https://market.android.com/developer?pub=%EB%85%B8%EC%A7%80%ED%9B%88>

Re: File left as OPEN_FOR_WRITE state.

Posted by Chris Horrocks <ch...@hor.rocks>.
In fact, looking at your error, the timeout looks like the hdfs.callTimeout, so that's where I'd focus. Is your HDFS cluster particularly unperformant? 10s to respond to a call is pretty slow.
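
If slow HDFS calls really are the cause, one option would be to give the sink
more headroom on that timeout, e.g. (value purely illustrative):

hadoop2.sinks.hdfs2.hdfs.callTimeout = 60000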


--
Chris Horrocks



Re: File left as OPEN_FOR_WRITE state.

Posted by Chris Horrocks <ch...@hor.rocks>.
You could look at tuning hdfs.idleTimeout, hdfs.callTimeout, or hdfs.retryInterval, which are all documented at: http://flume.apache.org/FlumeUserGuide.html#hdfs-sink


--
Chris Horrocks



Re: File left as OPEN_FOR_WRITE state.

Posted by no jihun <je...@gmail.com>.
@Chris If you meant hdfs.callTimeout, I am testing that now.

I can increase the value.
When a timeout occurs during close, will it never be retried? (as in the logs above)



-- 
----------------------------------------------
Jihun No ( 노지훈 )
----------------------------------------------
Twitter          : @nozisim
Facebook       : nozisim
Website         : http://jeesim2.godohosting.com
---------------------------------------------------------------------------------
Market Apps   : android market products.
<https://market.android.com/developer?pub=%EB%85%B8%EC%A7%80%ED%9B%88>

Re: File left as OPEN_FOR_WRITE state.

Posted by Chris Horrocks <ch...@hor.rocks>.
Have you tried increasing the HDFS sink timeouts?


--
Chris Horrocks

