Posted to user@flume.apache.org by Jeremy Karlson <je...@gmail.com> on 2013/07/18 00:24:59 UTC

Flume Data Directory Cleanup

Hi All,

I have a very busy channel that has about 100,000 events queued up.  My
data directory has about 50 data files, each about 1.6 GB.  I don't believe
my 100k events could be consuming that much space, so I'm jumping to
conclusions and assuming that most of these files are old and due for
cleanup (but I suppose it's possible).  I'm not finding much guidance in
the user guide on how often these files are cleaned up / removed /
compacted / etc.

Any thoughts on what's going on here, or what settings I should look for?
 Thanks.

-- Jeremy
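
For context on the question above: the file channel has no periodic compaction job or size-based retention setting; its data files are reclaimed only through the checkpoint mechanism discussed later in this thread. A minimal sketch of the channel properties worth checking, using hypothetical agent/channel names (agent1, ch1); the values shown are illustrative (roughly the documented defaults of that era) and should be verified against the user guide for your Flume version:

    # Hypothetical agent and channel names; adjust to your topology.
    agent1.channels.ch1.type = file
    agent1.channels.ch1.checkpointDir = /var/flume/checkpoint
    agent1.channels.ch1.dataDirs = /var/flume/data
    # Maximum number of events the channel will hold.
    agent1.channels.ch1.capacity = 1000000
    agent1.channels.ch1.transactionCapacity = 10000
    # How often the channel writes a checkpoint, in ms; data files are only
    # reclaimed in the wake of checkpoints.
    agent1.channels.ch1.checkpointInterval = 30000
    # Size (bytes) at which a data file is rolled and a new one started.
    agent1.channels.ch1.maxFileSize = 2146435071
    # The channel stops accepting events once free space falls below this (bytes).
    agent1.channels.ch1.minimumRequiredSpace = 524288000

Note that nothing here caps the total size of the data directory directly: a data file is only deleted once every event in it has been taken and committed (and, as noted further down in the thread, a couple of checkpoints have passed).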

Re: Flume Data Directory Cleanup

Posted by Jeremy Karlson <je...@gmail.com>.
I did a hard delete.  (I was out of disk space.)  I ended up just deleting
the whole channel directory and starting fresh.

I am running a very recent version, so I don't think I'd be affected by the
file removal bug...  And obviously my files were still in use, for reasons
I don't understand yet.

-- Jeremy


On Thu, Jul 18, 2013 at 11:09 AM, Hari Shreedharan <hshreedharan@cloudera.com> wrote:

> Flume's deletion strategy is quite conservative. We do wait for 2
> checkpoints after all data was removed from a file before the files are
> deleted. In this case, it does look like the data was actually still
> referenced. We had a bug sometime back that caused files to not be deleted
> - but that was fixed quite a while back.
>
>
> Hari
>
>
> Thanks,
> Hari
>
> On Thursday, July 18, 2013 at 10:56 AM, Camp, Roy wrote:
>
> We have noticed a few times that cleanup did not happen properly but a
> restart generally forced a cleanup.
>
> I would recommend putting the data files back unless you did a hard
> delete.  Alternatively, make sure you remove (backup first) the checkpoint
> files if you delete the data files.  That should put flume back to a fresh
> state.
>
> Roy
>
> From: Jeremy Karlson [mailto:jeremykarlson@gmail.com]
> Sent: Thursday, July 18, 2013 10:42 AM
> To: user@flume.apache.org
> Subject: Re: Flume Data Directory Cleanup
>
> Thank you for your suggestion.  I took a careful look at that, and I'm not
> sure it describes my situation.  That refers to the sink, while my problem
> is with the channel.  I'm looking at a dramatic accumulation of log / meta
> files within the channel data directory.
>
> Additionally, I did try doing a manual cleanup of the channel directory,
> deleting the oldest log / meta files.  (This was my experiment.)  Flume
> really did not like that.  If it is required in the channel as well, the
> cutoff point at which the files go from being used to unused is not clear
> to me.
>
> -- Jeremy
>
> On Thu, Jul 18, 2013 at 10:13 AM, Lenin Raj <em...@gmail.com> wrote:
>
> Hi Jeremy,
>
> Regarding cleanup, it was discussed already once.
>
> http://mail-archives.apache.org/mod_mbox/flume-user/201306.mbox/%3CA7B08BAB-C8B8-4B55-B3EC-A80AB4EBB438@gmail.com%3E
>
> You have to do it manually.
>
> Thanks,
> Lenin
>
> On Thu, Jul 18, 2013 at 10:36 PM, Jeremy Karlson <je...@gmail.com> wrote:
>
> To follow up:
>
> My Flume agent ran out of disk space last night and appeared to stop
> processing.  I shut it down and as an experiment (it's a test machine, why
> not?) I deleted the oldest 10 data files, to see if Flume actually needed
> these when it restarted.
>
> Flume was not happy with my choices.
>
> It spit out a lot of this:
>
> 2013-07-18 00:00:00,013 ERROR [pool-40-thread-1]        o.a.f.s.AvroSource
> Avro source mySource: Unable to process event batch. Exception follows.
> java.lang.IllegalStateException: Channel closed [channel=myFileChannel].
> Due to java.lang.NullPointerException: null
>         at org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:353)
>         at org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:122)
>         ...
> Caused by: java.lang.NullPointerException
>         at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:895)
>         at org.apache.flume.channel.file.Log.replay(Log.java:406)
>         at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:303)
>         ...
>
> So it seems like these files were actually in use, and not just leftover
> cruft.  A worthwhile thing to know, but I'd like to understand why.  My
> events are probably at most 1k of text, so it seems kind of odd to me that
> they'd consume more than 50GB of disk space in the channel.
>
> -- Jeremy
>
> On Wed, Jul 17, 2013 at 3:24 PM, Jeremy Karlson <je...@gmail.com> wrote:
>
> Hi All,
>
> I have a very busy channel that has about 100,000 events queued up.  My
> data directory has about 50 data files, each about 1.6 GB.  I don't believe
> my 100k events could be consuming that much space, so I'm jumping to
> conclusions and assuming that most of these files are old and due for
> cleanup (but I suppose it's possible).  I'm not finding much guidance in
> the user guide on how often these files are cleaned up / removed /
> compacted / etc.
>
> Any thoughts on what's going on here, or what settings I should look for?
>  Thanks.
>
> -- Jeremy

Re: Flume Data Directory Cleanup

Posted by Hari Shreedharan <hs...@cloudera.com>.
Flume's deletion strategy is quite conservative. We do wait for 2 checkpoints after all data was removed from a file before the files are deleted. In this case, it does look like the data was actually still referenced. We had a bug sometime back that caused files to not be deleted - but that was fixed quite a while back.


Hari 


Thanks,
Hari
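
To put rough numbers on that, assuming the default checkpoint interval (the property below is the file channel's documented setting; agent/channel names are hypothetical):

    # Checkpoint every 30 seconds (the documented default, in ms).
    agent1.channels.ch1.checkpointInterval = 30000

With the "two checkpoints after all data was removed from a file" rule above, a data file whose last event has been taken and committed should become deletable within roughly a minute, not hours. Files that linger much longer most likely still contain events that have not been taken, which matches the diagnosis here.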


On Thursday, July 18, 2013 at 10:56 AM, Camp, Roy wrote:

> We have noticed a few times that cleanup did not happen properly but a restart generally forced a cleanup.  
>  
> I would recommend putting the data files back unless you did a hard delete.  Alternatively, make sure you remove (backup first) the checkpoint files if you delete the data files.  That should put flume back to a fresh state. 
>  
> Roy
>  
>  
>  
> From: Jeremy Karlson [mailto:jeremykarlson@gmail.com] 
> Sent: Thursday, July 18, 2013 10:42 AM
> To: user@flume.apache.org
> Subject: Re: Flume Data Directory Cleanup 
>  
> Thank you for your suggestion.  I took a careful look at that, and I'm not sure it describes my situation.  That refers to the sink, while my problem is with the channel.  I'm looking at a dramatic accumulation of log / meta files within the channel data directory.
> 
> Additionally, I did try doing a manual cleanup of the channel directory, deleting the oldest log / meta files.  (This was my experiment.)  Flume really did not like that.  If it is required in the channel as well, the cutoff point at which the files go from being used to unused is not clear to me. 
>  
> 
> -- Jeremy
> 
> 
>  
> On Thu, Jul 18, 2013 at 10:13 AM, Lenin Raj <emaillenin@gmail.com> wrote:
> Hi Jeremy,
> 
> Regarding cleanup, it was discussed already once.
> 
> http://mail-archives.apache.org/mod_mbox/flume-user/201306.mbox/%3CA7B08BAB-C8B8-4B55-B3EC-A80AB4EBB438@gmail.com%3E 
> You have to do it manually.
> 
> 
> 
> Thanks,
> Lenin 
>  
> On Thu, Jul 18, 2013 at 10:36 PM, Jeremy Karlson <jeremykarlson@gmail.com> wrote:
> To follow up:
>  
> 
> My Flume agent ran out of disk space last night and appeared to stop processing.  I shut it down and as an experiment (it's a test machine, why not?) I deleted the oldest 10 data files, to see if Flume actually needed these when it restarted.
> 
>  
> 
> Flume was not happy with my choices.
> 
>  
> 
> It spit out a lot of this:
> 
>  
> 
> 2013-07-18 00:00:00,013 ERROR [pool-40-thread-1]        o.a.f.s.AvroSource Avro source mySource: Unable to process event batch. Exception follows. java.lang.IllegalStateException: Channel closed [channel=myFileChannel]. Due to java.lang.NullPointerException: null
> 
>         at org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:353)
> 
>         at org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:122)
> 
> 
>         ...
> 
> Caused by: java.lang.NullPointerException
> 
>         at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:895)
> 
>         at org.apache.flume.channel.file.Log.replay(Log.java:406)
> 
>         at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:303)
> 
> 
>         ...
> 
>  
> 
> So it seems like these files were actually in use, and not just leftover cruft.  A worthwhile thing to know, but I'd like to understand why.  My events are probably at most 1k of text, so it seems kind of odd to me that they'd consume more than 50GB of disk space in the channel.
> 
>  
> 
> -- Jeremy
> 
>  
> 
> 
>  
> On Wed, Jul 17, 2013 at 3:24 PM, Jeremy Karlson <jeremykarlson@gmail.com> wrote:
> Hi All,
>  
> 
> I have a very busy channel that has about 100,000 events queued up.  My data directory has about 50 data files, each about 1.6 GB.  I don't believe my 100k events could be consuming that much space, so I'm jumping to conclusions and assuming that most of these files are old and due for cleanup (but I suppose it's possible).  I'm not finding much guidance in the user guide on how often these files are cleaned up / removed / compacted / etc.
> 
>  
> 
> Any thoughts on what's going on here, or what settings I should look for?  Thanks.
> 
>  
> 
> -- Jeremy



RE: Flume Data Directory Cleanup

Posted by "Camp, Roy" <rc...@ebay.com>.
We have noticed a few times that cleanup did not happen properly but a restart generally forced a cleanup.

I would recommend putting the data files back unless you did a hard delete.  Alternatively, make sure you remove (backup first) the checkpoint files if you delete the data files.  That should put flume back to a fresh state.

Roy
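
If anyone needs to follow Roy's reset advice, a sketch of the sequence (paths and directory layout are hypothetical; stop the agent first, and note that this discards whatever is still queued in the channel):

    # Stop the Flume agent before touching channel state.
    # Back up, then recreate, the checkpoint directory.
    mv /var/flume/checkpoint /var/flume/checkpoint.bak
    mkdir -p /var/flume/checkpoint
    # If the data files were hard-deleted, clear the data directory as well
    # so the channel starts completely fresh; otherwise restore the original
    # data files before restarting.
    # rm -rf /var/flume/data/*
    # Restart the agent; the file channel rebuilds its state on startup.

One caveat: removing only the checkpoint while keeping large data files forces a full replay of those files at startup, which can take a long time on tens of gigabytes.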



From: Jeremy Karlson [mailto:jeremykarlson@gmail.com]
Sent: Thursday, July 18, 2013 10:42 AM
To: user@flume.apache.org
Subject: Re: Flume Data Directory Cleanup

Thank you for your suggestion.  I took a careful look at that, and I'm not sure it describes my situation.  That refers to the sink, while my problem is with the channel.  I'm looking at a dramatic accumulation of log / meta files within the channel data directory.

Additionally, I did try doing a manual cleanup of the channel directory, deleting the oldest log / meta files.  (This was my experiment.)  Flume really did not like that.  If it is required in the channel as well, the cutoff point at which the files go from being used to unused is not clear to me.

-- Jeremy

On Thu, Jul 18, 2013 at 10:13 AM, Lenin Raj <em...@gmail.com> wrote:
Hi Jeremy,

Regarding cleanup, it was discussed already once.

http://mail-archives.apache.org/mod_mbox/flume-user/201306.mbox/%3CA7B08BAB-C8B8-4B55-B3EC-A80AB4EBB438@gmail.com%3E
You have to do it manually.


Thanks,
Lenin

On Thu, Jul 18, 2013 at 10:36 PM, Jeremy Karlson <je...@gmail.com> wrote:
To follow up:

My Flume agent ran out of disk space last night and appeared to stop processing.  I shut it down and as an experiment (it's a test machine, why not?) I deleted the oldest 10 data files, to see if Flume actually needed these when it restarted.

Flume was not happy with my choices.

It spit out a lot of this:

2013-07-18 00:00:00,013 ERROR [pool-40-thread-1]        o.a.f.s.AvroSource Avro source mySource: Unable to process event batch. Exception follows. java.lang.IllegalStateException: Channel closed [channel=myFileChannel]. Due to java.lang.NullPointerException: null
        at org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:353)
        at org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:122)
        ...
Caused by: java.lang.NullPointerException
        at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:895)
        at org.apache.flume.channel.file.Log.replay(Log.java:406)
        at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:303)
        ...

So it seems like these files were actually in use, and not just leftover cruft.  A worthwhile thing to know, but I'd like to understand why.  My events are probably at most 1k of text, so it seems kind of odd to me that they'd consume more than 50GB of disk space in the channel.

-- Jeremy


On Wed, Jul 17, 2013 at 3:24 PM, Jeremy Karlson <je...@gmail.com> wrote:
Hi All,

I have a very busy channel that has about 100,000 events queued up.  My data directory has about 50 data files, each about 1.6 GB.  I don't believe my 100k events could be consuming that much space, so I'm jumping to conclusions and assuming that most of these files are old and due for cleanup (but I suppose it's possible).  I'm not finding much guidance in the user guide on how often these files are cleaned up / removed / compacted / etc.

Any thoughts on what's going on here, or what settings I should look for?  Thanks.

-- Jeremy




Re: Flume Data Directory Cleanup

Posted by Jeremy Karlson <je...@gmail.com>.
Thank you for your suggestion.  I took a careful look at that, and I'm not
sure it describes my situation.  That refers to the sink, while my problem
is with the channel.  I'm looking at a dramatic accumulation of log / meta
files within the channel data directory.

Additionally, I did try doing a manual cleanup of the channel directory,
deleting the oldest log / meta files.  (This was my experiment.)  Flume
really did not like that.  If it is required in the channel as well, the
cutoff point at which the files go from being used to unused is not clear
to me.

-- Jeremy


On Thu, Jul 18, 2013 at 10:13 AM, Lenin Raj <em...@gmail.com> wrote:

> Hi Jeremy,
>
> Regarding cleanup, it was discussed already once.
>
>
> http://mail-archives.apache.org/mod_mbox/flume-user/201306.mbox/%3CA7B08BAB-C8B8-4B55-B3EC-A80AB4EBB438@gmail.com%3E
>
> You have to do it manually.
>
>
> Thanks,
> Lenin
>
>
> On Thu, Jul 18, 2013 at 10:36 PM, Jeremy Karlson <je...@gmail.com> wrote:
>
>> To follow up:
>>
>> My Flume agent ran out of disk space last night and appeared to stop
>> processing.  I shut it down and as an experiment (it's a test machine, why
>> not?) I deleted the oldest 10 data files, to see if Flume actually needed
>> these when it restarted.
>>
>> Flume was not happy with my choices.
>>
>> It spit out a lot of this:
>>
>> 2013-07-18 00:00:00,013 ERROR [pool-40-thread-1]
>>  o.a.f.s.AvroSource Avro source mySource: Unable to process event batch.
>> Exception follows. java.lang.IllegalStateException: Channel closed
>> [channel=myFileChannel]. Due to java.lang.NullPointerException: null
>>         at
>> org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:353)
>>         at
>> org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:122)
>>         ...
>> Caused by: java.lang.NullPointerException
>>         at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:895)
>>         at org.apache.flume.channel.file.Log.replay(Log.java:406)
>>         at
>> org.apache.flume.channel.file.FileChannel.start(FileChannel.java:303)
>>         ...
>>
>> So it seems like these files were actually in use, and not just leftover
>> cruft.  A worthwhile thing to know, but I'd like to understand why.  My
>> events are probably at most 1k of text, so it seems kind of odd to me that
>> they'd consume more than 50GB of disk space in the channel.
>>
>> -- Jeremy
>>
>>
>>
>> On Wed, Jul 17, 2013 at 3:24 PM, Jeremy Karlson <je...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I have a very busy channel that has about 100,000 events queued up.  My
>>> data directory has about 50 data files, each about 1.6 GB.  I don't believe
>>> my 100k events could be consuming that much space, so I'm jumping to
>>> conclusions and assuming that most of these files are old and due for
>>> cleanup (but I suppose it's possible).  I'm not finding much guidance in
>>> the user guide on how often these files are cleaned up / removed /
>>> compacted / etc.
>>>
>>> Any thoughts on what's going on here, or what settings I should look
>>> for?  Thanks.
>>>
>>> -- Jeremy
>>>
>>
>>
>

Re: Flume Data Directory Cleanup

Posted by Lenin Raj <em...@gmail.com>.
Hi Jeremy,

Regarding cleanup, it was discussed already once.

http://mail-archives.apache.org/mod_mbox/flume-user/201306.mbox/%3CA7B08BAB-C8B8-4B55-B3EC-A80AB4EBB438@gmail.com%3E

You have to do it manually.


Thanks,
Lenin


On Thu, Jul 18, 2013 at 10:36 PM, Jeremy Karlson <je...@gmail.com> wrote:

> To follow up:
>
> My Flume agent ran out of disk space last night and appeared to stop
> processing.  I shut it down and as an experiment (it's a test machine, why
> not?) I deleted the oldest 10 data files, to see if Flume actually needed
> these when it restarted.
>
> Flume was not happy with my choices.
>
> It spit out a lot of this:
>
> 2013-07-18 00:00:00,013 ERROR [pool-40-thread-1]        o.a.f.s.AvroSource
> Avro source mySource: Unable to process event batch. Exception follows.
> java.lang.IllegalStateException: Channel closed [channel=myFileChannel].
> Due to java.lang.NullPointerException: null
>         at
> org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:353)
>         at
> org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:122)
>         ...
> Caused by: java.lang.NullPointerException
>         at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:895)
>         at org.apache.flume.channel.file.Log.replay(Log.java:406)
>         at
> org.apache.flume.channel.file.FileChannel.start(FileChannel.java:303)
>         ...
>
> So it seems like these files were actually in use, and not just leftover
> cruft.  A worthwhile thing to know, but I'd like to understand why.  My
> events are probably at most 1k of text, so it seems kind of odd to me that
> they'd consume more than 50GB of disk space in the channel.
>
> -- Jeremy
>
>
>
> On Wed, Jul 17, 2013 at 3:24 PM, Jeremy Karlson <je...@gmail.com> wrote:
>
>> Hi All,
>>
>> I have a very busy channel that has about 100,000 events queued up.  My
>> data directory has about 50 data files, each about 1.6 GB.  I don't believe
>> my 100k events could be consuming that much space, so I'm jumping to
>> conclusions and assuming that most of these files are old and due for
>> cleanup (but I suppose it's possible).  I'm not finding much guidance in
>> the user guide on how often these files are cleaned up / removed /
>> compacted / etc.
>>
>> Any thoughts on what's going on here, or what settings I should look for?
>>  Thanks.
>>
>> -- Jeremy
>>
>
>

Re: Flume Data Directory Cleanup

Posted by Jeremy Karlson <je...@gmail.com>.
To follow up:

My Flume agent ran out of disk space last night and appeared to stop
processing.  I shut it down and as an experiment (it's a test machine, why
not?) I deleted the oldest 10 data files, to see if Flume actually needed
these when it restarted.

Flume was not happy with my choices.

It spit out a lot of this:

2013-07-18 00:00:00,013 ERROR [pool-40-thread-1]        o.a.f.s.AvroSource
Avro source mySource: Unable to process event batch. Exception follows.
java.lang.IllegalStateException: Channel closed [channel=myFileChannel].
Due to java.lang.NullPointerException: null
        at
org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:353)
        at
org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:122)
        ...
Caused by: java.lang.NullPointerException
        at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:895)
        at org.apache.flume.channel.file.Log.replay(Log.java:406)
        at
org.apache.flume.channel.file.FileChannel.start(FileChannel.java:303)
        ...

So it seems like these files were actually in use, and not just leftover
cruft.  A worthwhile thing to know, but I'd like to understand why.  My
events are probably at most 1k of text, so it seems kind of odd to me that
they'd consume more than 50GB of disk space in the channel.

-- Jeremy
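
A back-of-the-envelope check of the numbers in this message (the per-event size is Jeremy's own estimate; record overheads are ignored):

    100,000 queued events  x  ~1 KB/event   =  about 0.1 GB of live payload
    50 data files          x  ~1.6 GB/file  =  about 80 GB on disk

The gap suggests the data files were mostly holding events written (and possibly already taken) at some earlier point, plus the per-event put/take/commit records of the write-ahead log, with whole files not yet eligible for reclamation, rather than the 100k currently queued events themselves occupying 80 GB.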



On Wed, Jul 17, 2013 at 3:24 PM, Jeremy Karlson <je...@gmail.com> wrote:

> Hi All,
>
> I have a very busy channel that has about 100,000 events queued up.  My
> data directory has about 50 data files, each about 1.6 GB.  I don't believe
> my 100k events could be consuming that much space, so I'm jumping to
> conclusions and assuming that most of these files are old and due for
> cleanup (but I suppose it's possible).  I'm not finding much guidance in
> the user guide on how often these files are cleaned up / removed /
> compacted / etc.
>
> Any thoughts on what's going on here, or what settings I should look for?
>  Thanks.
>
> -- Jeremy
>