Posted to user@flume.apache.org by "Huang, Zijian(Victor)" <zi...@etrade.com> on 2011/09/26 21:14:47 UTC

Flume agent repeatedly streaming the same content forever (Version: 0.9.3, r)

Hi,
   I have encountered this problem with Flume twice. The Flume agent just keeps sending the same log file to the collector again and again, eventually filling up all the disk space on the collector host. Do you know what exactly causes Flume to lose count of the lines and keep re-streaming? I saw it happen when I tried to stream some binary logs, and I saw it happen again today with normal logs (which may contain some binary data). I can reproduce the problem easily. I am using "tail" to stream the content over.

Please let me know what the potential causes are.

Thanks

Vic



RE: Flume agent repeatedly streaming the same content forever (Version: 0.9.3, r)

Posted by "Huang, Zijian(Victor)" <zi...@etrade.com>.
Somehow OS tail uses a lot more memory. I am getting the error message below after running Flume for 2 minutes.

======
2011-09-27 13:18:04,862 [ReaderThread (/usr/bin/tail -n +0 -F /webapp/log/production.log-STDOUT)] ERROR exec.ExecNioSource: Entry too long, truncating: 5022912 > 5000000(max event size)
Exception in thread "logicalNode agent-qa_selaf_web_controller_selaf8w82m7_access.log-30" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.BitSet.initWords(BitSet.java:164)
        at java.util.BitSet.<init>(BitSet.java:159)
        at com.cloudera.flume.handlers.thrift.ThriftFlumeEvent.<init>(ThriftFlumeEvent.java:130)
        at com.cloudera.flume.handlers.thrift.ThriftEventAdaptor.convert(ThriftEventAdaptor.java:136)
=====

Vic
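
For anyone hitting the same "Entry too long" error, one quick check is whether the log really contains lines above the limit. The following is a minimal Python sketch, not part of Flume. The function name find_long_lines is hypothetical, and the default of 5000000 bytes is simply taken from the error message above. It reads the file in binary mode because the logs in this thread may contain binary data. Note that it holds one line in memory at a time, so a file with no newlines at all would be read in one piece.

```python
def find_long_lines(path, max_bytes=5000000):
    """Return (line_number, byte_length) for each line longer than max_bytes.

    max_bytes defaults to the limit shown in the error message; adjust it
    to whatever limit your own setup reports.
    """
    oversized = []
    with open(path, "rb") as f:  # binary mode: the log may contain binary data
        for number, line in enumerate(f, start=1):
            length = len(line.rstrip(b"\r\n"))  # do not count the line ending
            if length > max_bytes:
                oversized.append((number, length))
    return oversized
```

Calling find_long_lines("/webapp/log/production.log-STDOUT") would list the line numbers and sizes of any oversized entries, which can help confirm whether a single very long line is what triggers the truncation.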


RE: Flume agent repeatedly streaming the same content forever (Version: 0.9.3, r)

Posted by Liam Friel <li...@gmail.com>.
Hi Vic,

Don't know, I'm afraid. When OS tail worked for me I moved on to my next
problem ... :-)

Liam

Re: Flume agent repeatedly streaming the same content forever (Version: 0.9.3, r)

Posted by Mingjie Lai <mj...@gmail.com>.
Vic,

There is already a patch at FLUME-533. If you really want it right now,
you can apply it and make a new patch, so a committer can commit it
later on. (People seem busy recently.)

-mingjie


RE: Flume agent repeatedly streaming the same content forever (Version: 0.9.3, r)

Posted by "Huang, Zijian(Victor)" <zi...@etrade.com>.
Thanks, Liam. This helps me move forward. Are you aware of any differences from the Flume implementation of tail?
I really want to see the fix in https://issues.cloudera.org/browse/FLUME-533?page=com.atlassian.streams.streams-jira-plugin%3Aactivity-stream-issue-tab to solve the problem.


Vic


Re: Flume agent repeatedly streaming the same content forever (Version: 0.9.3, r)

Posted by Liam Friel <li...@gmail.com>.
On Mon, Sep 26, 2011 at 10:50 PM, Huang, Zijian(Victor) <
zijian.huang@etrade.com> wrote:

> I think I found the cause: one of the lines is larger than the set limit.
> I tried to set flume.event.max.size.bytes on both the agent node and the
> collector node, but the system doesn't seem to pick up the values.
> ====
>   <property>
>     <name>flume.event.max.size.bytes</name>
>     <value>2076150</value>
>     <description>The length of line content in byte.</description>
>   </property>
> =====
>
> am I doing anything wrong?
>
> Thanks
>
> Vic
>  ------------------------------
>

I had a similar issue before.
I switched to using OS tail, rather than Flume tail, and it made my problems go away.

This thread has details:
https://groups.google.com/a/cloudera.org/group/flume-user/browse_thread/thread/b2f9ba4609124311/f7a45e5a34f09f13

Liam

RE: Flume agent repeatedly streaming the same content forever (Version: 0.9.3, r)

Posted by "Huang, Zijian(Victor)" <zi...@etrade.com>.
I think I found the cause: one of the lines is larger than the set limit. I tried to set flume.event.max.size.bytes on both the agent node and the collector node, but the system doesn't seem to pick up the values.
====
  <property>
    <name>flume.event.max.size.bytes</name>
    <value>2076150</value>
    <description>The length of line content in byte.</description>
  </property>
=====

Am I doing anything wrong?

Thanks

Vic
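
One way to rule out a typo or a malformed file is to parse the configuration and read the property back. The following is a minimal Python sketch under stated assumptions: it assumes the property sits in a Hadoop-style XML file with <property> elements under a <configuration> root (the wrapper element and the SAMPLE text are illustrative, not copied from a real Flume install), and read_property is a hypothetical helper name. It only confirms that the XML is well-formed and the name matches. It does not prove that Flume actually loads that file.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; the <configuration> wrapper is an assumption.
SAMPLE = """<configuration>
  <property>
    <name>flume.event.max.size.bytes</name>
    <value>2076150</value>
    <description>The length of line content in byte.</description>
  </property>
</configuration>"""


def read_property(xml_text, name):
    """Return the <value> of the named <property>, or None if it is absent."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    value = None
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            # This sketch keeps the last match if the name is repeated.
            value = prop.findtext("value", default="").strip()
    return value


print(read_property(SAMPLE, "flume.event.max.size.bytes"))
```

To check a real file, read its contents into a string and pass that in place of SAMPLE. A result of None would suggest the property name is misspelled or the block sits outside the parsed document.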