Posted to user@flume.apache.org by liuyongbo <li...@baidu.com> on 2013/05/15 07:37:30 UTC
how to print the channel capacity
Hi,
I'm using Flume to pass log data to MongoDB, but I find that some data
is lost under heavy load. I want to know the maximum number of requests
that Flume can hold, so I need to print the channel capacity, but I cannot
find a proper way to do this without changing the source code. Any ideas?
Thanks
Re: Reply: how to print the channel capacity
Posted by Nitin Pawar <ni...@gmail.com>.
Instead of the memory channel, can you try the file channel?
I think when you say the exact point that balances input and output, you
want to figure out how many events the memory channel can buffer before you
start losing events. Is that correct?
From http://flume.apache.org/FlumeUserGuide.html#memory-channel:
Property             Default  Description
capacity             100      The max number of events stored in the channel
transactionCapacity  100      The max number of events stored in the channel per transaction
keep-alive           3        Timeout in seconds for adding or removing an event
--
Nitin Pawar
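On the "exact point that balances input and output": in a simple model with steady rates, the balance point is where the sustained input rate equals the rate at which the sink drains the channel. The capacity setting does not move that point; it only decides how long the channel can absorb a burst before it is full. A minimal sketch of that arithmetic follows. The function name and all numbers are hypothetical, and real traffic is rarely this steady.

```python
def seconds_until_full(capacity, backlog, in_rate, out_rate):
    """Return seconds until a bounded channel fills, or None if it never does.

    capacity and backlog are event counts; rates are events per second.
    Assumes both rates stay constant, which is a simplification.
    """
    net_rate = in_rate - out_rate
    if net_rate <= 0:
        return None  # the sink keeps up, so the backlog does not grow
    return (capacity - backlog) / net_rate


# Hypothetical numbers: capacity of 100 events, empty channel,
# 500 events/s arriving and 450 events/s written out by the sink.
print(seconds_until_full(100, 0, 500, 450))  # 2.0
# If the sink is faster than the source, the channel never fills.
print(seconds_until_full(100, 0, 400, 450))  # None
```

Raising the capacity buys more time during a burst, but if the input rate stays above the sink rate the channel still fills eventually.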
Reply: how to print the channel capacity
Posted by liuyongbo <li...@baidu.com>.
Thanks for your answer.
In addition, I'm using the memory channel and writing logs to MongoDB. When
the input is faster than the sink can consume it (writing into MongoDB), the
queue grows, and once it reaches the maximum, new input is lost.
So what I want to know is the exact point that balances input and output.
Re: how to print the channel capacity
Posted by Nitin Pawar <ni...@gmail.com>.
Here is one example of a flow that defines the capacity:
https://cwiki.apache.org/FLUME/flume-ng-performance-measurements.html
--
Nitin Pawar
Re: how to print the channel capacity
Posted by Nitin Pawar <ni...@gmail.com>.
Sorry, pressed enter too soon.
As for your question of how many events a Flume agent can hold:
sorry, but I don't think there is a direct answer to that. I may very well
be wrong, as I am pretty new to Flume myself.
There was a JIRA about the capacity of file channels: FLUME-1571
--
Nitin Pawar
Re: how to print the channel capacity
Posted by Nitin Pawar <ni...@gmail.com>.
For maximum performance of your data flow, the two things that matter most
are the channel and the transaction batch size.
When you say you are losing data, are you using the memory channel or the
file channel?
Flume can batch events. The batch size is the maximum number of events that
a sink or client will attempt to take from a channel in a single
transaction.
What is the channel type?
Do you have a slow sink, so that fewer events are written out than come
in to the channel and the backlog piles up over time?
Others may point out more things.
Also, sharing your Flume configuration and any errors Flume reports will
help people find the problem.
--
Nitin Pawar
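The interaction between a slow sink, the batch size, and a bounded channel can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the function and its parameters are hypothetical, time is divided into equal ticks, and the sink is assumed to complete a fixed number of transactions per tick, each taking at most `batch_size` events. It does not use any Flume API.

```python
from collections import deque


def simulate(capacity, batch_size, puts_per_tick, takes_per_tick, ticks):
    """Simulate a bounded channel; return (delivered, rejected, backlog)."""
    channel = deque()
    delivered = rejected = 0
    for _ in range(ticks):
        # Source side: a put is rejected once the channel is full.
        for _ in range(puts_per_tick):
            if len(channel) < capacity:
                channel.append(object())
            else:
                rejected += 1
        # Sink side: each transaction takes at most batch_size events.
        for _ in range(takes_per_tick):
            for _ in range(min(batch_size, len(channel))):
                channel.popleft()
                delivered += 1
    return delivered, rejected, len(channel)


# 50 events arrive per tick, but 4 transactions of 10 drain only 40.
print(simulate(100, 10, 50, 4, 20))  # (800, 140, 60)
# With a batch size of 20 the same 4 transactions can keep up.
print(simulate(100, 20, 50, 4, 20))  # (1000, 0, 0)
```

In this model "rejected" only counts puts that failed because the channel was full. Whether such events are actually lost depends on how the source reacts to the failed put.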
Reply: how to print the channel capacity
Posted by liuyongbo <li...@baidu.com>.
Thank you very much. The monitoring is useful.
Re: how to print the channel capacity
Posted by Matt Wise <ma...@nextdoor.com>.
http://engblog.nextdoor.com/post/50507841273/apache-flume-performance-monitoring
--Matt
Re: how to print the channel capacity
Posted by Matt Wise <ma...@nextdoor.com>.
We do the same thing, but with Collectd as our graphing/collection mechanism. I am actually going to do a blog post in the next day or two with the code to our flume data collection script, and some example graphs/etc. We've done a similar thing with Zookeeper monitoring (http://engblog.nextdoor.com/post/49942956311/apache-zookeeper-performance-monitoring).
--Matt
RE: how to print the channel capacity
Posted by Paul Chavez <pc...@verticalsearchworks.com>.
There are a few ways to monitor Flume in operation. We use the JSON reporting, which is available via 'http://<agent address>:<port>/metrics'. You need to start the agent with the following parameters to get this interface:
-Dflume.monitoring.type=http -Dflume.monitoring.port=34545
We use Cacti to graph channel size both as a percentage of maximum and as the absolute number of events in the channel. This provides a warning if the sinks cannot keep up with the sources.
We also graph ingress/egress event counts, much like a network bandwidth graph, for some channels to get an idea of the throughput and to see if sources and sinks are running at the same speed.
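The two graphs described above (channel fill and ingress/egress rates) could be derived from the JSON endpoint roughly as follows. This is a sketch, not a tested collector: it assumes the response is a JSON object keyed by component name with a "CHANNEL." prefix for channels, and that each channel reports string-valued counters named ChannelSize, ChannelCapacity, EventPutSuccessCount and EventTakeSuccessCount. Check those names against the output of your own agent. The helper names are hypothetical, and the port is the one from the flags above.

```python
import json
from urllib.request import urlopen


def fetch_metrics(host, port=34545):
    """Fetch the agent's JSON metrics (needs a running agent)."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=5) as resp:
        return json.load(resp)


def channel_fill(metrics):
    """Map each channel name to (size, capacity, percent full)."""
    result = {}
    for name, values in metrics.items():
        if not name.startswith("CHANNEL."):
            continue  # skip sources and sinks
        size = int(values["ChannelSize"])
        capacity = int(values["ChannelCapacity"])
        result[name] = (size, capacity, 100.0 * size / capacity)
    return result


def rate(old, new, key, seconds):
    """Events per second between two samples of one counter."""
    return (int(new[key]) - int(old[key])) / seconds


# Hand-written samples in the assumed shape, not output from a real agent.
first = {"CHANNEL.ch1": {"ChannelSize": "50", "ChannelCapacity": "100",
                         "EventPutSuccessCount": "1000",
                         "EventTakeSuccessCount": "950"}}
second = {"CHANNEL.ch1": {"ChannelSize": "75", "ChannelCapacity": "100",
                          "EventPutSuccessCount": "1600",
                          "EventTakeSuccessCount": "1525"}}

print(channel_fill(second))  # {'CHANNEL.ch1': (75, 100, 75.0)}
a, b = first["CHANNEL.ch1"], second["CHANNEL.ch1"]
print(rate(a, b, "EventPutSuccessCount", 60))   # 10.0
```

A put rate that stays above the take rate while the fill percentage climbs is the sign that the sinks are not keeping up. Against a live agent, `fetch_metrics("localhost")` would replace the hand-written samples.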