Posted to user@chukwa.apache.org by "T. A. Smooth" <ca...@gmail.com> on 2011/08/04 01:45:22 UTC

Re: Agent and collector

Following up on my email below: how do you all think the end-to-end feature
would work behind a load balancer (LB)? I assume it would break the agent
checkpointing.
TP
On Jul 29, 2011 10:36 PM, "T. A. Smooth" <ca...@gmail.com> wrote:
> Thanks for the feedback. We will have to try out the end-to-end features.
>
> I also read: http://www.usenix.org/events/lisa10/tech/full_papers/Rabkin.pdf
> This is good, informative stuff.
>
> I have a follow-up question:
>
> Concerning question 2 about the load balancer (LB): it seems the agent and
> the LB may have trouble with the end-to-end features. If each agent asks the
> collector how much of a file has been saved to the datastore, that could be
> problematic for an LB setup. In that case an agent sends each chunk of data
> to what it "thinks" is a single collector, but in reality it is sending to
> many collectors, and the chunks go to different files on different
> collectors.
>
> When the agent polls the collector (really the load balancer) about the
> length of the file it has been writing to the datastore, it seems the
> collector would not handle that situation very well. The request could be
> round-robined or randomly sent to a backend collector that has never written
> a file for the agent, or that has written only a portion of the file the
> agent wants to know about. So on one poll the agent would get one set of
> file length info from one collector, and on another poll it may get a
> different set from a different collector.
>
> This could be controlled somewhat with connection persistence (stickiness)
> on the LB side, so each agent would stay pinned to one backend collector
> unless that collector has an availability issue. What other issues do you
> foresee?
>
> The main reason I love the LB idea, in theory, is that we don't have to
> update any of the agents when we decide to add collectors to the cluster. We
> just have to update one place, the LB. In the default setup, when we add a
> collector we would have to update the config files on all the agents and
> have them reread the collector list. For a few boxes this isn't a big deal,
> but it is if you have a ton of boxes.
>
> Thanks for everyone's time! For real.
>
> -tp-
>
>
> Every few minutes, each agent process polls a collector to find the length
> of each file to which data is being written. The length of the file is then
> compared with the offset at which each chunk was to be written. If the file
> length exceeds this value, then the data has been committed and the agent
> process advances its checkpoint accordingly. (Note that the length returned
> by the filesystem is the amount of data that has been successfully
> replicated.)
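
The checkpoint rule in the passage above can be sketched roughly as follows.
This is an illustration, not Chukwa's implementation: PendingChunk,
AgentCheckpoint, and on_poll are hypothetical names. The sketch also makes
two assumptions. First, the recorded offset marks where the chunk's data ends
in the collector-side file, so a file length at or past it means the chunk is
covered. Second, the checkpoint only moves through chunks in send order.

```python
from dataclasses import dataclass, field


@dataclass
class PendingChunk:
    file: str        # collector-side file the chunk went to (hypothetical field)
    end_offset: int  # assumed: offset in that file where the chunk's data ends
    source_pos: int  # position in the agent's source log once this chunk is safe


@dataclass
class AgentCheckpoint:
    position: int = 0
    pending: list = field(default_factory=list)  # chunks in send order

    def on_poll(self, file_lengths):
        """Advance past leading chunks whose file length covers their offset."""
        while self.pending:
            chunk = self.pending[0]
            # A file the polled collector does not know about counts as length 0.
            if file_lengths.get(chunk.file, 0) >= chunk.end_offset:
                self.position = chunk.source_pos
                self.pending.pop(0)
            else:
                break  # stop at the first uncommitted chunk
        return self.position


cp = AgentCheckpoint(pending=[
    PendingChunk("sink-a", 100, 100),
    PendingChunk("sink-a", 200, 200),
    PendingChunk("sink-b", 100, 300),
])
first = cp.on_poll({"sink-a": 150})                  # only the first chunk is covered
second = cp.on_poll({})                              # collector knows neither file
third = cp.on_poll({"sink-a": 200, "sink-b": 100})   # everything is covered
print(first, second, third)
```

The second poll models a request that lands on a collector with no record of
either file. Under these assumptions the checkpoint simply does not advance,
so the agent would keep the data pending instead of losing it.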
>>
>>
> On Fri, Jul 29, 2011 at 4:37 PM, Eric Yang <er...@gmail.com> wrote:
>
>> Hi Tp,
>>
>> 1) Yes, Chukwa communicates over HTTP. By default, the collector listens
>> on port 8080.
>>
>> 2) If the agent has only one collector defined in its collector list, it
>> will retry the same collector after a pause of a few seconds.
>>
>> 3) There are 2 additional features for improving end-to-end reliability.
>> In the Chukwa collector, you can set httpConnector.asyncAcks=true. This
>> ensures the agent resends data if the data has not been committed. A second
>> method is to use localWriter to buffer the data on the collector's local
>> disk and periodically upload it to HDFS. Both options can be configured in
>> chukwa-collector-conf.xml.
>>
>> Hope this helps.
>>
>> regards,
>> Eric
>>
>> On Jul 29, 2011, at 11:04 AM, T. A. Smooth wrote:
>>
>> > Hello, I am checking out Chukwa. I have a few questions I was hoping the
>> > mailing list could answer :-)
>> >
>> > 1) Do Chukwa agents communicate with collectors over HTTP, or some other
>> > protocol?
>> >
>> > The agent configuration makes me believe that:
>> > http://incubator.apache.org/chukwa/docs/r0.4.0/admin.html#Configuration
>> >
>> > 2) From the docs, it seems an agent will pick a collector at random and
>> > then use that collector until there is a problem communicating with it.
>> > How do you think the agent/collector would act if they had a load
>> > balancer between them? For example, the agent configuration would have
>> > just one URL: http://collector-loadbalancer.example.com:8080/
>> >
>> > The load balancer would have 1 or more collectors behind it saving the
>> > chunks it receives to disk or Hadoop.
>> >
>> > 3) Does Chukwa have any “end-to-end” reliability features for message
>> > delivery? For example, a collector may receive the chunk from the agent
>> > but have a problem writing it to the data store (e.g. disk full,
>> > connection to Hadoop down). Will the agent be notified that the chunk was
>> > not processed for a certain reason, and be told to cache the missed
>> > message to disk?
>> >
>> > Thanks for the info!
>> >
>> > -tp-
>> >
>>
>>
>
>
> --
> *Splat*! <http://goog_843711221>
> <
http://www.webeclubbin.com/blog/2011/05/typoe-%E2%80%93-confetti-death-hypebeast/
>
>
> @CatDaaaady <https://twitter.com/#!/CatDaaaady>