Posted to dev@chukwa.apache.org by xzer <xi...@gmail.com> on 2012/03/30 07:04:27 UTC

Collecting customized information via Chukwa

Hi,

I am working on a solution for collecting statistics from our own
systems, and I found that Chukwa's architecture is a good fit for my
work. I have two questions:

1. It seems that Chukwa is designed for collecting information from
Hadoop clusters rather than for general-purpose information
collection. Is that right, or did I miss something?

2. If I did not miss anything, would it be possible to rework Chukwa
into a general framework for collecting information?

Best Regards

xzer

Re: Collecting customized information via Chukwa

Posted by Eric Yang <er...@gmail.com>.
Hi,

Chukwa is a generic collection system. It should be possible to extend
it to collect your log files and implement a demux parser for them.

Regards,
Eric

On Friday, March 30, 2012, xzer wrote:

> Hi,
>
> Thanks for your reply. I read about the projects you suggested, and I
> believe Scribe or Flume could fit my case.
>
> But I found that Flume has no defined tier for post-processing like
> the demux jobs in Chukwa. I know I could do this in a separate module
> of my own system, but I believe a built-in post-processing concept is
> better.
>
> In Scribe, on the other hand, there seem to be components called the
> continuous copier and parallel tailer for post-processing, but it is
> not clear how they work.
>
> My system is not as big as Facebook's or Google's; it has a small to
> medium load and only a few servers. I need a replicable, fast, and
> simple solution for log collection. I found Chukwa's architecture
> clear and simple for a medium-scale system, regardless of what it is
> actually doing.
>
> I think Scribe is too complex for me, and I am interested in whether
> it is possible to reuse Chukwa's control flow regardless of where I
> save my data. I would do more work on those frameworks, but I think
> abstracting Chukwa's control flow is a good idea because it is simple
> enough for a small system.
>
> Best Regards
>
> xzer
>
> On 2012-03-30 at 2:29 PM, Jiaqi Tan <tanjiaqi@gmail.com> wrote:
> > Hi,
> >
> > On Fri, Mar 30, 2012 at 1:04 PM, xzer <xiaozhu@gmail.com> wrote:
> >> Hi,
> >>
> >> I am working on a solution for collecting statistics from our own
> >> systems, and I found that Chukwa's architecture is a good fit for
> >> my work. I have two questions:
> >>
> >> 1. It seems that Chukwa is designed for collecting information
> >> from Hadoop clusters rather than for general-purpose information
> >> collection. Is that right, or did I miss something?
> >
> > Yes and no. You can run Chukwa collectors anywhere, but there are
> > quite a number of plugins designed for collecting Hadoop data. Also,
> > Chukwa deals with large volumes of collected data by writing to HDFS
> > (or HBase?), so it uses Hadoop as well.
> >
> >>
> >> 2. If I did not miss anything, would it be possible to rework
> >> Chukwa into a general framework for collecting information?
> >
> > There are a number of other frameworks for collecting log data and
> > other system information such as Facebook's Scribe, Cloudera's Flume,
> > and even more traditional systems such as Nagios or Ganglia.
> >
> >>
> >> Best Regards
> >>
> >> xzer
> >
> > Hope this helps,
> > Jiaqi
>
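Eric's suggestion of implementing a demux-style parser for custom log files can be illustrated with a small standalone sketch. Chukwa's real demux API (an `AbstractProcessor` subclass wired into the Demux MapReduce job) requires the Chukwa and Hadoop jars, so this self-contained Java example shows only the core idea: turning one raw log line into a structured record. The log layout (`timestamp key=value ...`), the class name, and the field names here are all hypothetical, not part of Chukwa itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DemuxSketch {
    /**
     * Parse a line like "2012-03-30 07:04:27 level=INFO msg=started"
     * into an ordered map of named fields. A real Chukwa processor
     * would emit a ChukwaRecord instead of a Map.
     */
    static Map<String, String> parseLine(String line) {
        Map<String, String> record = new LinkedHashMap<>();
        String[] parts = line.split(" ");
        // First two tokens form the timestamp (assumed log layout).
        record.put("timestamp", parts[0] + " " + parts[1]);
        // Remaining tokens are key=value pairs; malformed tokens are skipped.
        for (int i = 2; i < parts.length; i++) {
            String[] kv = parts[i].split("=", 2);
            if (kv.length == 2) {
                record.put(kv[0], kv[1]);
            }
        }
        return record;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("2012-03-30 07:04:27 level=INFO msg=started"));
    }
}
```

In Chukwa the equivalent logic would live in the processor's `parse` method, and the Demux job would route each chunk to the right processor by its data type.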

Re: Collecting customized information via Chukwa

Posted by xzer <xi...@gmail.com>.
Hi,

Thanks for your reply. I read about the projects you suggested, and I
believe Scribe or Flume could fit my case.

But I found that Flume has no defined tier for post-processing like
the demux jobs in Chukwa. I know I could do this in a separate module
of my own system, but I believe a built-in post-processing concept is
better.

In Scribe, on the other hand, there seem to be components called the
continuous copier and parallel tailer for post-processing, but it is
not clear how they work.

My system is not as big as Facebook's or Google's; it has a small to
medium load and only a few servers. I need a replicable, fast, and
simple solution for log collection. I found Chukwa's architecture
clear and simple for a medium-scale system, regardless of what it is
actually doing.

I think Scribe is too complex for me, and I am interested in whether
it is possible to reuse Chukwa's control flow regardless of where I
save my data. I would do more work on those frameworks, but I think
abstracting Chukwa's control flow is a good idea because it is simple
enough for a small system.

Best Regards

xzer

On 2012-03-30 at 2:29 PM, Jiaqi Tan <ta...@gmail.com> wrote:
> Hi,
>
> On Fri, Mar 30, 2012 at 1:04 PM, xzer <xi...@gmail.com> wrote:
>> Hi,
>>
>> I am working on a solution for collecting statistics from our own
>> systems, and I found that Chukwa's architecture is a good fit for my
>> work. I have two questions:
>>
>> 1. It seems that Chukwa is designed for collecting information from
>> Hadoop clusters rather than for general-purpose information
>> collection. Is that right, or did I miss something?
>
> Yes and no. You can run Chukwa collectors anywhere, but there are
> quite a number of plugins designed for collecting Hadoop data. Also,
> Chukwa deals with large volumes of collected data by writing to HDFS
> (or HBase?), so it uses Hadoop as well.
>
>>
>> 2. If I did not miss anything, would it be possible to rework Chukwa
>> into a general framework for collecting information?
>
> There are a number of other frameworks for collecting log data and
> other system information such as Facebook's Scribe, Cloudera's Flume,
> and even more traditional systems such as Nagios or Ganglia.
>
>>
>> Best Regards
>>
>> xzer
>
> Hope this helps,
> Jiaqi

Re: Collecting customized information via Chukwa

Posted by Jiaqi Tan <ta...@gmail.com>.
Hi,

On Fri, Mar 30, 2012 at 1:04 PM, xzer <xi...@gmail.com> wrote:
> Hi,
>
> I am working on a solution for collecting statistics from our own
> systems, and I found that Chukwa's architecture is a good fit for my
> work. I have two questions:
>
> 1. It seems that Chukwa is designed for collecting information from
> Hadoop clusters rather than for general-purpose information
> collection. Is that right, or did I miss something?

Yes and no. You can run Chukwa collectors anywhere, but there are
quite a number of plugins designed for collecting Hadoop data. Also,
Chukwa deals with large volumes of collected data by writing to HDFS
(or HBase?), so it uses Hadoop as well.

>
> 2. If I did not miss anything, would it be possible to rework Chukwa
> into a general framework for collecting information?

There are a number of other frameworks for collecting log data and
other system information such as Facebook's Scribe, Cloudera's Flume,
and even more traditional systems such as Nagios or Ganglia.

>
> Best Regards
>
> xzer

Hope this helps,
Jiaqi