Posted to general@hadoop.apache.org by Kim Vogt <vo...@llnl.gov> on 2009/12/02 00:14:22 UTC
Re: Scribe vs. Chukwa

Hi Eric,

Thanks for the info.  It doesn't look like Scribe comes with a log file
viewer, but I could be wrong?

I'm attempting to install Scribe now, and hopefully if I have time, do the
same with Chukwa to get a better feel for both.  I'm interested in checking
out the reporting tools Chukwa offers.

-Kim

On 11/30/09 1:55 PM, "Eric Yang" <ey...@yahoo-inc.com> wrote:

> Hi Kim,
> 
> Scribe works well for simple deployments.  The complexity increases when
> the "central scribe server" is a multi-machine deployment.  Basically, it
> requires a reverse proxy to load balance the data collection.  (
> http://www.cloudera.com/blog/2008/11/02/configuring-and-using-scribe-for-hadoop-log-collection/
> )  I have not used scribe personally, so someone else could fill in
> their experience.
> 
> Chukwa was designed to be a fault-tolerant log collection/analytics
> platform.  Each chukwa agent automatically creates its own routing table to chukwa
> collectors.  Therefore, Chukwa does not require a reverse proxy.  However,
> Chukwa Agent requires knowledge of all collector addresses, hence the
> initial deployment complexity may be a little higher than scribe.  The
> largest test for Chukwa deployment was 50 Chukwa collectors running on top
> of 100 dedicated hadoop nodes to process log files from a data center.
> (Which was decommissioned due to lack of log files)  Based on my experience,
> a single collector with 8GB allocated RAM could handle all log files from
> 2000 hadoop nodes + System Metrics (top, df, sar, iostat, netstat, vmstat
> output).  
> 
> Chukwa does not have a direct log file viewer; instead, it has an analytics
> engine which computes various facts and provides reports.  There are frequent
> requests for a log file viewer, but it hasn't been implemented.  We only have
> a command-line utility to dump the log files because it is difficult to view
> terabytes of log files.  At some point in the future, when a full body index
> engine is implemented, we will provide log file search.
> 
> In essence, it depends on what you are looking for.  If you are looking for
> simple log collection and a viewer, Scribe is probably a good tool.  If you
> are looking for a log collection and reporting platform, Chukwa is a good
> solution.
> 
> Regards,
> Eric
> 
> On 11/30/09 11:54 AM, "Kim Vogt" <vo...@llnl.gov> wrote:
> 
>> Hi,
>> 
>> My team is looking into using Scribe or Chukwa for hadoop log collection.  I
>> was wondering if anyone had any opinions about one vs. the other?  I
>> apologize if this topic was covered before, but I don't see a link to the
>> archives for this mailing list.
>> 
>> Thanks,
>> 
>> Kim
> 
>
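
To illustrate the point above that a Chukwa agent knows every collector address and so needs no reverse proxy, here is a minimal sketch of agent-side failover. It is not Chukwa's actual code: the function name send_with_failover, the send callback, and the collector addresses are all hypothetical, and the random ordering is only one assumed way to spread load across collectors.

```python
import random
from typing import Callable, Iterable, Optional


def send_with_failover(
    collectors: Iterable[str],
    payload: bytes,
    send: Callable[[str, bytes], None],
    rng: Optional[random.Random] = None,
) -> str:
    """Try each known collector in random order.

    Returns the address of the first collector that accepted the payload.
    `send` is a hypothetical transport callback that raises OSError on failure.
    """
    order = list(collectors)
    # Shuffling spreads agents across collectors without a load balancer.
    (rng or random).shuffle(order)
    last_error: Optional[Exception] = None
    for address in order:
        try:
            send(address, payload)
            return address
        except OSError as exc:
            # This collector is unreachable; fall through to the next one.
            last_error = exc
    raise ConnectionError("no collector accepted the payload") from last_error


if __name__ == "__main__":
    # Hypothetical transport: only collector-b is reachable.
    def fake_send(address: str, payload: bytes) -> None:
        if address != "collector-b:8080":
            raise OSError("unreachable")

    chosen = send_with_failover(
        ["collector-a:8080", "collector-b:8080"], b"log line", fake_send
    )
    print(chosen)  # prints "collector-b:8080"
```

The trade-off matches the one described in the thread: each agent has to be configured with the full collector list up front, but no extra proxy tier sits between agents and collectors.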