Posted to dev@chukwa.apache.org by Eric Yang <ey...@yahoo-inc.com> on 2010/04/25 21:08:37 UTC

Data partitioning for demux

Hi all,

I am working on enhancing the reducer partitioning for demux.  It basically
boils down to two main use cases.

Case #1, demux is responsible for crunching large volumes of a few data
types (a dozen or so).  It will probably make more sense to partition the
reducers by time grouping + data type (extending TotalOrderPartitioner),
i.e. each reducer gets an evenly distributed workload based on its time
interval.  A distributed hash table like HBase/Voldemort could be the
downstream system to store/cache the data for serving.  This model is great
for collecting fixed-interval logs like Hadoop metrics, and for the
ExecAdaptor, which generates repetitive time series summaries.
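
To make this concrete, here is a rough sketch of what such a partitioner
could look like (the class name, conf property, and bucketing scheme below
are made up for illustration; a real implementation would extend
TotalOrderPartitioner with sampled split points, and this assumes the
record timestamp is available via ChukwaRecord.getTime()):

import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecord;
import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecordKey;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

public class TimeOrderedPartitioner
    implements Partitioner<ChukwaRecordKey, ChukwaRecord> {

  private long windowMs;

  public void configure(JobConf conf) {
    // hypothetical knob: width of one time bucket (default 5 minutes)
    windowMs = conf.getLong("demux.time.partition.window.ms", 5 * 60 * 1000L);
  }

  public int getPartition(ChukwaRecordKey key, ChukwaRecord value,
                          int numReduceTasks) {
    // bucket records by time interval so each reducer receives a slice
    // of the timeline instead of one whole data type
    long bucket = value.getTime() / windowMs;
    return (int) (bucket % numReduceTasks);
  }
}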

Case #2, demux is responsible for crunching hundreds of different data
types, but a small volume of each.  The current demux implementation uses
this model, where a single data type is reduced by one reducer slot
(ChukwaRecordPartitioner).  One drawback of this model is that all data
types must have similar volume; otherwise, the type with the largest volume
becomes the long tail of the mapreduce job.  Materialized reports are easy
to generate with this model because the single reducer per data type sees
all the data of that type in the given demux run.  This model works great
for many different applications all logging through the Chukwa Log4j
appender, e.g. web crawling or log file indexing/viewing.
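
The effect of that scheme is essentially a hash on the data type; roughly
(a paraphrase for illustration, not the actual ChukwaRecordPartitioner
source):

public int getPartition(ChukwaRecordKey key, ChukwaRecord value,
                        int numReduceTasks) {
  // every record of a given data type lands on the same reducer, so the
  // largest type dictates how long the reduce phase runs
  return (key.getReduceType().hashCode() & Integer.MAX_VALUE) % numReduceTasks;
}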

I am thinking of changing the default Chukwa demux implementation to case
#1 and restructuring the current demux as an Archive Organizer.  Any
suggestions or objections?

Regards,
Eric 


Re: Data partitioning for demux

Posted by Eric Yang <ey...@yahoo-inc.com>.
Hi Jerome,

The number of reducer slots is a static configuration from the job conf.
The partitioning algorithm should be optimized for the supported use cases.
It may be possible to configure the partitioning prior to job execution,
but I don't think fancy partitioning based on run-time data is a good idea:
it is most likely too costly, and it is difficult to make it perfectly
balanced.  Hence, the reducer partitioning enhancement is targeted at
improving demux performance for the monitoring use case; hopefully this
change benefits the Chukwa community's use cases.  I filed jira CHUKWA-481
for this issue.  The intent is only a knob to switch between hash
partitioning and ordered partitioning, not something to support a wide
range of partitioning algorithms.
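
In other words, the choice would stay a one-line job setting, along these
lines (the property name and the ordered partitioner class here are
hypothetical; only ChukwaRecordPartitioner is today's class):

JobConf conf = new JobConf(Demux.class);
// hypothetical switch between the current per-type hash and ordered
// time-based partitioning
if ("ordered".equals(conf.get("demux.partitioner", "hash"))) {
  conf.setPartitionerClass(TimeOrderedPartitioner.class);
} else {
  conf.setPartitionerClass(ChukwaRecordPartitioner.class);
}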

Regards,
Eric


Re: Data partitioning for demux

Posted by Jerome Boulon <jb...@netflix.com>.
Hi,
The partitioning function should be driven by the user, not decided at this
level.
The mapper class, the reducer class, and the partitioner should all be
driven by configuration.
There's no way for demux to do the right thing based on a static
configuration.  Even within the same demux run you may want to use a
different partitioning function for different dataTypes, so we need a
PartitionerManager that will select the right partitioner based on the
reduceType, similar to what we are doing to select the right parser/reducer
class.
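
A sketch of that idea (the class name and conf properties are invented
here, and the default class string assumes today's ChukwaRecordPartitioner):

import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecord;
import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecordKey;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;
import org.apache.hadoop.util.ReflectionUtils;

public class PartitionerManager
    implements Partitioner<ChukwaRecordKey, ChukwaRecord> {

  private final Map<String, Partitioner<ChukwaRecordKey, ChukwaRecord>>
      byType = new HashMap<String, Partitioner<ChukwaRecordKey, ChukwaRecord>>();
  private Partitioner<ChukwaRecordKey, ChukwaRecord> fallback;

  public void configure(JobConf conf) {
    // default to the current per-type hash when no override is registered;
    // both property names are invented for this sketch
    fallback = load(conf.get("demux.partitioner.default",
        "org.apache.hadoop.chukwa.extraction.demux.ChukwaRecordPartitioner"),
        conf);
    // mapping format: "ReduceType=partitioner.Class,..." pairs
    for (String pair : conf.get("demux.partitioner.map", "").split(",")) {
      String[] kv = pair.split("=", 2);
      if (kv.length == 2) byType.put(kv[0].trim(), load(kv[1].trim(), conf));
    }
  }

  @SuppressWarnings("unchecked")
  private Partitioner<ChukwaRecordKey, ChukwaRecord> load(String className,
      JobConf conf) {
    try {
      return (Partitioner<ChukwaRecordKey, ChukwaRecord>)
          ReflectionUtils.newInstance(Class.forName(className), conf);
    } catch (ClassNotFoundException e) {
      throw new RuntimeException("cannot load partitioner " + className, e);
    }
  }

  public int getPartition(ChukwaRecordKey key, ChukwaRecord value,
                          int numReduceTasks) {
    // pick the partitioner registered for this reduceType, if any
    Partitioner<ChukwaRecordKey, ChukwaRecord> p =
        byType.get(key.getReduceType());
    return (p != null ? p : fallback).getPartition(key, value, numReduceTasks);
  }
}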

The reason I'm saying this is that in the Hive world, nobody accesses the
SeqFile directly; only the Hive engine does, and since there's no index it
doesn't make sense to spend time/CPU/memory producing a file that is
globally sorted.  In that case you want the same number of rows per reducer
(%reducerCount): your proposal will be better than the current
implementation, but it will not be good for anybody who does not need a
globally sorted file.
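
For that even-rows case, hashing the whole key rather than just the data
type already gives the spread; illustratively (not any existing class):

public int getPartition(ChukwaRecordKey key, ChukwaRecord value,
                        int numReduceTasks) {
  // spreading every key (not every data type) across reducers evens out
  // row counts, at the price of any global ordering in the output files
  return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
}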

Could you open a Jira for this and I will add more comments on it?

/Jerome.
