Posted to user@cassandra.apache.org by cem <ca...@gmail.com> on 2013/02/15 17:52:33 UTC

virtual nodes + map reduce = too many mappers

Hi All,

I have just started to use virtual nodes. I set the number of virtual
nodes to 256 as recommended.

The problem I have is that when I run a MapReduce job, it creates
nodes * 256 splits, and therefore nodes * 256 mappers. This affects
performance, since the range queries have a lot of overhead.

Any suggestions for improving performance? It seems like I need to
lower the number of virtual nodes.

Best Regards,
Cem

Re: virtual nodes + map reduce = too many mappers

Posted by Eric Evans <ee...@acunu.com>.
On Sat, Feb 16, 2013 at 9:13 AM, Edward Capriolo <ed...@gmail.com> wrote:
> No one had ever tried vnodes with Hadoop until the OP did, or they
> would have noticed this. No one extensively used them with secondary
> indexes either, per the last ticket I mentioned.
>
> My mistake: they are not a default.
>
> I do think vnodes are awesome, and it's great that C* has the longer
> release cycle. I'm just saying I do not know what .0 and .1 releases
> are; they just seem like extended betas to me.

We should definitely aspire to better/more thorough QA, but at the
risk of making what sounds like an excuse, I would argue that this is
the nature of open source software development.  You "Release Early,
Release Often", and iterate with your early adopters to shake out the
missed bugs.

What's important, I think, is to minimize the impact on existing
users, and properly set expectations.  I don't see where we've failed
here, but I'm definitely open to hearing that I'm wrong (or how we
could have done better).

> On Fri, Feb 15, 2013 at 11:10 PM, Eric Evans <ee...@acunu.com> wrote:
>> On Fri, Feb 15, 2013 at 7:01 PM, Edward Capriolo <ed...@gmail.com> wrote:
>>> Seems like the Hadoop input format should combine the splits that are
>>> on the same node into the same map task, like Hadoop's
>>> CombineFileInputFormat can. I am not sure who recommends vnodes as the
>>> default, because this is now the second problem of this class (that I
>>> know of) where vnodes have extra overhead:
>>> https://issues.apache.org/jira/browse/CASSANDRA-5161
>>>
>>> This seems to be the standard operating practice in C* now: enable
>>> things in the default configuration, like new partitioners and newer
>>> features such as vnodes, even though they are not heavily tested in
>>> the wild or well understood, then deal with the fallout.
>>
>> Except that it is not in fact enabled by default; the default remains
>> 1-token-per-node.
>>
>> That said, the only way that a feature like this will ever be heavily
>> tested in the wild, and well understood, is if it is actually put to
>> use.  Speaking only for myself, I am grateful to users like Cem who
>> test new features and report the issues they find.
>>
>>> On Fri, Feb 15, 2013 at 11:52 AM, cem <ca...@gmail.com> wrote:
>>>> Hi All,
>>>>
>>>> I have just started to use virtual nodes. I set the number of virtual
>>>> nodes to 256 as recommended.
>>>>
>>>> The problem I have is that when I run a MapReduce job, it creates
>>>> nodes * 256 splits, and therefore nodes * 256 mappers. This affects
>>>> performance, since the range queries have a lot of overhead.
>>>>
>>>> Any suggestions for improving performance? It seems like I need to
>>>> lower the number of virtual nodes.
>>>>
>>>> Best Regards,
>>>> Cem
>>>>
>>>>
>>
>>
>>
>> --
>> Eric Evans
>> Acunu | http://www.acunu.com | @acunu



-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu

Re: virtual nodes + map reduce = too many mappers

Posted by Edward Capriolo <ed...@gmail.com>.
No one had ever tried vnodes with Hadoop until the OP did, or they
would have noticed this. No one extensively used them with secondary
indexes either, per the last ticket I mentioned.

My mistake: they are not a default.

I do think vnodes are awesome, and it's great that C* has the longer
release cycle. I'm just saying I do not know what .0 and .1 releases
are; they just seem like extended betas to me.

Edward


On Fri, Feb 15, 2013 at 11:10 PM, Eric Evans <ee...@acunu.com> wrote:
> On Fri, Feb 15, 2013 at 7:01 PM, Edward Capriolo <ed...@gmail.com> wrote:
>> Seems like the Hadoop input format should combine the splits that are
>> on the same node into the same map task, like Hadoop's
>> CombineFileInputFormat can. I am not sure who recommends vnodes as the
>> default, because this is now the second problem of this class (that I
>> know of) where vnodes have extra overhead:
>> https://issues.apache.org/jira/browse/CASSANDRA-5161
>>
>> This seems to be the standard operating practice in C* now: enable
>> things in the default configuration, like new partitioners and newer
>> features such as vnodes, even though they are not heavily tested in
>> the wild or well understood, then deal with the fallout.
>
> Except that it is not in fact enabled by default; the default remains
> 1-token-per-node.
>
> That said, the only way that a feature like this will ever be heavily
> tested in the wild, and well understood, is if it is actually put to
> use.  Speaking only for myself, I am grateful to users like Cem who
> test new features and report the issues they find.
>
>> On Fri, Feb 15, 2013 at 11:52 AM, cem <ca...@gmail.com> wrote:
>>> Hi All,
>>>
>>> I have just started to use virtual nodes. I set the number of virtual
>>> nodes to 256 as recommended.
>>>
>>> The problem I have is that when I run a MapReduce job, it creates
>>> nodes * 256 splits, and therefore nodes * 256 mappers. This affects
>>> performance, since the range queries have a lot of overhead.
>>>
>>> Any suggestions for improving performance? It seems like I need to
>>> lower the number of virtual nodes.
>>>
>>> Best Regards,
>>> Cem
>>>
>>>
>
>
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu

Re: virtual nodes + map reduce = too many mappers

Posted by Eric Evans <ee...@acunu.com>.
On Fri, Feb 15, 2013 at 7:01 PM, Edward Capriolo <ed...@gmail.com> wrote:
> Seems like the Hadoop input format should combine the splits that are
> on the same node into the same map task, like Hadoop's
> CombineFileInputFormat can. I am not sure who recommends vnodes as the
> default, because this is now the second problem of this class (that I
> know of) where vnodes have extra overhead:
> https://issues.apache.org/jira/browse/CASSANDRA-5161
>
> This seems to be the standard operating practice in C* now: enable
> things in the default configuration, like new partitioners and newer
> features such as vnodes, even though they are not heavily tested in
> the wild or well understood, then deal with the fallout.

Except that it is not in fact enabled by default; the default remains
1-token-per-node.

That said, the only way that a feature like this will ever be heavily
tested in the wild, and well understood, is if it is actually put to
use.  Speaking only for myself, I am grateful to users like Cem who
test new features and report the issues they find.

> On Fri, Feb 15, 2013 at 11:52 AM, cem <ca...@gmail.com> wrote:
>> Hi All,
>>
>> I have just started to use virtual nodes. I set the number of virtual
>> nodes to 256 as recommended.
>>
>> The problem I have is that when I run a MapReduce job, it creates
>> nodes * 256 splits, and therefore nodes * 256 mappers. This affects
>> performance, since the range queries have a lot of overhead.
>>
>> Any suggestions for improving performance? It seems like I need to
>> lower the number of virtual nodes.
>>
>> Best Regards,
>> Cem
>>
>>



-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu

Re: virtual nodes + map reduce = too many mappers

Posted by cem <ca...@gmail.com>.
Thanks Eric for the appreciation :)

The default split size is 64K rows, but ColumnFamilyInputFormat first
collects all token ranges and creates a split for each one. If you have
256 vnodes per node, it creates nodes * 256 splits even if you have no
data at all; the split size only comes into play when a single vnode
holds more than 64K rows.

A possible solution that came to my mind: we could simply extend
ColumnFamilySplit to carry a list of token ranges instead of just one.
Then there would be no need to create a mapper for each token range;
each mapper could do multiple range queries. But I don't know how to
combine the range queries, because a typical range query takes a start
and an end token, and with virtual nodes I realized that a node's token
ranges are not contiguous.
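
A minimal sketch of that idea, for illustration only (MultiRangeSplit and
TokenRange are hypothetical names, not classes from Cassandra's actual
Hadoop integration):

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical: one (start, end] token range, i.e. what a single
    // ColumnFamilySplit covers today.
    class TokenRange {
        final String startToken;
        final String endToken;

        TokenRange(String startToken, String endToken) {
            this.startToken = startToken;
            this.endToken = endToken;
        }
    }

    // Hypothetical: a split carrying several non-contiguous token ranges,
    // so one mapper can issue one range query per range instead of the
    // job scheduling one mapper per vnode range.
    class MultiRangeSplit {
        private final List<TokenRange> ranges = new ArrayList<TokenRange>();
        private final String[] locations; // replica endpoints shared by all ranges

        MultiRangeSplit(String[] locations) {
            this.locations = locations;
        }

        void addRange(TokenRange range) {
            ranges.add(range);
        }

        List<TokenRange> getRanges() {
            return ranges;
        }

        String[] getLocations() {
            return locations;
        }
    }

The input format could then hand each mapper one such split per node, and
the record reader would loop over getRanges(), issuing one range query per
entry; no merging of non-contiguous ranges would be needed at all.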

Best Regards,
Cem

On Sun, Feb 17, 2013 at 2:47 AM, Edward Capriolo <ed...@gmail.com> wrote:

> Split size does not have to equal block size.
>
>
> http://hadoop.apache.org/docs/r1.1.1/api/org/apache/hadoop/mapred/lib/CombineFileInputFormat.html
>
> An abstract InputFormat that returns CombineFileSplits from the
> InputFormat.getSplits(JobConf, int) method. Splits are constructed
> from the files under the input paths. A split cannot have files from
> different pools. Each split returned may contain blocks from different
> files. If a maxSplitSize is specified, then blocks on the same node
> are combined to form a single split. Blocks that are left over are
> then combined with other blocks in the same rack. If maxSplitSize is
> not specified, then blocks from the same rack are combined in a single
> split; no attempt is made to create node-local splits. If the
> maxSplitSize is equal to the block size, then this class is similar to
> the default splitting behaviour in Hadoop: each block is a locally
> processed split. Subclasses implement
> InputFormat.getRecordReader(InputSplit, JobConf, Reporter) to
> construct RecordReaders for CombineFileSplits.
>
> Hive offers a CombineHiveInputFormat:
>
> https://issues.apache.org/jira/browse/HIVE-74
>
> Essentially, combined input formats rock hard. If you have a directory
> with, say, 2000 files, you do not want 2000 splits and the overhead of
> starting and stopping 2000 mappers.
>
> If you enable a combine input format you can tune mapred.max.split.size,
> and the number of mappers is based (mostly) on the input size. This gives
> jobs that would otherwise create too many map tasks way more throughput,
> and stops them from monopolizing the map slots on the cluster.
>
> It would seem like all the extra splits from the vnode change could be
> combined back together.
>
> On Sat, Feb 16, 2013 at 8:21 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> > Wouldn't you have more than 256 splits anyway, given a normal amount of data?
> >
> > (Default split size is 64k rows.)
> >
> >> On Fri, Feb 15, 2013 at 7:01 PM, Edward Capriolo <ed...@gmail.com> wrote:
> >> Seems like the Hadoop input format should combine the splits that are
> >> on the same node into the same map task, like Hadoop's
> >> CombineFileInputFormat can. I am not sure who recommends vnodes as the
> >> default, because this is now the second problem of this class (that I
> >> know of) where vnodes have extra overhead:
> >> https://issues.apache.org/jira/browse/CASSANDRA-5161
> >>
> >> This seems to be the standard operating practice in C* now: enable
> >> things in the default configuration, like new partitioners and newer
> >> features such as vnodes, even though they are not heavily tested in
> >> the wild or well understood, then deal with the fallout.
> >>
> >>
> >> On Fri, Feb 15, 2013 at 11:52 AM, cem <ca...@gmail.com> wrote:
> >>> Hi All,
> >>>
> >>> I have just started to use virtual nodes. I set the number of virtual
> >>> nodes to 256 as recommended.
> >>>
> >>> The problem I have is that when I run a MapReduce job, it creates
> >>> nodes * 256 splits, and therefore nodes * 256 mappers. This affects
> >>> performance, since the range queries have a lot of overhead.
> >>>
> >>> Any suggestions for improving performance? It seems like I need to
> >>> lower the number of virtual nodes.
> >>>
> >>> Best Regards,
> >>> Cem
> >>>
> >>>
> >
> >
> >
> > --
> > Jonathan Ellis
> > Project Chair, Apache Cassandra
> > co-founder, http://www.datastax.com
> > @spyced
>

Re: virtual nodes + map reduce = too many mappers

Posted by Edward Capriolo <ed...@gmail.com>.
Split size does not have to equal block size.

http://hadoop.apache.org/docs/r1.1.1/api/org/apache/hadoop/mapred/lib/CombineFileInputFormat.html

An abstract InputFormat that returns CombineFileSplits from the
InputFormat.getSplits(JobConf, int) method. Splits are constructed
from the files under the input paths. A split cannot have files from
different pools. Each split returned may contain blocks from different
files. If a maxSplitSize is specified, then blocks on the same node
are combined to form a single split. Blocks that are left over are
then combined with other blocks in the same rack. If maxSplitSize is
not specified, then blocks from the same rack are combined in a single
split; no attempt is made to create node-local splits. If the
maxSplitSize is equal to the block size, then this class is similar to
the default splitting behaviour in Hadoop: each block is a locally
processed split. Subclasses implement
InputFormat.getRecordReader(InputSplit, JobConf, Reporter) to
construct RecordReaders for CombineFileSplits.

Hive offers a CombineHiveInputFormat:

https://issues.apache.org/jira/browse/HIVE-74

Essentially, combined input formats rock hard. If you have a directory
with, say, 2000 files, you do not want 2000 splits and the overhead of
starting and stopping 2000 mappers.

If you enable a combine input format you can tune mapred.max.split.size,
and the number of mappers is based (mostly) on the input size. This gives
jobs that would otherwise create too many map tasks way more throughput,
and stops them from monopolizing the map slots on the cluster.

It would seem like all the extra splits from the vnode change could be
combined back together.
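
For a feel of what enabling this looks like, a minimal sketch against the
Hadoop 1.x mapred API linked above (class names here are made up;
WrappedLineRecordReader is spelled out because CombineFileRecordReader
instantiates its per-chunk reader reflectively through a
(CombineFileSplit, Configuration, Reporter, Integer) constructor):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileSplit;
    import org.apache.hadoop.mapred.InputSplit;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.LineRecordReader;
    import org.apache.hadoop.mapred.RecordReader;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.hadoop.mapred.lib.CombineFileInputFormat;
    import org.apache.hadoop.mapred.lib.CombineFileRecordReader;
    import org.apache.hadoop.mapred.lib.CombineFileSplit;

    public class CombinedTextInputFormat
            extends CombineFileInputFormat<LongWritable, Text> {

        public CombinedTextInputFormat() {
            // Pack same-node blocks into splits of up to 128 MB, so the
            // mapper count tracks input size rather than file count.
            setMaxSplitSize(128L * 1024 * 1024);
        }

        @Override
        @SuppressWarnings({ "unchecked", "rawtypes" })
        public RecordReader<LongWritable, Text> getRecordReader(
                InputSplit split, JobConf conf, Reporter reporter)
                throws IOException {
            return new CombineFileRecordReader<LongWritable, Text>(conf,
                    (CombineFileSplit) split, reporter,
                    (Class) WrappedLineRecordReader.class);
        }

        // Reads one file chunk of a combined split with a plain line reader.
        public static class WrappedLineRecordReader
                implements RecordReader<LongWritable, Text> {
            private final LineRecordReader delegate;

            public WrappedLineRecordReader(CombineFileSplit split,
                    Configuration conf, Reporter reporter, Integer index)
                    throws IOException {
                FileSplit chunk = new FileSplit(split.getPath(index),
                        split.getOffset(index), split.getLength(index),
                        split.getLocations());
                delegate = new LineRecordReader(conf, chunk);
            }

            public boolean next(LongWritable key, Text value) throws IOException {
                return delegate.next(key, value);
            }
            public LongWritable createKey() { return delegate.createKey(); }
            public Text createValue() { return delegate.createValue(); }
            public long getPos() throws IOException { return delegate.getPos(); }
            public float getProgress() throws IOException { return delegate.getProgress(); }
            public void close() throws IOException { delegate.close(); }
        }
    }

With this in place the number of splits, and hence mappers, follows
maxSplitSize rather than the raw file or block count, per the javadoc
quoted above.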

On Sat, Feb 16, 2013 at 8:21 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> Wouldn't you have more than 256 splits anyway, given a normal amount of data?
>
> (Default split size is 64k rows.)
>
> On Fri, Feb 15, 2013 at 7:01 PM, Edward Capriolo <ed...@gmail.com> wrote:
>> Seems like the Hadoop input format should combine the splits that are
>> on the same node into the same map task, like Hadoop's
>> CombineFileInputFormat can. I am not sure who recommends vnodes as the
>> default, because this is now the second problem of this class (that I
>> know of) where vnodes have extra overhead:
>> https://issues.apache.org/jira/browse/CASSANDRA-5161
>>
>> This seems to be the standard operating practice in C* now: enable
>> things in the default configuration, like new partitioners and newer
>> features such as vnodes, even though they are not heavily tested in
>> the wild or well understood, then deal with the fallout.
>>
>>
>> On Fri, Feb 15, 2013 at 11:52 AM, cem <ca...@gmail.com> wrote:
>>> Hi All,
>>>
>>> I have just started to use virtual nodes. I set the number of virtual
>>> nodes to 256 as recommended.
>>>
>>> The problem I have is that when I run a MapReduce job, it creates
>>> nodes * 256 splits, and therefore nodes * 256 mappers. This affects
>>> performance, since the range queries have a lot of overhead.
>>>
>>> Any suggestions for improving performance? It seems like I need to
>>> lower the number of virtual nodes.
>>>
>>> Best Regards,
>>> Cem
>>>
>>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced

Re: virtual nodes + map reduce = too many mappers

Posted by Jonathan Ellis <jb...@gmail.com>.
Wouldn't you have more than 256 splits anyway, given a normal amount of data?

(Default split size is 64k rows.)

On Fri, Feb 15, 2013 at 7:01 PM, Edward Capriolo <ed...@gmail.com> wrote:
> Seems like the Hadoop input format should combine the splits that are
> on the same node into the same map task, like Hadoop's
> CombineFileInputFormat can. I am not sure who recommends vnodes as the
> default, because this is now the second problem of this class (that I
> know of) where vnodes have extra overhead:
> https://issues.apache.org/jira/browse/CASSANDRA-5161
>
> This seems to be the standard operating practice in C* now: enable
> things in the default configuration, like new partitioners and newer
> features such as vnodes, even though they are not heavily tested in
> the wild or well understood, then deal with the fallout.
>
>
> On Fri, Feb 15, 2013 at 11:52 AM, cem <ca...@gmail.com> wrote:
>> Hi All,
>>
>> I have just started to use virtual nodes. I set the number of virtual
>> nodes to 256 as recommended.
>>
>> The problem I have is that when I run a MapReduce job, it creates
>> nodes * 256 splits, and therefore nodes * 256 mappers. This affects
>> performance, since the range queries have a lot of overhead.
>>
>> Any suggestions for improving performance? It seems like I need to
>> lower the number of virtual nodes.
>>
>> Best Regards,
>> Cem
>>
>>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced

Re: virtual nodes + map reduce = too many mappers

Posted by Edward Capriolo <ed...@gmail.com>.
Seems like the Hadoop input format should combine the splits that are
on the same node into the same map task, like Hadoop's
CombineFileInputFormat can. I am not sure who recommends vnodes as the
default, because this is now the second problem of this class (that I
know of) where vnodes have extra overhead:
https://issues.apache.org/jira/browse/CASSANDRA-5161
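
As a sketch of just the grouping step (illustrative only, not Cassandra's
actual code): bucket the splits an input format produced by their first
replica location, so each node's many vnode splits could back a single
combined map task.

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    import org.apache.hadoop.mapred.InputSplit;

    public class SplitsByNode {
        // Bucket splits by their first reported replica endpoint; each
        // bucket would then feed one combined map task instead of one
        // task per vnode range.
        public static Map<String, List<InputSplit>> group(InputSplit[] splits)
                throws IOException {
            Map<String, List<InputSplit>> byNode =
                    new HashMap<String, List<InputSplit>>();
            for (InputSplit split : splits) {
                String[] locations = split.getLocations();
                String node = locations.length > 0 ? locations[0] : "(unknown)";
                List<InputSplit> bucket = byNode.get(node);
                if (bucket == null) {
                    bucket = new ArrayList<InputSplit>();
                    byNode.put(node, bucket);
                }
                bucket.add(split);
            }
            return byNode;
        }
    }

Turning each bucket back into a single Cassandra-aware split would still
need a record reader that issues one range query per underlying vnode
range, along the lines cem sketches elsewhere in this thread.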

This seems to be the standard operating practice in C* now: enable
things in the default configuration, like new partitioners and newer
features such as vnodes, even though they are not heavily tested in
the wild or well understood, then deal with the fallout.


On Fri, Feb 15, 2013 at 11:52 AM, cem <ca...@gmail.com> wrote:
> Hi All,
>
> I have just started to use virtual nodes. I set the number of virtual
> nodes to 256 as recommended.
>
> The problem I have is that when I run a MapReduce job, it creates
> nodes * 256 splits, and therefore nodes * 256 mappers. This affects
> performance, since the range queries have a lot of overhead.
>
> Any suggestions for improving performance? It seems like I need to
> lower the number of virtual nodes.
>
> Best Regards,
> Cem
>
>