
Posted to user@cassandra.apache.org by Adam Masters <ad...@gmail.com> on 2013/07/31 17:48:34 UTC

Hadoop - using SlicePredicate with wide rows

Hi all,

I need to limit a MapReduce job so that it only scans a specific range of
columns. The column family (CF) being processed has wide rows, so I've set
the 'widerow' argument of ConfigHelper.setInputColumnFamily() to true.

However, the word_count example on GitHub contains the following comment:

// this will cause the predicate to be ignored in favor of scanning everything as a wide row
ConfigHelper.setInputColumnFamily(job.getConfiguration(), KEYSPACE, COLUMN_FAMILY, true);

This suggests that ignoring the SlicePredicate for wide rows is by design -
and this is certainly the behavior I've been witnessing. In which case, how
do I limit the columns being scanned?

N.B. I can't set the 'widerow' flag to false, as that breaks Cassandra (too
many columns are loaded at once, causing an OutOfMemory-style exception).
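
For illustration only, one possible workaround (an assumption, not something
the word_count example documents) would be to leave 'widerow' set to true and
discard unwanted columns inside the mapper. The sketch below shows just that
filtering step in plain Java. ColumnRangeFilter, its method names, and the
date-style column names are hypothetical, and it assumes column names sort
byte-wise. It only limits what the job processes, not what Cassandra scans.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper: keeps only columns whose names fall in [start, end].
public class ColumnRangeFilter {

    // Lexicographic comparison treating each byte as unsigned.
    // Assumption: the CF's column comparator orders names byte-wise.
    public static int compareUnsigned(ByteBuffer a, ByteBuffer b) {
        int i = a.position();
        int j = b.position();
        while (i < a.limit() && j < b.limit()) {
            int x = a.get(i++) & 0xFF; // absolute get, position is untouched
            int y = b.get(j++) & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal, so the shorter name sorts first.
        return a.remaining() - b.remaining();
    }

    public static final Comparator<ByteBuffer> UNSIGNED =
            ColumnRangeFilter::compareUnsigned;

    // True when start <= name <= end (both bounds inclusive).
    public static boolean inRange(ByteBuffer name, ByteBuffer start, ByteBuffer end) {
        return compareUnsigned(name, start) >= 0 && compareUnsigned(name, end) <= 0;
    }

    // Copies the in-range entries of one page of columns into a new map.
    public static <V> SortedMap<ByteBuffer, V> filter(
            SortedMap<ByteBuffer, V> page, ByteBuffer start, ByteBuffer end) {
        SortedMap<ByteBuffer, V> kept = new TreeMap<ByteBuffer, V>(UNSIGNED);
        for (Map.Entry<ByteBuffer, V> e : page.entrySet()) {
            if (inRange(e.getKey(), start, end)) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static ByteBuffer bytes(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Stand-in for one page of columns handed to map() in wide-row mode.
        SortedMap<ByteBuffer, String> page = new TreeMap<ByteBuffer, String>(UNSIGNED);
        page.put(bytes("2013-06-30"), "a");
        page.put(bytes("2013-07-01"), "b");
        page.put(bytes("2013-07-31"), "c");
        page.put(bytes("2013-08-01"), "d");

        SortedMap<ByteBuffer, String> kept =
                filter(page, bytes("2013-07-01"), bytes("2013-07-31"));
        System.out.println(kept.values()); // prints [b, c]
    }
}
```

In a real job the same check would run against each column name the mapper
receives, so every column is still read from Cassandra and only skipped
afterwards.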

Many thanks,
Adam