Posted to user@cassandra.apache.org by mck <mc...@apache.org> on 2010/11/21 19:43:26 UTC

(newbie) ColumnFamilyOutputFormat only writes one column (per key)

(I'm new here so forgive any mistakes or mis-presumptions...)

I've set up cassandra-0.7.0-beta3 and populated it with
thrift-serialised objects via a scribe server. This seems a great way to
get thrift beans out of the application as soon as possible and have them
sitting in cassandra for later processing.

I then went to write an m/r job that deserialises the thrift objects and
aggregates the data accordingly into a new column family. But what I've
found is that ColumnFamilyOutputFormat will only write out one column
per key.

Alex Burkoff also reported this nearly two months ago, but nobody ever
replied...
 http://article.gmane.org/gmane.comp.db.cassandra.user/9325

Does anyone have any ideas?
Should it be possible to write multiple columns out?

This is very easy to reproduce. Use the contrib/wordcount example with
OUTPUT_REDUCER=cassandra and, in WordCount.java at line 132, add the line
marked with + below:

>              results.add(getMutation(key, sum));
> +            results.add(getMutation(new Text("doubled"), sum*2));

Only the last mutation for any key seems to be written.
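
To make the expectation concrete, here is a minimal sketch in plain Java
with no Cassandra or Hadoop dependencies. ColumnWrite, applyAll and
applyLastOnly are hypothetical names made up for this illustration; the
sketch only models the symptom (a list of two mutations for one key, of
which only the last shows up) and makes no claim about where the problem
actually lies inside ColumnFamilyOutputFormat.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultiColumnWriteSketch {

    /** Hypothetical stand-in for a thrift Mutation: one column name and its value. */
    static final class ColumnWrite {
        final String name;
        final long value;

        ColumnWrite(String name, long value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Mirrors the modified reducer: two column writes emitted for the same row key. */
    static List<ColumnWrite> reduce(String word, long sum) {
        List<ColumnWrite> results = new ArrayList<>();
        results.add(new ColumnWrite(word, sum));
        results.add(new ColumnWrite("doubled", sum * 2));
        return results;
    }

    /** Expected behaviour: every column write in the list ends up in the row. */
    static Map<String, Long> applyAll(List<ColumnWrite> writes) {
        Map<String, Long> row = new LinkedHashMap<>();
        for (ColumnWrite w : writes) {
            row.put(w.name, w.value);
        }
        return row;
    }

    /** Reported symptom (modelled, not diagnosed): only the last write survives. */
    static Map<String, Long> applyLastOnly(List<ColumnWrite> writes) {
        Map<String, Long> row = new LinkedHashMap<>();
        if (!writes.isEmpty()) {
            ColumnWrite last = writes.get(writes.size() - 1);
            row.put(last.name, last.value);
        }
        return row;
    }

    public static void main(String[] args) {
        List<ColumnWrite> writes = reduce("cassandra", 3);
        System.out.println("expected: " + applyAll(writes));
        System.out.println("reported: " + applyLastOnly(writes));
    }
}
```

With the modified wordcount reducer I would expect the row to look like
the applyAll result (both the word column and the "doubled" column), but
what I see in cassandra corresponds to the applyLastOnly result.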


~mck

-- 
echo '[q]sa[ln0=aln256%
Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc 

| www.semb.wever.org | www.sesat.no 
| www.finn.no        | http://xss-http-filter.sf.net


Re: (newbie) ColumnFamilyOutputFormat only writes one column (per key)

Posted by Mck <mc...@apache.org>.
> I then went to write an m/r job that deserialises the thrift objects and
> aggregates the data accordingly into a new column family. But what I've
> found is that ColumnFamilyOutputFormat will only write out one column
> per key.

I've entered a bug for this:
 https://issues.apache.org/jira/browse/CASSANDRA-1774

~mck