Posted to user@cassandra.apache.org by James Campbell <ja...@breachintelligence.com> on 2014/04/22 15:43:48 UTC
BulkOutputFormat and CQL3
Hi Cassandra Users-
I have a Hadoop job that uses the pattern in Cassandra 2.0.6's hadoop_cql3_word_count example to load data from HDFS into Cassandra. Having read that BulkOutputFormat may significantly increase write throughput from Hadoop to Cassandra, I am considering testing that pattern (http://www.datastax.com/dev/blog/improved-hadoop-output, http://shareitexploreit.blogspot.com/2012/03/bulkloadto-cassandra-with-hadoop.html).
Is it possible/supported/recommended to use the BulkOutputFormat to load data from Hadoop to a CQL3 table in Cassandra?
I see several examples of building composite keys using Hector (e.g. http://www.datastax.com/dev/blog/introduction-to-composite-columns-part-1, http://brianoneill.blogspot.com/2012/09/composite-keys-connecting-dots-between.html), but the changes made to support CQL3 have left a lot of conflicting documentation out there for different versions, so it's not clear to me what the "proper" way is to build the requisite ByteBuffer, List<Mutation> pairs that ColumnFamilyOutputFormat (and so BulkOutputFormat) needs.
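To make the question concrete, here is a minimal sketch of the byte layout I believe is involved, assuming the cell names of a non-compact CQL3 table are CompositeType values (clustering column values followed by the CQL column name), with each component packed as a 2-byte length, the raw bytes, and a 1-byte end-of-component marker. It uses only the JDK; the class name CompositeSketch, the helper names, and the example values are hypothetical, and wrapping the result in Thrift Mutation objects is not shown.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Cassandra: packs components in the
// CompositeType layout as I understand it.
class CompositeSketch {

    // For each component: 2-byte big-endian length, the bytes,
    // then a single end-of-component byte (0).
    static ByteBuffer composite(byte[]... components) {
        int size = 0;
        for (byte[] c : components) {
            size += 2 + c.length + 1;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (byte[] c : components) {
            out.putShort((short) c.length);
            out.put(c);
            out.put((byte) 0);
        }
        out.flip(); // make the buffer readable from position 0
        return out;
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed example: one text clustering value plus the CQL column name.
        ByteBuffer cellName = composite(utf8("2014-04-22"), utf8("count"));
        System.out.println(cellName.remaining() + " bytes");
    }
}
```

If that assumption holds, a buffer like this would serve as the column name inside each Mutation, and the same packing would apply to a multi-column partition key used as the ByteBuffer row key.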
James