Posted to user@cassandra.apache.org by Constantin Teodorescu <br...@gmail.com> on 2011/04/21 00:40:57 UTC

Will CQL in the upcoming Cassandra 0.8 work as I'm expecting?

My use case is as follows: about 70% of our jobs retrieve information by
keys, column names and ranges, and what we have tested so far suits our
needs.
However, the remaining 30% of the jobs involve a full sequential scan of all
records in the database.

I found some web pages describing CQL, the next good thing coming in the
Cassandra 0.8 release, and I'm wondering: will CQL execution involve
separate processes running simultaneously on all nodes in the cluster, which
do the "filtering and pre-sorting phase" on the locally stored data (using
indexes when available), followed by a "merge phase" executed on a single
node (the one that received the request)?
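
To make the question concrete, the two-phase model described above can be
sketched in a few lines of Python. This is only a toy illustration of the
idea being asked about, not a description of how Cassandra executes CQL. The
node names, the sample rows and every function name are made up for the
example, and plain in-memory lists stand in for the data stored on each node.

```python
import heapq

# Hypothetical per-node data: each node holds its own slice of (key, value) rows.
NODES = {
    "node1": [("k07", 12), ("k02", 40), ("k11", 5)],
    "node2": [("k05", 33), ("k01", 8)],
    "node3": [("k09", 21), ("k03", 50), ("k04", 2)],
}


def local_filter_and_sort(rows, min_value):
    """Phase 1, run on every node: keep matching rows and sort them by key."""
    return sorted(row for row in rows if row[1] >= min_value)


def coordinator_merge(sorted_parts):
    """Phase 2, run on the node that received the request: merge sorted runs."""
    return list(heapq.merge(*sorted_parts))


def scatter_gather(nodes, min_value):
    # In a real cluster phase 1 would run in parallel on each node;
    # here it is a plain loop to keep the sketch short.
    parts = [local_filter_and_sort(rows, min_value) for rows in nodes.values()]
    return coordinator_merge(parts)


result = scatter_gather(NODES, min_value=10)
print(result)
```

Because every node returns an already sorted run, the merging node only has
to interleave the runs instead of sorting the full result set again.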

Best regards,
Teo

Re: Will CQL in the upcoming Cassandra 0.8 work as I'm expecting?

Posted by Jonathan Ellis <jb...@gmail.com>.
The latter.

On Thu, Apr 21, 2011 at 1:24 AM, Constantin Teodorescu
<br...@gmail.com> wrote:
> Thank you very much, Ellis. I heard about Brisk two weeks ago and I'm
> already checking the DataStax web site twice a day, waiting for Brisk to
> come. It seems that it will be a good solution for us.
> One more question, please: does Brisk need to transfer intermediate data
> from Cassandra storage to some sort of Hadoop storage?
> Or does it work like this:
> [ Cassandra data ] --> [ Hadoop Job Tracker ] -->
> [ Parallel N-node MapReduce Task ] -->
> [ Cassandra storage for temporary results ] -->
> [ Final 1-node ReReduce Task ] -->
> [ Cassandra storage for final results ]
> Thank you for your time,
> Teo



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: Will CQL in the upcoming Cassandra 0.8 work as I'm expecting?

Posted by Constantin Teodorescu <br...@gmail.com>.
Thank you very much, Ellis. I heard about Brisk two weeks ago and I'm
already checking the DataStax web site twice a day, waiting for Brisk to
come. It seems that it will be a good solution for us.

One more question, please: does Brisk need to transfer intermediate data
from Cassandra storage to some sort of Hadoop storage?
Or does it work like this:

[ Cassandra data ] --> [ Hadoop Job Tracker ] -->
[ Parallel N-node MapReduce Task ] -->
[ Cassandra storage for temporary results ] -->
[ Final 1-node ReReduce Task ] -->
[ Cassandra storage for final results ]
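
The shape of the pipeline above can be sketched as a toy Python program.
This is only an illustration under loud assumptions: threads stand in for the
N nodes, in-memory lists and counters stand in for Cassandra storage, and
none of the names below are Hadoop, Brisk or Cassandra APIs. The job itself
(counting occurrences of each record value) is an arbitrary example of a
full scan.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical slices of records, one per node.
NODE_SLICES = [
    ["apple", "pear", "apple"],
    ["pear", "plum"],
    ["apple", "plum", "plum"],
]


def map_and_local_reduce(records):
    """Parallel step: scan one node's slice and emit partial counts."""
    return Counter(records)


def final_reduce(partials):
    """Final single-node step: combine the partial results into one answer."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


def run_job(slices):
    # One worker per slice plays the role of the parallel N-node tasks.
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        # "partials" plays the role of the temporary results.
        partials = list(pool.map(map_and_local_reduce, slices))
    return final_reduce(partials)


final = run_job(NODE_SLICES)
print(final)
```

The point of the sketch is that only the small partial results travel to the
final step, while the full scan of each slice stays with the worker that
holds it.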

Thank you for your time,
Teo

On Thu, Apr 21, 2011 at 5:38 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> You want to run map/reduce jobs for your use case. You can already do
> this with Cassandra (http://wiki.apache.org/cassandra/HadoopSupport),
> and DataStax is introducing Brisk soon to make it easier:
> http://www.datastax.com/products/brisk

Re: Will CQL in the upcoming Cassandra 0.8 work as I'm expecting?

Posted by Jonathan Ellis <jb...@gmail.com>.
You want to run map/reduce jobs for your use case. You can already do
this with Cassandra (http://wiki.apache.org/cassandra/HadoopSupport),
and DataStax is introducing Brisk soon to make it easier:
http://www.datastax.com/products/brisk

On Wed, Apr 20, 2011 at 9:36 PM, Jonathan Ellis <jb...@gmail.com> wrote:
> CQL changes the API, that is all.

Re: Will CQL in the upcoming Cassandra 0.8 work as I'm expecting?

Posted by Jonathan Ellis <jb...@gmail.com>.
CQL changes the API, that is all.



