Posted to user@cassandra.apache.org by Kanwar Sangha <ka...@mavenir.com> on 2013/08/21 02:57:43 UTC

Secondary Index Question

Hi - I was reading some blogs on the implementation of secondary indexes in Cassandra, and they say that "the read requests are sent sequentially to all the nodes". Is that right?

So if I have a query to fetch ALL records matching the secondary index filter, will the coordinator node send the requests to the nodes one by one?

Thanks,
Kanwar


Re: Secondary Index Question

Posted by Robert Coli <rc...@eventbrite.com>.
On Tue, Aug 20, 2013 at 5:57 PM, Kanwar Sangha <ka...@mavenir.com> wrote:

> Hi - I was reading some blogs on implementation of secondary indexes in
> Cassandra and they say that "the read requests are sent sequentially to all
> the nodes"?
>
> So if I have a query to fetch ALL records with the secondary index filter,
> will the co-ordinator node send the requests to nodes one by one?
>

Stock disclaimer about Cassandra secondary indexes:

Unless you actually need the feature of atomic update of the secondary
index with the base row, you are probably better off just using a manual
pseudo-Secondary-Index column family.

=Rob
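
For readers unfamiliar with the idea, here is a minimal sketch of what a manual pseudo-secondary-index column family could look like. It models the two column families with plain Python dicts rather than a real Cassandra client, and every name in it (`users`, `users_by_city`, `put_user`, `find_by_city`) is hypothetical and chosen only for illustration. The point is the shape of the data: the index is a second set of rows keyed by the indexed value, and the application maintains it by hand.

```python
from collections import defaultdict

# Base "column family": row key -> columns. (Hypothetical example data model.)
users = {}
# Manual index "column family": indexed value -> set of base row keys.
users_by_city = defaultdict(set)


def put_user(user_id, name, city):
    """Write the base row, then maintain the index row by hand."""
    old = users.get(user_id)
    if old is not None and old["city"] != city:
        # Remove the stale index entry; nothing does this for us.
        users_by_city[old["city"]].discard(user_id)
    users[user_id] = {"name": name, "city": city}
    # Second, separate write: NOT atomic with the base-row write above.
    users_by_city[city].add(user_id)


def find_by_city(city):
    """One lookup in the index row, then one read per matching base row."""
    row_keys = sorted(users_by_city.get(city, ()))
    return [dict(users[uid], id=uid) for uid in row_keys]


put_user("u1", "Ann", "Oslo")
put_user("u2", "Bob", "Oslo")
put_user("u1", "Ann", "Rome")  # moves u1 from the Oslo index row to Rome
print(find_by_city("Oslo"))
```

Because the base write and the index write are two separate operations, a failure between them can leave the index out of step with the base row. That is the trade-off against the built-in feature, which updates the index atomically with the base row.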

Re: Secondary Index Question

Posted by "Hiller, Dean" <De...@nrel.gov>.
Oh, I do know it is not to "see if one node can return the desired results",
as each node will have different results for your client: you get results
from the first node, then results from the second node, and so on. (I
remember having this discussion, but for the life of me I can't remember why
it is sequential... it may just be the overload thing.) Hopefully someone
else will respond with a better answer.

Dean

On 8/21/13 9:10 AM, "Kanwar Sangha" <ka...@mavenir.com> wrote:

>Thanks Dean. Any reason why it is sequential? Is it to avoid loading all
>the nodes and see if one node can return the desired results?


Re: Secondary Index Question

Posted by "Hiller, Dean" <De...@nrel.gov>.
Sorry, I forget why. Someone told me at the Cassandra conference. It
might be to avoid overloading the entire cluster at once: if every query
hit all nodes in parallel, then with 1000 nodes you could take out your
cluster by running just 5 queries. (This is why I use PlayOrm's querying.
In tons of use cases you don't want to query the entire cluster... usually
that is saved for a map/reduce type operation.)

Dean

On 8/21/13 9:10 AM, "Kanwar Sangha" <ka...@mavenir.com> wrote:

>Thanks Dean. Any reason why it is sequential? Is it to avoid loading all
>the nodes and see if one node can return the desired results?


RE: Secondary Index Question

Posted by Kanwar Sangha <ka...@mavenir.com>.
Thanks Dean. Any reason why it is sequential? Is it to avoid loading all the nodes and see if one node can return the desired results?




Re: Secondary Index Question

Posted by "Hiller, Dean" <De...@nrel.gov>.
Yup. There are other types of indexing, like that in PlayOrm, which do it differently so that not all nodes are hit. That works better if, for instance, you are partitioning your data and you query into just a single partition, so it doesn't put load on all the nodes. (Of course, you have to have a partition strategy: partition by, say, month, with the key being the timestamp of the beginning of the month, or maybe partition by account if you only query into accounts.)

It is feasible to roll your own as well (though you do need to worry about eventual consistency when rolling your own).

Later,
Dean
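
As a rough illustration of the partition-by-month strategy described above, here is a minimal sketch in plain Python. It is not PlayOrm's API or any Cassandra client; the names `index_by_month`, `month_partition_key`, `index_event` and `events_in_month` are hypothetical, and timestamps are assumed to be timezone-aware UTC datetimes. Each index row is keyed by the timestamp of the beginning of the month, so a query for one month reads a single index row instead of touching every node.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Index rows: partition key (start-of-month epoch seconds) -> [(event_time, row_key)].
# (Hypothetical layout; a real index row would live in its own column family.)
index_by_month = defaultdict(list)


def month_partition_key(ts):
    """Epoch seconds for the beginning of ts's month (ts assumed to be in UTC)."""
    start = datetime(ts.year, ts.month, 1, tzinfo=timezone.utc)
    return int(start.timestamp())


def index_event(row_key, ts):
    """Add one entry to the index row for the month that ts falls in."""
    index_by_month[month_partition_key(ts)].append((ts, row_key))


def events_in_month(year, month):
    """Read exactly one index row and return its row keys in time order."""
    key = month_partition_key(datetime(year, month, 1, tzinfo=timezone.utc))
    return [row_key for _, row_key in sorted(index_by_month.get(key, []))]


index_event("e1", datetime(2013, 8, 21, 2, 57, tzinfo=timezone.utc))
index_event("e2", datetime(2013, 8, 1, 0, 0, tzinfo=timezone.utc))
index_event("e3", datetime(2013, 9, 1, 0, 0, tzinfo=timezone.utc))
print(events_in_month(2013, 8))
```

Partitioning by account would work the same way, with the account id as the index row key. In either case the index entry and the base row are written separately, which is where the eventual consistency concern comes from.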
