Posted to dev@cassandra.apache.org by al...@ceid.upatras.gr on 2011/07/18 17:01:59 UTC

Read asynchronously from multiple nodes

I am developing a new API call that reads results from multiple nodes.
I first send a message to each node and keep a list of handlers, one
for each message.

However, after all requests are sent, I can only call the handlers' get()
function sequentially: wait for one to finish, then call the next. This
causes unnecessary delay: if the second node has already computed and
sent its response but the first has not, the first handler's get() blocks
until the first node finishes, and only then can I advance to the second
node. Ideally, I would like to receive each node's output as soon as that
node has finished, regardless of the order in which the requests were sent.

I have read about the SEDA design, and I think a solution would be to
submit a callable for each node to the READ stage, while gathering the
results in a (synchronized) Vector. However, I can't figure out how to do
that, or whether it would work the way I describe.
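
To make the idea concrete, here is a minimal sketch of collecting results
in completion order. It uses only the standard java.util.concurrent
classes, not Cassandra's stage or messaging code: the thread pool merely
stands in for the READ stage, and readFrom() is a hypothetical placeholder
for "send a message to a node and wait for its reply". An
ExecutorCompletionService hands back futures as they finish, so no
synchronized Vector is needed; only the collecting thread touches the
result list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionOrderRead {

    // Hypothetical stand-in for a remote read: sleeps, then returns a reply.
    static Callable<String> readFrom(String node, long delayMillis) {
        return () -> {
            Thread.sleep(delayMillis); // simulates network and compute time
            return "rows from " + node;
        };
    }

    // Submits every task up front, then collects replies as they complete.
    static <T> List<T> gatherInCompletionOrder(ExecutorService pool,
                                               List<Callable<T>> tasks)
            throws InterruptedException, ExecutionException {
        CompletionService<T> completion = new ExecutorCompletionService<>(pool);
        for (Callable<T> task : tasks) {
            completion.submit(task);
        }
        List<T> results = new ArrayList<>();
        for (int i = 0; i < tasks.size(); i++) {
            // take() blocks only until the next task finishes, whichever it is.
            results.add(completion.take().get());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            tasks.add(readFrom("node1", 300));
            tasks.add(readFrom("node2", 50));
            tasks.add(readFrom("node3", 150));
            // Replies are printed as they arrive, not in submission order.
            for (String reply : gatherInCompletionOrder(pool, tasks)) {
                System.out.println(reply);
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether the same pattern can be layered directly on the READ stage depends
on how that stage exposes its executor, which this sketch does not assume.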

This might also be beneficial for getRangeSlice(), which, as far as I
understand, gets all results from one node before advancing to the next.
For particularly large range queries, it might be better to just send the
message to two nodes at once: if the next node responds with an empty
list, the coordinator knows it isn't necessary to send the request to
further nodes.
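
The two-at-a-time idea can be sketched as follows. This is an illustration
under stated assumptions, not how getRangeSlice() is implemented: the
nodes are plain strings in ring order, fetch is a hypothetical function
that returns one node's rows, and an empty reply is taken to mean that no
later node needs to be asked. Requests go out in batches of two, and the
scan stops at the first empty reply.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class PairedRangeScan {

    // Queries nodes two at a time, in order, and stops at the first empty reply.
    static <R> List<R> scan(ExecutorService pool, List<String> nodes,
                            Function<String, List<R>> fetch) throws Exception {
        List<R> rows = new ArrayList<>();
        for (int i = 0; i < nodes.size(); i += 2) {
            List<Callable<List<R>>> batch = new ArrayList<>();
            for (String node : nodes.subList(i, Math.min(i + 2, nodes.size()))) {
                batch.add(() -> fetch.apply(node));
            }
            // invokeAll runs the batch concurrently and returns the futures
            // in the same order as the batch, all of them completed.
            for (Future<List<R>> reply : pool.invokeAll(batch)) {
                List<R> part = reply.get();
                if (part.isEmpty()) {
                    return rows; // assumed end of range: no further batch is sent
                }
                rows.addAll(part);
            }
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical per-node data; node3 holds nothing in the range.
        Map<String, List<String>> data = new HashMap<>();
        data.put("node1", Arrays.asList("a", "b"));
        data.put("node2", Arrays.asList("c"));
        data.put("node3", Collections.<String>emptyList());
        data.put("node4", Arrays.asList("z"));
        Function<String, List<String>> fetch = data::get;

        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<String> nodes = Arrays.asList("node1", "node2", "node3", "node4");
            System.out.println(scan(pool, nodes, fetch)); // prints [a, b, c]
        } finally {
            pool.shutdown();
        }
    }
}
```

The trade-off is that the second request in a batch is wasted whenever the
first one comes back empty, and a real range query would also need to stop
once the requested row count has been reached.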

In short, could someone explain how to submit a callable to the READ
stage, and comment on the getRangeSlice() idea?

Alexander