Posted to dev@cassandra.apache.org by Boris Yen <yu...@gmail.com> on 2011/07/06 10:04:56 UTC

get_slice needs offset

Hi,

It seems an offset was implemented before but was removed because of
Cassandra-286. I was wondering whether it is possible to put it back,
because computing the offset on the client side should be less efficient
than doing it on the server side.

I think this could be achieved by tweaking both
CassandraServer.multigetSliceInternal and CassandraServer.getSlice. Let us
assume there is one extra attribute called "offset" inside the SliceRange
object. When the "offset" attribute exists in the SliceRange, the read
command should be created like "new SliceFromReadCommand(keyspace, key,
column_parent, range.start, range.finish, range.reversed,
range.count+range.offset)". Then the only thing left to do is to make
CassandraServer.getSlice aware of the "offset", so that it skips the first
"offset" columns and returns the right fragment of data.
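
A minimal sketch of the idea above, assuming an in-memory sorted map as a
stand-in for a row. The class and method names (OffsetSliceSketch, slice,
sliceWithOffset) are hypothetical and are not Cassandra's real classes; the
point is only to show "read count+offset, then trim the first offset".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration only; a TreeMap stands in for one row's columns.
class OffsetSliceSketch {

    // Emulates the existing slice read: up to `count` column names in [start, finish].
    static List<String> slice(NavigableMap<String, String> row,
                              String start, String finish, int count) {
        List<String> out = new ArrayList<>();
        for (String name : row.subMap(start, true, finish, true).keySet()) {
            if (out.size() == count) break;
            out.add(name);
        }
        return out;
    }

    // Proposed behavior: read count + offset columns, then drop the first `offset`.
    static List<String> sliceWithOffset(NavigableMap<String, String> row,
                                        String start, String finish,
                                        int count, int offset) {
        List<String> widened = slice(row, start, finish, count + offset);
        if (offset >= widened.size()) return new ArrayList<>();
        return new ArrayList<>(widened.subList(offset, widened.size()));
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (char c = 'a'; c <= 'j'; c++) row.put(String.valueOf(c), "v");
        // Skip 4 columns, then return 3.
        System.out.println(sliceWithOffset(row, "a", "j", 3, 4)); // prints [e, f, g]
    }
}
```

Note that the read still walks over the skipped columns; only the trimming
moves from the client to the server.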

Boris

Re: get_slice needs offset

Posted by Boris Yen <yu...@gmail.com>.
I suppose you mean using the last column seen as the "start" of the
SliceRange in the next query, so that the server can use the per-row
column index to jump to the right position in a file and retrieve the
column data.

Based on that, suppose an application/cassandra-client wants to implement a
function that lets its users freely select arbitrary pages of data to
browse. The application/cassandra-client needs to send two requests to
cassandra to accomplish this: one request to get the "start" of the
SliceRange, and another that uses the prepared SliceRange to retrieve the
data needed. From the perspective of the application/cassandra-client, it
takes more resources to handle this type of request, because the first
request actually needs to get all the columns >= "start" back. From the
perspective of cassandra, it needs to handle two requests instead of one,
and if we consider the consistency level, there might be even more requests
going on between cluster nodes.
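
To make the two-request pattern concrete, here is a rough sketch under the
same kind of assumption: a sorted map stands in for a row, and the names
(TwoRequestPageSketch, slice, page) are hypothetical rather than any real
client API. An empty "start" is taken to mean "from the first column".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration only; each call to slice() stands for one request.
class TwoRequestPageSketch {

    // Up to `count` column names >= start (an empty start means the beginning).
    static List<String> slice(NavigableMap<String, String> row, String start, int count) {
        List<String> out = new ArrayList<>();
        for (String name : row.tailMap(start, true).keySet()) {
            if (out.size() == count) break;
            out.add(name);
        }
        return out;
    }

    // Fetch page `pageIndex` (0-based) of `pageSize` columns using two requests.
    static List<String> page(NavigableMap<String, String> row, int pageIndex, int pageSize) {
        int offset = pageIndex * pageSize;
        // Request 1: pull offset + 1 columns only to learn the page's first column name.
        List<String> skipped = slice(row, "", offset + 1);
        if (skipped.size() <= offset) return new ArrayList<>();
        String start = skipped.get(offset);
        // Request 2: the actual page, starting at the discovered column.
        return slice(row, start, pageSize);
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (char c = 'a'; c <= 'j'; c++) row.put(String.valueOf(c), "v");
        System.out.println(page(row, 2, 3)); // prints [g, h, i]
    }
}
```

In this sketch the first request returns offset + 1 columns to the client
just to discover one name, which is the overhead described above.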

Actually, what I am proposing here is to use the "offset" to widen the range
of data fetched by a get_slice query. The internal mechanism of the
get_slice query remains the same, and "start" and "offset" can coexist. The
purpose of "offset" is to enlarge the "count" of the SliceRange, making the
effective "count" = "count+offset". The "offset" is then also used to trim
the data returned by the internal mechanism of the get_slice query.

Regards
Boris

On Wed, Jul 6, 2011 at 9:01 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> It was removed because the "right" way to do this is to page by
> last-column-seen, which can take advantage of the per-row column
> index.  Offset cannot.

Re: get_slice needs offset

Posted by Jonathan Ellis <jb...@gmail.com>.
It was removed because the "right" way to do this is to page by
last-column-seen, which can take advantage of the per-row column
index.  Offset cannot.
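
For illustration, paging by last-column-seen might look roughly like the
sketch below. It assumes a sorted map as a stand-in for a row, an inclusive
"start", and a row that does not change between requests; the names
(LastSeenPagingSketch, slice, allPages) are hypothetical, not a real client
API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration only; each call to slice() stands for one request.
class LastSeenPagingSketch {

    // Up to `count` column names >= start (an empty start means the beginning).
    static List<String> slice(NavigableMap<String, String> row, String start, int count) {
        List<String> out = new ArrayList<>();
        for (String name : row.tailMap(start, true).keySet()) {
            if (out.size() == count) break;
            out.add(name);
        }
        return out;
    }

    // Walk the whole row page by page, restarting each request at the last column seen.
    static List<List<String>> allPages(NavigableMap<String, String> row, int pageSize) {
        List<List<String>> pages = new ArrayList<>();
        String lastSeen = null;
        while (true) {
            List<String> cols;
            if (lastSeen == null) {
                cols = slice(row, "", pageSize);
            } else {
                // start is inclusive, so ask for one extra column and drop the repeat.
                cols = new ArrayList<>(slice(row, lastSeen, pageSize + 1));
                if (!cols.isEmpty() && cols.get(0).equals(lastSeen)) cols.remove(0);
            }
            if (cols.isEmpty()) break;
            pages.add(cols);
            lastSeen = cols.get(cols.size() - 1);
        }
        return pages;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (char c = 'a'; c <= 'j'; c++) row.put(String.valueOf(c), "v");
        System.out.println(allPages(row, 4)); // prints [[a, b, c, d], [e, f, g, h], [i, j]]
    }
}
```

Each request begins at a known column name, so no request has to read past
columns that were already returned.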




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com