Posted to user@cassandra.apache.org by "Desimpel, Ignace" <Ig...@nuance.com> on 2011/04/20 13:35:28 UTC

Question about AbstractType class

Cassandra version 0.7.4

 

Hi,

 

I created my own java class as an extension of the AbstractType class.
But I'm not sure about the following items related to the compare
function :

# The number of remaining bytes in the buffer is sometimes zero during a
Thrift get_slice execution, although I never store a zero-length column
name nor query for one. If this is normal, what would be the correct
handling of zero remaining bytes? Would it be something like:

public int compare(ByteBuffer o1, ByteBuffer o2) {
    int ar1Rem = o1.remaining();
    int ar2Rem = o2.remaining();
    if (ar1Rem == 0 || ar2Rem == 0) {
        if (ar1Rem != 0) {
            return 1;
        } else if (ar2Rem != 0) {
            return -1;
        } else {
            return 0;
        }
    }
    //Add the real compare here
    .......}

 

# Since in version 0.6.3 the same function was passing an array of
bytes, I assumed that I could now call the ByteBuffer.array() function
in order to get the array of bytes backing up the ByteBuffer. Also the
length of the byte array in 0.6.3 seemed always to correspond to the
bytes of column name stored. But now in version 0.7.4 that ByteBuffer is
not always backed by such an array.

I can still get around this by making the needed buffer myself like :

int ar2Rem = o2.remaining();
byte[] ar2 = new byte[ar2Rem];
o2.get(ar2, 0, ar2Rem);

The question is: are the remaining bytes the actual bytes of this column
name (e.g. 20 bytes), or could that ByteBuffer ever be a wrapper around
some larger stream of data, so that the number of remaining bytes could
be 10 M bytes? In that case I would not be able to detect the end of the
column to compare, and I might allocate a large, unneeded byte array.

 

#Using the ByteBuffer's 'get' function also updates the position of the
ByteBuffer. Is the compare function expected to do that or should it
reset the position back to what it was or ...?

 

 

Or maybe there is some good documentation I should read?

 

Ignace

 


RE: Question about AbstractType class

Posted by "Desimpel, Ignace" <Ig...@nuance.com>.
Thanks Sylvain. Your answer already helped me out a lot! I was using a
ByteBuffer.get function that changes the ByteBuffer's position, and
I got all kinds of strange effects and exceptions that I didn't get in
0.6.x. I changed that code and all the problems are gone...

Many thanks!!
Ignace

-----Original Message-----
From: Sylvain Lebresne [mailto:sylvain@datastax.com] 
Sent: Wednesday, April 20, 2011 4:04 PM
To: user@cassandra.apache.org
Subject: Re: Question about AbstractType class

On Wed, Apr 20, 2011 at 3:06 PM, Desimpel, Ignace
<Ig...@nuance.com> wrote:
> As said above, the remaining bytes won't (always) be the actual bytes.

Sorry I answered a bit quickly, I meant to say that the actual bytes
won't (always) be the full backing array.
That is, we never guarantee that BB.arrayOffset() == 0, nor
BB.position() == 0, nor BB.limit() == backingArray.length.
But the remaining() bytes will be the actual bytes, my bad.

--
Sylvain

Re: Question about AbstractType class

Posted by Sylvain Lebresne <sy...@datastax.com>.
On Wed, Apr 20, 2011 at 3:06 PM, Desimpel, Ignace
<Ig...@nuance.com> wrote:
> As said above, the remaining bytes won't (always) be the actual bytes.

Sorry I answered a bit quickly, I meant to say that the actual bytes
won't (always) be the full backing array.
That is, we never guarantee that BB.arrayOffset() == 0, nor
BB.position() == 0, nor BB.limit() == backingArray.length.
But the remaining() bytes will be the actual bytes, my bad.
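
To make the three non-guarantees concrete, here is a small sketch using only the standard java.nio API. The class name BufferViewDemo and the helper windowOnto are made up for illustration, and the shared array merely stands in for whatever larger buffer the server might hand out a view of; nothing here is taken from Cassandra's code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Cassandra.
final class BufferViewDemo {

    // Returns a view of `length` bytes starting at `start` in a larger array.
    static ByteBuffer windowOnto(byte[] shared, int start, int length) {
        return ByteBuffer.wrap(shared, start, length);
    }

    public static void main(String[] args) {
        // 4 bytes of padding, a 7-byte "column name", 4 more bytes of padding.
        byte[] shared = "xxxxcolumnAyyyy".getBytes(StandardCharsets.US_ASCII);

        ByteBuffer name = windowOnto(shared, 4, 7);
        // position() is 4 and limit() is 11, while array().length is 15.
        System.out.println(name.position() + " " + name.limit() + " " + name.array().length);

        ByteBuffer sliced = name.slice();
        // After slice(), position() is 0 but arrayOffset() is 4.
        System.out.println(sliced.position() + " " + sliced.arrayOffset());

        // In both views remaining() is 7, the length of the actual value.
        System.out.println(name.remaining() + " " + sliced.remaining());
    }
}
```

In both views the value is the remaining() bytes, and its first byte lives at array()[arrayOffset() + position()].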

--
Sylvain

RE: Question about AbstractType class

Posted by "Desimpel, Ignace" <Ig...@nuance.com>.

-----Original Message-----
From: Sylvain Lebresne [mailto:sylvain@datastax.com] 
Sent: Wednesday, April 20, 2011 2:07 PM
To: user@cassandra.apache.org
Subject: Re: Question about AbstractType class

On Wed, Apr 20, 2011 at 1:35 PM, Desimpel, Ignace <Ig...@nuance.com> wrote:
> Cassandra version 0.7.4
>
>
>
> Hi,
>
>
>
> I created my own java class as an extension of the AbstractType class. 
> But I'm not sure about the following items related to the compare function :
>
> # The remaining bytes of the buffer sometimes is zero during thrift 
> get_slice execution, however I never store any zero length column name 
> nor query for it . If normal, what would be the correct handling of 
> the zero remaining bytes?

It is normal: the empty ByteBuffer is used in slice queries to indicate the beginning of the row (start=""). More generally, compare and validate should work for anything you store, but also for anything you provide for the 'start' and 'end' arguments of slices.

> Would it be something like :
>
> public int compare(ByteBuffer o1, ByteBuffer o2) {
>     int ar1Rem = o1.remaining();
>     int ar2Rem = o2.remaining();
>     if (ar1Rem == 0 || ar2Rem == 0) {
>         if (ar1Rem != 0) {
>             return 1;
>         } else if (ar2Rem != 0) {
>             return -1;
>         } else {
>             return 0;
>         }
>     }
>     //Add the real compare here
>     .......}

That looks reasonable (though not optimal in the number of comparisons :))
->OK

> # Since in version 0.6.3 the same function was passing an array of 
> bytes, I assumed that I could now call the ByteBuffer.array() function 
> in order to get the array of bytes backing up the ByteBuffer.

It's not that simple. First, even if you use ByteBuffer.array(), you'll have to be careful that the ByteBuffer has a position, a limit and an arrayOffset and you should take that into account when accessing the backing array. But there is also no guarantee that the ByteBuffer will have a backing array so you need to handle this case too (I refer you to the ByteBuffer documentation).
->OK

> Also the length of the
> byte array in 0.6.3 seemed always to correspond to the bytes of column 
> name stored. But now in version 0.7.4 that ByteBuffer is not always 
> backed by such an array.
>
> I can still get around this by making the needed buffer myself like :
>
> int ar2Rem = o2.remaining();
> byte[] ar2 = new byte[ar2Rem];
> o2.get(ar2, 0, ar2Rem);
>
> Question is : Are the remaining bytes the actual bytes for this column 
> name
> (eg: 20 bytes) or would that ByteBuffer ever be some wrapper around 
> some larger stream of data and the remaining bytes number could be 10 M bytes.
> Thus I would not be able to detect the end of the column to compare 
> and I would possibly be allocating a large unneeded byte array?

As said above, the remaining bytes won't (always) be the actual bytes.
->Then how do I know the end is near? E.g., if the stored value is a char string, it would be nice to know the end, unless I also store it before the char string.
->Assuming both ByteBuffers hold the same data with the same position and limit, and thus the same remaining count, one can imagine a loop comparing each byte until the remaining bytes are used up. At that point I cannot get any more data, so I should return 0?

> #Using the ByteBuffer's 'get' function also updates the position of 
> the ByteBuffer. Is the compare function expected to do that or should 
> it reset the position back to what it was or ...?

Neither. You should *not* use any function that changes the ByteBuffer position.
That is, changing it and resetting it afterward is *not* ok.
->OK
Instead you should use only the absolute get() methods, which do not change the position at all.
Or, you start your compare function by calling BB.duplicate() on both buffers and then you're free to change the position of the duplicates.
->OK

--
Sylvain

Thanks Sylvain!

Re: Question about AbstractType class

Posted by Sylvain Lebresne <sy...@datastax.com>.
On Wed, Apr 20, 2011 at 1:35 PM, Desimpel, Ignace
<Ig...@nuance.com> wrote:
> Cassandra version 0.7.4
>
>
>
> Hi,
>
>
>
> I created my own java class as an extension of the AbstractType class. But
> I’m not sure about the following items related to the compare function :
>
> # The remaining bytes of the buffer sometimes is zero during thrift
> get_slice execution, however I never store any zero length column name nor
> query for it . If normal, what would be the correct handling of the zero
> remaining bytes?

It is normal: the empty ByteBuffer is used in slice queries to indicate the
beginning of the row (start=""). More generally, compare and validate
should work for anything you store, but also for anything you provide for
the 'start' and 'end' arguments of slices.

> Would it be something like :
>
> public int compare(ByteBuffer o1, ByteBuffer o2) {
>     int ar1Rem = o1.remaining();
>     int ar2Rem = o2.remaining();
>     if (ar1Rem == 0 || ar2Rem == 0) {
>         if (ar1Rem != 0) {
>             return 1;
>         } else if (ar2Rem != 0) {
>             return -1;
>         } else {
>             return 0;
>         }
>     }
>     //Add the real compare here
>     …….}

That looks reasonable (though not optimal in the number of comparisons :))
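
One way to get the same ordering with fewer tests is sketched below. It keeps the behaviour of the quoted code (an empty buffer sorts before any non-empty one, and two empty buffers are equal). The class EmptyNameGuard and the realCompare parameter are hypothetical names standing in for the custom type and its "real compare"; they are not part of Cassandra's AbstractType API.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;

// Hypothetical helper; not part of Cassandra's AbstractType API.
final class EmptyNameGuard {

    /**
     * Orders an empty buffer before any non-empty one and treats two empty
     * buffers as equal; otherwise defers to the caller-supplied comparison.
     */
    static int compare(ByteBuffer o1, ByteBuffer o2, Comparator<ByteBuffer> realCompare) {
        int r1 = o1.remaining();
        int r2 = o2.remaining();
        if (r1 == 0 || r2 == 0) {
            // One expression covers all three cases:
            // (0, n) -> -1, (n, 0) -> 1, (0, 0) -> 0.
            return Integer.signum(r1 - r2);
        }
        // Both buffers are non-empty: the type-specific comparison goes here.
        return realCompare.compare(o1, o2);
    }
}
```

Only remaining() is read here, so neither buffer's position is touched.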

> # Since in version 0.6.3 the same function was passing an array of bytes, I
> assumed that I could now call the ByteBuffer.array() function in order to
> get the array of bytes backing up the ByteBuffer.

It's not that simple. First, even if you use ByteBuffer.array(), you'll
have to be careful that the ByteBuffer has a position, a limit and an
arrayOffset and you should take that into account when accessing the
backing array. But there is also no guarantee that the ByteBuffer will
have a backing array so you need to handle this case too (I refer you to
the ByteBuffer documentation).
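
If a byte[] copy of the value is really needed, both cases can be handled roughly as follows. This is a sketch using only standard java.nio calls; BufferBytes and copyRemaining are illustrative names, not an existing Cassandra utility.

```java
import java.nio.ByteBuffer;

// Hypothetical helper; not an existing Cassandra utility.
final class BufferBytes {

    /** Copies the remaining bytes without moving the buffer's position. */
    static byte[] copyRemaining(ByteBuffer buffer) {
        byte[] out = new byte[buffer.remaining()];
        if (buffer.hasArray()) {
            // Array-backed: the value starts at arrayOffset() + position(),
            // not necessarily at index 0 of the backing array.
            System.arraycopy(buffer.array(),
                             buffer.arrayOffset() + buffer.position(),
                             out, 0, out.length);
        } else {
            // No accessible backing array (e.g. a direct or read-only buffer):
            // read through a duplicate so the original position stays as it was.
            buffer.duplicate().get(out);
        }
        return out;
    }
}
```

Copying on every comparison costs an allocation per call, so comparing in place is usually preferable inside compare itself.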

> Also the length of the
> byte array in 0.6.3 seemed always to correspond to the bytes of column name
> stored. But now in version 0.7.4 that ByteBuffer is not always backed by
> such an array.
>
> I can still get around this by making the needed buffer myself like :
>
> int ar2Rem = o2.remaining();
> byte[] ar2 = new byte[ar2Rem];
> o2.get(ar2, 0, ar2Rem);
>
> Question is : Are the remaining bytes the actual bytes for this column name
> (eg: 20 bytes) or would that ByteBuffer ever be some wrapper around some
> larger stream of data and the remaining bytes number could be 10 M bytes.
> Thus I would not be able to detect the end of the column to compare and I
> would possibly be allocating a large unneeded byte array?

As said above, the remaining bytes won't (always) be the actual bytes.

> #Using the ByteBuffer’s ‘get’ function also updates the position of the
> ByteBuffer. Is the compare function expected to do that or should it reset
> the position back to what it was or …?

Neither. You should *not* use any function that changes the ByteBuffer
position. That is, changing it and resetting it afterward is *not* ok.
Instead you should use only the absolute get() methods, which do not
change the position at all.
Or, you start your compare function by calling BB.duplicate() on both
buffers and then you're free to change the position of the duplicates.
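
Both options can be sketched as below. The ordering used (unsigned, byte-by-byte, with a shorter buffer sorting before a longer one that shares its prefix) is only an assumption for illustration, since the thread does not say how the custom type orders its names; the class name UnsignedBytesOrdering is likewise made up.

```java
import java.nio.ByteBuffer;

// Hypothetical example type; the byte-wise ordering is an assumption.
final class UnsignedBytesOrdering {

    /** Option 1: absolute get(int) only, so neither position is modified. */
    static int compare(ByteBuffer o1, ByteBuffer o2) {
        int p1 = o1.position();
        int p2 = o2.position();
        int common = Math.min(o1.remaining(), o2.remaining());
        for (int i = 0; i < common; i++) {
            // Absolute indices count from the start of the buffer, not from
            // position(), hence the p1 + i and p2 + i.
            int b1 = o1.get(p1 + i) & 0xFF;
            int b2 = o2.get(p2 + i) & 0xFF;
            if (b1 != b2) {
                return b1 < b2 ? -1 : 1;
            }
        }
        // Common prefix is equal: the shorter buffer sorts first, and equal
        // lengths (including two empty buffers) compare as 0.
        return Integer.signum(o1.remaining() - o2.remaining());
    }

    /** Option 2: relative get() on duplicates; the originals stay untouched. */
    static int compareWithDuplicates(ByteBuffer o1, ByteBuffer o2) {
        ByteBuffer d1 = o1.duplicate();
        ByteBuffer d2 = o2.duplicate();
        while (d1.hasRemaining() && d2.hasRemaining()) {
            int b1 = d1.get() & 0xFF;
            int b2 = d2.get() & 0xFF;
            if (b1 != b2) {
                return b1 < b2 ? -1 : 1;
            }
        }
        return Integer.signum(d1.remaining() - d2.remaining());
    }
}
```

With this ordering an empty buffer needs no special case: the loop runs zero times and the length comparison places it first.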

--
Sylvain