Posted to commits@cassandra.apache.org by "Jim Ancona (JIRA)" <ji...@apache.org> on 2010/10/29 00:28:20 UTC

[jira] Created: (CASSANDRA-1680) Behavior of column value ByteBuffer returned from Thrift calls is surprising

Behavior of column value ByteBuffer returned from Thrift calls is surprising
----------------------------------------------------------------------------

                 Key: CASSANDRA-1680
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1680
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 0.7 beta 2
            Reporter: Jim Ancona


The Thrift Column object has a public ByteBuffer value member. In order to safely use that member, you must first process it by calling TBaseHelper.rightSize(). The Thrift Column object does this when getValue() is called, but of course has no way to do so when the value ByteBuffer is accessed directly. The end result is that you get unexpected results when accessing the ByteBuffer's underlying byte array. I suppose that arguably users of Thrift should understand ByteBuffer semantics well enough to avoid getting burned, but I think in practice this is unlikely.

You can reproduce the problem with the following sequence of operations in cassandra-cli in the cassandra-0.7 branch:

[default@MyKeyspace] create column family CF1 with comparator=UTF8Type
76ab2284-e2e1-11df-93c0-e700f669bcfc
[default@MyKeyspace] set CF1[key][integer] = Integer(12345678987654321)
Value inserted.
[default@MyKeyspace] get CF1[key][integer]
=> (column=integer, value=12345678987654321, timestamp=1288304422843000)
[default@MyKeyspace] get CF1[key]         
=> (column=integer, value=-8104275257521291409654259134258589618690198366301968500366925384201062337700106679188936883799193604223363867889371831541230052432517523790062908835405316953744191265273683116032, timestamp=1288304422843000)
Returned 1 results.

Note that the first Get returns the value that was set, but the second returns a bogus value because it goes through a code path that uses the value ByteBuffer directly. I can supply a patch for cassandra-cli, but I wonder if there's something that can be done at the Thrift level to make it harder for this to occur.
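
The pitfall described above can be illustrated outside of Cassandra with plain java.nio. This is only a minimal sketch under stated assumptions: the class name ByteBufferWindowDemo, the helper readableBytes(), and the 4-byte padding around the value are hypothetical and do not come from Thrift or cassandra-cli. It assumes the value ByteBuffer is a window onto a larger backing array, so array() hands back the whole array rather than just the column value.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.Arrays;

public class ByteBufferWindowDemo {
    // Hypothetical helper: copies only the bytes between position and limit.
    // Works on a duplicate so the caller's position is left untouched.
    static byte[] readableBytes(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        // Assumption: simulate a larger frame with unrelated bytes before and
        // after the column value.
        byte[] value = new BigInteger("12345678987654321").toByteArray();
        byte[] frame = new byte[4 + value.length + 4];
        Arrays.fill(frame, (byte) 0x7f);
        System.arraycopy(value, 0, frame, 4, value.length);

        // The buffer exposes only the value, but shares the whole frame array.
        ByteBuffer buf = ByteBuffer.wrap(frame, 4, value.length);

        // Decoding the backing array picks up the surrounding bytes as well.
        BigInteger fromArray = new BigInteger(buf.array());
        // Decoding only the readable window recovers the stored value.
        BigInteger fromWindow = new BigInteger(readableBytes(buf));

        System.out.println("array():       " + fromArray);
        System.out.println("readableBytes: " + fromWindow);
    }
}
```

In this sketch the second line prints the value that was stored, while the first prints a much larger unrelated number, which is the same kind of symptom as the bogus result from get CF1[key].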



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (CASSANDRA-1680) Behavior of column value ByteBuffer returned from Thrift calls is surprising

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-1680.
---------------------------------------

    Resolution: Invalid

The ByteBuffer semantics are important for performance and are not going to change.  (see http://blog.rapleaf.com/dev/2010/10/19/striving-for-zero-copies-with-thrift-0.5/) 

As far as Cassandra is concerned, you should probably use Hector instead of raw Thrift.

> Behavior of column value ByteBuffer returned from Thrift calls is surprising
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-1680
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1680
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.7 beta 3
>            Reporter: Jim Ancona
>
> The Thrift Column object has a public ByteBuffer value member. In order to safely use that member, you must first process it by calling TBaseHelper.rightSize(). The Thrift Column object does this when getValue() is called, but of course has no way to do so when the value ByteBuffer is accessed directly. The end result is that you get unexpected results when accessing the ByteBuffer's underlying byte array. I suppose that arguably users of thrift should understand ByteBuffer semantics well enough to avoid getting burned, but I think in practice this is unlikely. 
> You can reproduce the problem with the following sequence of operations in cassandra-cli in the cassandra-0.7 branch:
> {code} 
> [default@MyKeyspace] create column family CF1 with comparator=UTF8Type
> 76ab2284-e2e1-11df-93c0-e700f669bcfc
> [default@MyKeyspace] set CF1[key][integer] = Integer(12345678987654321)
> Value inserted.
> [default@MyKeyspace] get CF1[key][integer]
> => (column=integer, value=12345678987654321, timestamp=1288304422843000)
> [default@MyKeyspace] get CF1[key]         
> => (column=integer, value=-8104275257521291409654259134258589618690198366301968500366925384201062337700106679188936883799193604223363867889371831541230052432517523790062908835405316953744191265273683116032, timestamp=1288304422843000)
> Returned 1 results.
> {code} 
> Note that the first Get returns the value that was set, but the second returns a bogus value because it goes through a code path that uses the value ByteBuffer directly. I can supply a patch for cassandra-cli, but I wonder if there's something that can be done at the Thrift level to make it harder for this to occur.



[jira] Updated: (CASSANDRA-1680) Behavior of column value ByteBuffer returned from Thrift calls is surprising

Posted by "Jim Ancona (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Ancona updated CASSANDRA-1680:
----------------------------------

    Affects Version/s:     (was: 0.7 beta 2)
                       0.7 beta 3




[jira] Commented: (CASSANDRA-1680) Behavior of column value ByteBuffer returned from Thrift calls is surprising

Posted by "Jim Ancona (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926030#action_12926030 ] 

Jim Ancona commented on CASSANDRA-1680:
---------------------------------------

I am using Hector for my own work. cassandra-cli isn't, and is doing it wrong. I'll file a separate bug.

So if the ByteBuffer semantics are important, should Column.getValue() have the side effect of calling setValue(TBaseHelper.rightSize(value)), which does a copy of the byte array?
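
To make the question concrete, here is a rough sketch of what a "right-sizing" step does. It is not Thrift's actual TBaseHelper.rightSize() implementation; the class RightSizeSketch and the method rightSizeSketch() are hypothetical names, and the exact condition Thrift uses may differ. The point it shows is that whenever the buffer is a window onto a larger array, producing an exactly sized buffer requires a copy.

```java
import java.nio.ByteBuffer;

public class RightSizeSketch {
    // Hypothetical stand-in for a right-sizing helper: returns a buffer whose
    // backing array holds exactly the readable bytes.
    static ByteBuffer rightSizeSketch(ByteBuffer in) {
        if (in == null) {
            return null;
        }
        boolean alreadyExact = in.hasArray()
                && in.arrayOffset() == 0
                && in.position() == 0
                && in.remaining() == in.array().length;
        if (alreadyExact) {
            return in; // no copy needed
        }
        // Otherwise copy the readable window into a fresh array.
        byte[] exact = new byte[in.remaining()];
        in.duplicate().get(exact);
        return ByteBuffer.wrap(exact);
    }

    public static void main(String[] args) {
        byte[] frame = {1, 2, 3, 4, 5, 6};
        ByteBuffer window = ByteBuffer.wrap(frame, 2, 3);

        ByteBuffer sized = rightSizeSketch(window);
        System.out.println(sized.array().length); // 3
        System.out.println(sized == window);      // false: a copy was made

        ByteBuffer exact = ByteBuffer.wrap(new byte[] {7, 8});
        System.out.println(rightSizeSketch(exact) == exact); // true: returned as-is
    }
}
```

A getter that stores the result back into the field, as the comment above describes, would pay for that copy on the first call and then hand out the exactly sized buffer afterwards.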




[jira] Updated: (CASSANDRA-1680) Behavior of column value ByteBuffer returned from Thrift calls is surprising

Posted by "Jim Ancona (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Ancona updated CASSANDRA-1680:
----------------------------------

    Description: 
The Thrift Column object has a public ByteBuffer value member. In order to safely use that member, you must first process it by calling TBaseHelper.rightSize(). The Thrift Column object does this when getValue() is called, but of course has no way to do so when the value ByteBuffer is accessed directly. The end result is that you get unexpected results when accessing the ByteBuffer's underlying byte array. I suppose that arguably users of thrift should understand ByteBuffer semantics well enough to avoid getting burned, but I think in practice this is unlikely. 

You can reproduce the problem with the following sequence of operations in cassandra-cli in the cassandra-0.7 branch:

{code} 
[default@MyKeyspace] create column family CF1 with comparator=UTF8Type
76ab2284-e2e1-11df-93c0-e700f669bcfc
[default@MyKeyspace] set CF1[key][integer] = Integer(12345678987654321)
Value inserted.
[default@MyKeyspace] get CF1[key][integer]
=> (column=integer, value=12345678987654321, timestamp=1288304422843000)
[default@MyKeyspace] get CF1[key]         
=> (column=integer, value=-8104275257521291409654259134258589618690198366301968500366925384201062337700106679188936883799193604223363867889371831541230052432517523790062908835405316953744191265273683116032, timestamp=1288304422843000)
Returned 1 results.
{code} 

Note that the first Get returns the value that was set, but the second returns a bogus value because it goes through a code path that uses the value ByteBuffer directly. I can supply a patch for cassandra-cli, but I wonder if there's something that can be done at the Thrift level to make it harder for this to occur.



  was:
The Thrift Column object has a public ByteBuffer value member. In order to safely use that member, you must first process it call TBaseHelper.rightSize(). The Thrift Column object does this when getValue() is called, but of course has no way to do so when the value ByteBuffer is accessed directly. The end result is that you get unexpected results when accessing the ByteBuffer's underlying byte array. I suppose that arguably users of thrift should understand ByteBuffer semantics well enough to avoid getting burned, but I think in practice this is unlikely. 

You can reproduce the problem with the following sequence of operations in cassandra-cli in the cassandra-0.7 branch:

[default@MyKeyspace] create column family CF1 with comparator=UTF8Type
76ab2284-e2e1-11df-93c0-e700f669bcfc
[default@MyKeyspace] set CF1[key][integer] = Integer(12345678987654321)
Value inserted.
[default@MyKeyspace] get CF1[key][integer]
=> (column=integer, value=12345678987654321, timestamp=1288304422843000)
[default@MyKeyspace] get CF1[key]         
=> (column=integer, value=-8104275257521291409654259134258589618690198366301968500366925384201062337700106679188936883799193604223363867889371831541230052432517523790062908835405316953744191265273683116032, timestamp=1288304422843000)
Returned 1 results.

Note that the first Get returns the value that was set, but the second returns a bogus value because it goes through a code path that uses the value ByteBuffer directly. I can supply a patch for cassandra-cli, but I wonder if there's something that can be done at the Thrift level to make it harder for this to occur.




> Behavior of column value ByteBuffer returned from Thrift calls is surprising
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-1680
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1680
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.7 beta 2
>            Reporter: Jim Ancona
>
> The Thrift Column object has a public ByteBuffer value member. In order to safely use that member, you must first process it by calling TBaseHelper.rightSize(). The Thrift Column object does this when getValue() is called, but of course has no way to do so when the value ByteBuffer is accessed directly. The end result is that you get unexpected results when accessing the ByteBuffer's underlying byte array. I suppose that arguably users of thrift should understand ByteBuffer semantics well enough to avoid getting burned, but I think in practice this is unlikely. 
> You can reproduce the problem with the following sequence of operations in cassandra-cli in the cassandra-0.7 branch:
> {code} 
> [default@MyKeyspace] create column family CF1 with comparator=UTF8Type
> 76ab2284-e2e1-11df-93c0-e700f669bcfc
> [default@MyKeyspace] set CF1[key][integer] = Integer(12345678987654321)
> Value inserted.
> [default@MyKeyspace] get CF1[key][integer]
> => (column=integer, value=12345678987654321, timestamp=1288304422843000)
> [default@MyKeyspace] get CF1[key]         
> => (column=integer, value=-8104275257521291409654259134258589618690198366301968500366925384201062337700106679188936883799193604223363867889371831541230052432517523790062908835405316953744191265273683116032, timestamp=1288304422843000)
> Returned 1 results.
> {code} 
> Note that the first Get returns the value that was set, but the second returns a bogus value because it goes through a code path that uses the value ByteBuffer directly. I can supply a patch for cassandra-cli, but I wonder if there's something that can be done at the Thrift level to make it harder for this to occur.
