Posted to user@cassandra.apache.org by Lee Parker <le...@socialagency.com> on 2010/04/15 17:39:58 UTC

timestamp not found

We are currently migrating about 70 GB of data from MySQL to Cassandra.  I am
occasionally getting the following error:

Required field 'timestamp' was not found in serialized data! Struct:
Column(name:74 65 78 74, value:44 61 73 20 6C 69 65 62 20 69 63 68 20 76 6F
6E 20 23 49 6E 61 3A 20 68 74 74 70 3A 2F 2F 77 77 77 2E 79 6F 75 74 75 62
65 2E 63 6F 6D 2F 77 61 74 63 68 3F 76 3D 70 75 38 4B 54 77 79 64 56 77 6B
26 66 65 61 74 75 72 65 3D 72 65 6C 61 74 65 64 20 40 70 6A 80 01 00 01 00,
timestamp:0)

The loop that builds the mutation map for the batch_mutate call adds a
timestamp to each column.  I have verified that the timestamp is present for
several calls, and if the logic were bad I would expect to see the error
more frequently.  Does anyone have suggestions as to what may be causing
this?

Lee Parker
lee@spredfast.com

Re: timestamp not found

Posted by Lee Parker <le...@socialagency.com>.
In an attempt to continue troubleshooting these errors, I took the text
from one and converted it from hex to ASCII.  Here is the original error:
Required field 'timestamp' was not found in serialized data! Struct:
Column(name:61 75 74 68 6F 72 5F 69 63 6F 6E, value:68 74 74 70 3A 2F 2F 61
31 2E 74 77 69 6D 67 2E 63 6F 6D 2F 70 72 6F 66 69 6C 65 5F 69 6D 61 67 65
73 2F 37 37 35 35 30 33 34 32 2F 61 6E 64 72 65 77 5F 72 6F 6E 64 65 80 01
00 01 00 00 00 0C 62 61 74 63 68, timestamp:0)

Here is the converted information from the "value" of the column:
http://a1.twimg.com/profile_images/77550342/andrew_ronde????????batch

It looks as if two batch_mutate calls are somehow overlapping each other.
I can't see how this could happen in my code.  Again, I am using the
PHP Thrift library.  Can anyone help me identify the problem?
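
A short Python sketch of the hex-to-ASCII conversion described above. It
decodes the "value" bytes from the error and then reads the eight
unprintable bytes as the start of a Thrift message. That second step is an
assumption about how to read the dump, not something confirmed in this
thread: in Thrift's strict binary protocol a message starts with a 32-bit
word holding the version (0x8001 in the high bytes) and the message type,
followed by a 32-bit length and the method name. The helper name
`printable` is made up for the example.

```python
import struct

# "value" bytes copied from the error message above, one piece per source line.
HEX_VALUE = (
    "68 74 74 70 3A 2F 2F 61 "
    "31 2E 74 77 69 6D 67 2E 63 6F 6D 2F 70 72 6F 66 69 6C 65 5F 69 6D 61 67 65 "
    "73 2F 37 37 35 35 30 33 34 32 2F 61 6E 64 72 65 77 5F 72 6F 6E 64 65 80 01 "
    "00 01 00 00 00 0C 62 61 74 63 68"
)

raw = bytes.fromhex(HEX_VALUE)


def printable(data):
    """Render bytes as text, replacing anything outside printable ASCII with '?'."""
    return "".join(chr(b) if 32 <= b < 127 else "?" for b in data)


print(printable(raw))

# Split at the first 0x80 0x01 pair, which is where the readable URL stops.
split_at = raw.find(bytes([0x80, 0x01]))
url_part, tail = raw[:split_at], raw[split_at:]

# Assumption: the tail is the start of a strict binary protocol message,
# i.e. 4 bytes of (version | message type), then a 4-byte big-endian name length.
word, name_len = struct.unpack(">Ii", tail[:8])
version = word & 0xFFFF0000
message_type = word & 0xFF          # 1 is CALL in Thrift's message types
name_prefix = tail[8:].decode("ascii")

print(hex(version), message_type, name_len, name_prefix)
```

Read this way, the trailing bytes say "call, method name 12 characters long,
starting with batch". "batch_mutate" is 12 characters, so the dump is
consistent with the header of a second batch_mutate call landing inside the
column value.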

Lee Parker
On Thu, Apr 15, 2010 at 3:03 PM, Lee Parker <le...@socialagency.com> wrote:

> I have done more error checking and I am relatively certain that I am
> sending a valid timestamp to the Thrift library.  I was testing a switch to
> the Framed Transport instead of Buffered Transport and I am getting fewer
> errors, but now the Cassandra server dies when this happens.  It is starting
> to feel like this is a bug in Thrift or the Cassandra Thrift interface.  Can
> anyone offer any other insight?  I'm using the current stable release of
> Thrift 0.2.0, and Cassandra 0.6.0.
>
> It seems to happen more under heavy load. I don't know if that is
> meaningful or not.
>
> Lee Parker

Re: timestamp not found

Posted by Lee Parker <le...@socialagency.com>.
I have done more error checking and I am relatively certain that I am
sending a valid timestamp to the Thrift library.  I was testing a switch to
the Framed Transport instead of Buffered Transport and I am getting fewer
errors, but now the Cassandra server dies when this happens.  It is starting
to feel like this is a bug in Thrift or the Cassandra Thrift interface.  Can
anyone offer any other insight?  I'm using the current stable release of
Thrift 0.2.0, and Cassandra 0.6.0.
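
For readers unfamiliar with the difference between the two transports, here
is a minimal Python sketch of what framing means on the wire. A framed
transport prefixes every message with its length as a 4-byte big-endian
integer; a buffered transport sends the message bytes with no such prefix.
The functions `frame` and `read_frames` are hypothetical names for this
illustration and are not part of the Thrift library.

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack(">i", len(payload)) + payload


def read_frames(stream: bytes):
    """Split a byte stream back into the messages that were framed into it."""
    frames = []
    offset = 0
    while offset < len(stream):
        if len(stream) - offset < 4:
            raise ValueError("truncated frame header")
        (size,) = struct.unpack_from(">i", stream, offset)
        offset += 4
        if size < 0 or offset + size > len(stream):
            raise ValueError("frame length does not match the remaining bytes")
        frames.append(stream[offset:offset + size])
        offset += size
    return frames


wire = frame(b"first message") + frame(b"second")
print(read_frames(wire))
```

Both ends have to agree on the transport. If only one side frames its
messages, the other side will misread the byte stream, because it will
treat length bytes as message content or the reverse.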

It seems to happen more under heavy load. I don't know if that is meaningful
or not.

Lee Parker

On Thu, Apr 15, 2010 at 11:00 AM, Lee Parker <le...@socialagency.com> wrote:

> I'm actually using PHP.  I do have several PHP processes running, but each
> one should have its own Thrift connection.
>
>
> Lee Parker
> lee@spredfast.com

Re: timestamp not found

Posted by Lee Parker <le...@socialagency.com>.
I'm actually using PHP.  I do have several PHP processes running, but each
one should have its own Thrift connection.

Lee Parker
lee@spredfast.com

On Thu, Apr 15, 2010 at 10:53 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> Looks like you are using C++ and not setting the "isset" flag on the
> timestamp field, so it's getting the default value for a Java long ("0").
>
> If it works "most of the time" then possibly you are using a Thrift
> connection from multiple threads at the same time, which is not safe.

Re: timestamp not found

Posted by Jonathan Ellis <jb...@gmail.com>.
Looks like you are using C++ and not setting the "isset" flag on the
timestamp field, so it's getting the default value for a Java long ("0").

If it works "most of the time" then possibly you are using a Thrift
connection from multiple threads at the same time, which is not safe.
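
A minimal Python sketch of the rule above, one connection per thread, using
thread-local storage. `FakeConnection` is a hypothetical stand-in for
whatever object wraps the Thrift socket, protocol and client; it is not a
Thrift class. The point is only that no two threads ever write to the same
connection, so their messages cannot interleave on one socket.

```python
import itertools
import threading


class FakeConnection:
    """Hypothetical stand-in for a Thrift socket + protocol + client."""

    _ids = itertools.count(1)
    _ids_lock = threading.Lock()

    def __init__(self):
        with FakeConnection._ids_lock:
            self.conn_id = next(FakeConnection._ids)


_local = threading.local()


def get_connection():
    """Return this thread's own connection, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = FakeConnection()
        _local.conn = conn
    return conn


def worker(results, index):
    first = get_connection()
    second = get_connection()
    # Record the connection id and whether repeated calls reuse the same object.
    results[index] = (first.conn_id, first is second)


results = [None] * 4
threads = [threading.Thread(target=worker, args=(results, i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

The alternative is to share one connection and guard every call with a
lock, which is simpler to set up but serializes all requests.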

On Thu, Apr 15, 2010 at 10:39 AM, Lee Parker <le...@socialagency.com> wrote:

> We are currently migrating about 70 GB of data from MySQL to Cassandra.  I am
> occasionally getting the following error:
>
> Required field 'timestamp' was not found in serialized data! Struct:
> Column(name:74 65 78 74, value:44 61 73 20 6C 69 65 62 20 69 63 68 20 76 6F
> 6E 20 23 49 6E 61 3A 20 68 74 74 70 3A 2F 2F 77 77 77 2E 79 6F 75 74 75 62
> 65 2E 63 6F 6D 2F 77 61 74 63 68 3F 76 3D 70 75 38 4B 54 77 79 64 56 77 6B
> 26 66 65 61 74 75 72 65 3D 72 65 6C 61 74 65 64 20 40 70 6A 80 01 00 01 00,
> timestamp:0)
>
> The loop that builds the mutation map for the batch_mutate call adds a
> timestamp to each column.  I have verified that the timestamp is present for
> several calls, and if the logic were bad I would expect to see the error
> more frequently.  Does anyone have suggestions as to what may be causing
> this?
>
> Lee Parker
> lee@spredfast.com

Re: timestamp not found

Posted by Lee Parker <le...@socialagency.com>.
When I verify the columns in the mutation map before sending it to
Cassandra, none of the timestamps are 0.  I have had a difficult time
recreating the error in a controlled environment where I can see the
mutation map that was actually sent.
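
A check of this kind could look like the Python sketch below. It assumes a
simplified mutation map shaped as row key -> column family -> list of
columns, with each column a plain dict; the real Thrift structures carry
more wrapping, and the row key "row-1" and column family "Tweets" are
made-up names. The column names "text" and "author_icon" are the ones that
appear (hex-encoded) in the errors in this thread.

```python
import time


def now_micros():
    """Current time in microseconds, a common choice for column timestamps."""
    return int(time.time() * 1_000_000)


def find_bad_timestamps(mutation_map):
    """Return (row_key, column_family, column_name) for every column whose
    timestamp is missing, not an integer, or not positive."""
    bad = []
    for row_key, by_column_family in mutation_map.items():
        for column_family, columns in by_column_family.items():
            for column in columns:
                ts = column.get("timestamp")
                is_int = isinstance(ts, int) and not isinstance(ts, bool)
                if not is_int or ts <= 0:
                    bad.append((row_key, column_family, column.get("name")))
    return bad


ts = now_micros()
good_map = {
    "row-1": {
        "Tweets": [
            {"name": "text", "value": "hello", "timestamp": ts},
            {"name": "author_icon", "value": "http://example.invalid/icon", "timestamp": ts},
        ]
    }
}
bad_map = {
    "row-1": {
        "Tweets": [
            {"name": "text", "value": "hello", "timestamp": 0},
            {"name": "author_icon", "value": "http://example.invalid/icon"},
        ]
    }
}

print(find_bad_timestamps(good_map))
print(find_bad_timestamps(bad_map))
```

Logging the result of such a check next to each batch makes it possible to
tell, after a failure, whether the map was already wrong before it was
serialized or only became wrong on the wire.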

Lee Parker
lee@spredfast.com

On Thu, Apr 15, 2010 at 10:45 AM, Mike Malone <mi...@simplegeo.com> wrote:

> Looks like the timestamp, in this case, is 0. Does Cassandra allow zero
> timestamps? Could be a bug in Cassandra doing an implicit boolean coercion
> in a conditional where it shouldn't.
>
> Mike

Re: timestamp not found

Posted by Mike Malone <mi...@simplegeo.com>.
Looks like the timestamp, in this case, is 0. Does Cassandra allow zero
timestamps? Could be a bug in Cassandra doing an implicit boolean coercion
in a conditional where it shouldn't.
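
The kind of bug hypothesized here can be illustrated in a few lines of
Python. This is only a sketch of the general pitfall and not Cassandra's or
Thrift's actual code: a truthiness test cannot tell "timestamp is 0" apart
from "timestamp was never set", while an explicit presence check can. Both
function names are made up for the example.

```python
def is_missing_by_truthiness(timestamp):
    """Buggy check: 0 is falsy, so a legitimate zero looks missing."""
    return not timestamp


def is_missing_explicit(timestamp):
    """Presence check: only an unset value (None here) counts as missing."""
    return timestamp is None


for value in (None, 0, 1271345998000000):
    print(value, is_missing_by_truthiness(value), is_missing_explicit(value))
```

With the truthiness check, both None and 0 are reported as missing; with
the explicit check, only None is.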

Mike

On Thu, Apr 15, 2010 at 8:39 AM, Lee Parker <le...@socialagency.com> wrote:

> We are currently migrating about 70 GB of data from MySQL to Cassandra.  I am
> occasionally getting the following error:
>
> Required field 'timestamp' was not found in serialized data! Struct:
> Column(name:74 65 78 74, value:44 61 73 20 6C 69 65 62 20 69 63 68 20 76 6F
> 6E 20 23 49 6E 61 3A 20 68 74 74 70 3A 2F 2F 77 77 77 2E 79 6F 75 74 75 62
> 65 2E 63 6F 6D 2F 77 61 74 63 68 3F 76 3D 70 75 38 4B 54 77 79 64 56 77 6B
> 26 66 65 61 74 75 72 65 3D 72 65 6C 61 74 65 64 20 40 70 6A 80 01 00 01 00,
> timestamp:0)
>
> The loop that builds the mutation map for the batch_mutate call adds a
> timestamp to each column.  I have verified that the timestamp is present for
> several calls, and if the logic were bad I would expect to see the error
> more frequently.  Does anyone have suggestions as to what may be causing
> this?
>
> Lee Parker
> lee@spredfast.com