Posted to user@cassandra.apache.org by Lee Parker <le...@socialagency.com> on 2010/04/12 17:02:52 UTC

frequent "unknown result" errors

I am a newbie with Cassandra.  We are currently migrating a large amount of
data out of MySQL into Cassandra.  I have two ColumnFamilies.  One contains
one row per item and each item has roughly 12 columns.  These are items from
REST APIs like the Twitter API.  Then I have a second ColumnFamily with very
large rows and TimeUUID column names which contain the key of the items in
the other ColumnFamily.  So one ColumnFamily has lots of rows with a low
number of columns per row, and the other has relatively few rows with a
large number (~500k) of columns per row.

I am getting rather frequent "unknown result" errors from get_slice and
multiget_slice calls against the index ColumnFamily.  I am using Pandra for the
calls.  I can see that this is a generic exception thrown by the Cassandra
Thrift package when it doesn't know what else to say.  Is there a way to
actually see what the result was in a more raw form from the Thrift
protocol?

One thought I had on why this is happening is that my results might be
larger than the configuration settings.  Does anyone have any good ideas on
how to calculate what the ideal values of SlicedBufferSizeInKB
and ColumnIndexSizeInKB should be?  If these are too low, would I get a more
descriptive error?

Lee Parker
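
For context on why the error is so terse: clients generated by Thrift
usually unpack a reply by checking a "success" field, then each declared
exception field, and fall through to a generic "unknown result" error when
none of them is set.  Below is a toy Python model of that pattern.  It is not
the Pandra or Cassandra code; SliceResult, recv_get_slice and
UnknownResultError are made-up names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


class UnknownResultError(Exception):
    """Stands in for the generic exception a generated client raises."""


@dataclass
class SliceResult:
    # Shaped like a generated *_result struct: every field is optional.
    success: Optional[List[str]] = None     # set when the call returned a value
    not_found: Optional[Exception] = None   # set when a declared exception came back


def recv_get_slice(result: SliceResult) -> List[str]:
    """Unpack a decoded reply the way a generated recv_* method typically does."""
    if result.success is not None:
        return result.success
    if result.not_found is not None:
        raise result.not_found
    # Nothing the client knows about was populated, so there is no detail
    # left to report beyond "unknown result".
    raise UnknownResultError("get_slice failed: unknown result")
```

In this model, a reply whose fields the client cannot map onto its own
struct (for example, because client and server were generated from different
interface versions) ends up in the last branch.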

Re: frequent "unknown result" errors

Posted by Michael Pearson <mj...@gmail.com>.
Lee, I dropped (official) 0.5 support from Pandra yesterday and
committed 0.6 Thrift files, if you're still considering that
upgrade... worth a shot imo.

-michael

-- 
http://www.x0rz.com
http://www.linkedin.com/in/mjpearson

Re: frequent "unknown result" errors

Posted by Lee Parker <le...@socialagency.com>.
So, it didn't get rid of the problem; I'm still getting the errors.  The
only thing I can think of now is to upgrade to 0.6, but I would prefer to
stay with the current stable release.  I have regenerated the Thrift code
for 0.5.0 and there is no difference between those files and the ones I'm
using in my software now.  Are there any other suggestions?  What code would
be helpful to see?

Lee
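
For anyone repeating the "regenerate and compare" check, here is a rough
Python sketch that lists every file that differs between two directories of
generated code.  The function name differing_files and the two directory
arguments are assumptions for illustration; the thread does not say how the
comparison was actually done.

```python
import filecmp
from pathlib import Path
from typing import List


def differing_files(bundled_dir: str, regenerated_dir: str) -> List[str]:
    """Return relative paths that differ or exist on only one side."""
    left, right = Path(bundled_dir), Path(regenerated_dir)
    left_files = {p.relative_to(left) for p in left.rglob("*") if p.is_file()}
    right_files = {p.relative_to(right) for p in right.rglob("*") if p.is_file()}

    # Files present on only one side count as a mismatch.
    problems = {str(p) for p in left_files ^ right_files}

    # For files present on both sides, compare the actual bytes
    # (shallow=False) rather than just size and modification time.
    for rel in left_files & right_files:
        if not filecmp.cmp(left / rel, right / rel, shallow=False):
            problems.add(str(rel))
    return sorted(problems)
```

An empty list means the two trees are byte-for-byte identical, which matches
the "no difference" result described above.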

Re: frequent "unknown result" errors

Posted by Keith Thornhill <ke...@raptr.com>.
I also noticed "unknown result" errors when my PHP Thrift code was
generated using a different version of Thrift than Cassandra uses.

After regenerating my PHP code from thrift-r917130 (for
cassandra-0.6.0-rc1), the errors stopped.

-keith

Re: frequent "unknown result" errors

Posted by vineet daniel <vi...@gmail.com>.
Can you post the code?

Re: frequent "unknown result" errors

Posted by Lee Parker <le...@socialagency.com>.
According to his docs, you need Cassandra >= 0.5.0.  I guess it is
possible that the included Thrift files are targeted at 0.6, but I don't see
the "batch_mutate" method which is part of 0.6.  So I'm assuming that it
should work fine with 0.5.0.

I have now changed some of those entries in the configs and I have not seen
the error in a while.  So, it may have simply been that I was trying to do a
query which was too large for the configured buffer to handle.

For the time being, I would like to stick with 0.5 as it is the "stable"
release and we are running this in a production environment.

Lee Parker
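
If oversized queries are part of the problem, one possible workaround (not
something suggested in this thread) is to read a wide row in fixed-size pages
instead of one large slice.  The Python sketch below assumes a hypothetical
fetch_page(start, count) callable that returns up to count column names in
order, beginning at start inclusive, or from the beginning of the row when
start is None.  It stands in for a real get_slice call and is not Pandra's
API.

```python
from typing import Callable, Iterator, List, Optional

# Hypothetical stand-in for a slice query against one wide row.
FetchPage = Callable[[Optional[int], int], List[int]]


def iter_columns(fetch_page: FetchPage, page_size: int = 1000) -> Iterator[int]:
    """Yield every column name in a row, reading page_size columns at a time."""
    if page_size < 2:
        # Each later page re-reads one column, so a page of 1 cannot advance.
        raise ValueError("page_size must be at least 2")
    start: Optional[int] = None
    while True:
        page = fetch_page(start, page_size)
        if start is not None:
            # The range is assumed inclusive, so the first column of this
            # page repeats the last column of the previous page.
            page = page[1:]
        if not page:
            return
        yield from page
        start = page[-1]
```

Each request then stays small no matter how many columns the row holds, at
the cost of more round trips.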

Re: frequent "unknown result" errors

Posted by Jonathan Ellis <jb...@gmail.com>.
Pandra is probably targeting 0.6.

If you're just starting, there's no reason for you not to use 0.6 over 0.5 now.


Re: frequent "unknown result" errors

Posted by Lee Parker <le...@socialagency.com>.
I'm using the Thrift client which is packaged with Pandra, and my Cassandra
version is 0.5.0, which is in the Debian packages.  How can I tell which
version of Thrift I'm using?

Lee

On Mon, Apr 12, 2010 at 10:30 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> Then you're probably using a client incompatible with the server
> version you're using.
>
> On Mon, Apr 12, 2010 at 10:24 AM, Lee Parker <le...@socialagency.com> wrote:
> > If the connections are being made by individual PHP processes running from
> > the command line, they shouldn't be using the same connection.  Should my
> > code close the connections after each query and open a new one?
> > Here is the flow of what is happening when we get the error:
> > 1. Get a set of items from remote API
> > 2. Insert all of the items into the items CF. (usually anywhere from 2 - 200
> > items)
> > 3. Query the correct index for all entries within a particular time frame
> > (which is determined by the timeframe of the results of step 1)
> > 4. Compare keys in index to keys of items inserted in step 2.
> > 5. Insert new index columns for items which aren't already in the index.
> > I am getting the "unknown result" error during step 3.
> > Lee
> >
> > On Mon, Apr 12, 2010 at 10:05 AM, Jonathan Ellis <jb...@gmail.com> wrote:
> >>
> >> unknown result means thrift is badly confused.  You will get this when
> >> using the same thrift connection from multiple threads, for instance.
> >>
> >> On Mon, Apr 12, 2010 at 10:02 AM, Lee Parker <le...@socialagency.com> wrote:
> >> > I am a newbie with Cassandra.  We are currently migrating a large amount of
> >> > data out of MySQL into Cassandra.  I have two ColumnFamilies.  One contains
> >> > one row per item and each item has roughly 12 columns.  These are items from
> >> > REST APIs like the Twitter API.  Then I have a second ColumnFamily with very
> >> > large rows and TimeUUID column names which contain the key of the items in
> >> > the other ColumnFamily.  So one ColumnFamily has lots of rows with a low
> >> > number of columns per row, and the other has relatively few rows with a
> >> > large (~500k) columns per row.
> >> > I am getting rather frequent errors with "unknown result" from get_slice and
> >> > multiget_slice calls from the index ColumnFamily.  I am using Pandra for the
> >> > calls.  I can see that this is a generic exception thrown by the Cassandra
> >> > Thrift package when it doesn't know what else to say.  Is there a way to
> >> > actually see what the result was in a more raw form from the Thrift
> >> > protocol?
> >> > One thought I had on why this is happening is that my results might be
> >> > larger than the configuration settings.  Does anyone have any good ideas on
> >> > how to calculate what the ideal values of SlicedBufferSizeInKB
> >> > and ColumnIndexSizeInKB should be?  If these are too low, would i get a more
> >> > descriptive error?
> >> > Lee Parker
> >
> >
>

Re: frequent "unknown result" errors

Posted by Jonathan Ellis <jb...@gmail.com>.
Then you're probably using a client incompatible with the server
version you're using.
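
A rough way to sanity-check this, as a sketch only: compare the Cassandra
release the server is running with the release the client's Thrift files
were generated for.  The helper names below are hypothetical, and the rule
that the major and minor components must match is an assumption made for
illustration, not something the Thrift interface itself enforces.

```python
def release_line(version):
    """Return the (major, minor) pair of a dotted version string.

    Only the first two numeric components are used, so "0.5.0" and
    "0.5.1" both map to (0, 5).
    """
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def same_release_line(server_version, client_version):
    """True when both versions share the same major and minor component.

    Assumption: client bindings generated for one release line are only
    expected to work against a server on that same release line.
    """
    return release_line(server_version) == release_line(client_version)


if __name__ == "__main__":
    # Example values: a 0.5.0 server against bindings built for 0.6.
    print(same_release_line("0.5.0", "0.6"))
```

If the two differ, regenerating the client bindings from the interface
file shipped with the server release is the usual first thing to try.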

On Mon, Apr 12, 2010 at 10:24 AM, Lee Parker <le...@socialagency.com> wrote:
> If the connections are being made by individual PHP processes running from
> the command line, they shouldn't be using the same connection.  Should my
> code close the connections after each query and open a new one?
> Here is the flow of what is happening when we get the error:
> 1. Get a set of items from remote API
> 2. Insert all of the items into the items CF. (usually anywhere from 2 - 200
> items)
> 3. Query the correct index for all entries within a particular time frame
> (which is determined by the timeframe of the results of step 1)
> 4. Compare keys in index to keys of items inserted in step 2.
> 5. Insert new index columns for items which aren't already in the index.
> I am getting the "unknown result" error during step 3.
> Lee
>
> On Mon, Apr 12, 2010 at 10:05 AM, Jonathan Ellis <jb...@gmail.com> wrote:
>>
>> unknown result means thrift is badly confused.  You will get this when
>> using the same thrift connection from multiple threads, for instance.
>>
>> On Mon, Apr 12, 2010 at 10:02 AM, Lee Parker <le...@socialagency.com> wrote:
>> > I am a newbie with Cassandra.  We are currently migrating a large amount of
>> > data out of MySQL into Cassandra.  I have two ColumnFamilies.  One contains
>> > one row per item and each item has roughly 12 columns.  These are items from
>> > REST APIs like the Twitter API.  Then I have a second ColumnFamily with very
>> > large rows and TimeUUID column names which contain the key of the items in
>> > the other ColumnFamily.  So one ColumnFamily has lots of rows with a low
>> > number of columns per row, and the other has relatively few rows with a
>> > large (~500k) columns per row.
>> > I am getting rather frequent errors with "unknown result" from get_slice and
>> > multiget_slice calls from the index ColumnFamily.  I am using Pandra for the
>> > calls.  I can see that this is a generic exception thrown by the Cassandra
>> > Thrift package when it doesn't know what else to say.  Is there a way to
>> > actually see what the result was in a more raw form from the Thrift
>> > protocol?
>> > One thought I had on why this is happening is that my results might be
>> > larger than the configuration settings.  Does anyone have any good ideas on
>> > how to calculate what the ideal values of SlicedBufferSizeInKB
>> > and ColumnIndexSizeInKB should be?  If these are too low, would i get a more
>> > descriptive error?
>> > Lee Parker
>
>

Re: frequent "unknown result" errors

Posted by Lee Parker <le...@socialagency.com>.
If the connections are being made by individual PHP processes running from
the command line, they shouldn't be using the same connection.  Should my
code close the connections after each query and open a new one?

Here is the flow of what is happening when we get the error:
1. Get a set of items from the remote API.
2. Insert all of the items into the items CF (usually anywhere from 2 to 200
items).
3. Query the correct index for all entries within a particular time frame
(which is determined by the time frame of the results of step 1).
4. Compare the keys in the index to the keys of the items inserted in step 2.
5. Insert new index columns for items which aren't already in the index.

I am getting the "unknown result" error during step 3.
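
For reference, steps 2 through 5 look roughly like the sketch below.  This
is not the actual PHP/Pandra code: plain Python containers stand in for the
two ColumnFamilies, integer timestamps stand in for the TimeUUID column
names, and the function and parameter names are made up for illustration.

```python
def sync_index(items, items_cf, index_row):
    """Run steps 2-5 for one batch and return the keys newly indexed.

    items:     list of (key, timestamp, columns) tuples from the remote API
    items_cf:  dict standing in for the items ColumnFamily (key -> columns)
    index_row: list of (timestamp, key) pairs standing in for one wide
               index row (the real row uses TimeUUID column names)
    """
    if not items:
        return []

    # Step 2: insert every fetched item into the items ColumnFamily.
    for key, _, columns in items:
        items_cf[key] = columns

    # Step 3: slice the index over the time frame covered by the batch.
    start = min(ts for _, ts, _ in items)
    end = max(ts for _, ts, _ in items)
    in_range = [(ts, key) for ts, key in index_row if start <= ts <= end]

    # Step 4: compare the indexed keys with the keys just inserted.
    indexed_keys = {key for _, key in in_range}

    # Step 5: add index columns only for items not already indexed.
    added = []
    for key, ts, _ in items:
        if key not in indexed_keys:
            index_row.append((ts, key))
            indexed_keys.add(key)
            added.append(key)
    index_row.sort()
    return added


if __name__ == "__main__":
    items_cf = {}
    index_row = [(50, "z"), (100, "a")]
    batch = [("a", 100, {"text": "x"}), ("b", 110, {"text": "y"})]
    print(sync_index(batch, items_cf, index_row))
```

In the real system step 3 is the get_slice / multiget_slice call against
the index ColumnFamily, which is where the exception is raised.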

Lee

On Mon, Apr 12, 2010 at 10:05 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> unknown result means thrift is badly confused.  You will get this when
> using the same thrift connection from multiple threads, for instance.
>
> On Mon, Apr 12, 2010 at 10:02 AM, Lee Parker <le...@socialagency.com> wrote:
> > I am a newbie with Cassandra.  We are currently migrating a large amount of
> > data out of MySQL into Cassandra.  I have two ColumnFamilies.  One contains
> > one row per item and each item has roughly 12 columns.  These are items from
> > REST APIs like the Twitter API.  Then I have a second ColumnFamily with very
> > large rows and TimeUUID column names which contain the key of the items in
> > the other ColumnFamily.  So one ColumnFamily has lots of rows with a low
> > number of columns per row, and the other has relatively few rows with a
> > large (~500k) columns per row.
> > I am getting rather frequent errors with "unknown result" from get_slice and
> > multiget_slice calls from the index ColumnFamily.  I am using Pandra for the
> > calls.  I can see that this is a generic exception thrown by the Cassandra
> > Thrift package when it doesn't know what else to say.  Is there a way to
> > actually see what the result was in a more raw form from the Thrift
> > protocol?
> > One thought I had on why this is happening is that my results might be
> > larger than the configuration settings.  Does anyone have any good ideas on
> > how to calculate what the ideal values of SlicedBufferSizeInKB
> > and ColumnIndexSizeInKB should be?  If these are too low, would i get a more
> > descriptive error?
> > Lee Parker
>

Re: frequent "unknown result" errors

Posted by Jonathan Ellis <jb...@gmail.com>.
"Unknown result" means Thrift is badly confused.  You will get this when
using the same Thrift connection from multiple threads, for instance.
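
To illustrate the one-connection-per-thread idea, here is a minimal sketch.
It does not use the real Thrift library: FakeConnection and
PerThreadConnections are hypothetical stand-ins, and the only point is that
each thread opens and keeps its own connection rather than sharing one.

```python
import threading


class FakeConnection:
    """Hypothetical stand-in for a real Thrift socket and client pair."""

    def __init__(self, host, port):
        self.host = host
        self.port = port


class PerThreadConnections:
    """Hands each thread its own connection instead of sharing one."""

    def __init__(self, host, port, factory=FakeConnection):
        self._host = host
        self._port = port
        self._factory = factory
        self._local = threading.local()

    def get(self):
        conn = getattr(self._local, "conn", None)
        if conn is None:
            # First use in this thread: open a private connection.
            conn = self._factory(self._host, self._port)
            self._local.conn = conn
        return conn


def connections_seen_by_threads(pool, n_threads):
    """Run n_threads workers; return the two connections each one got."""
    pairs = []
    lock = threading.Lock()

    def worker():
        first = pool.get()
        second = pool.get()  # same thread, so the same object comes back
        with lock:
            pairs.append((first, second))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pairs


if __name__ == "__main__":
    # 9160 is the default Cassandra Thrift port; the host is an example.
    pool = PerThreadConnections("localhost", 9160)
    pairs = connections_seen_by_threads(pool, 4)
    print(len({id(first) for first, _ in pairs}))
```

Separate command-line processes already get this isolation for free, since
each process opens its own socket.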

On Mon, Apr 12, 2010 at 10:02 AM, Lee Parker <le...@socialagency.com> wrote:
> I am a newbie with Cassandra.  We are currently migrating a large amount of
> data out of MySQL into Cassandra.  I have two ColumnFamilies.  One contains
> one row per item and each item has roughly 12 columns.  These are items from
> REST APIs like the Twitter API.  Then I have a second ColumnFamily with very
> large rows and TimeUUID column names which contain the key of the items in
> the other ColumnFamily.  So one ColumnFamily has lots of rows with a low
> number of columns per row, and the other has relatively few rows with a
> large (~500k) columns per row.
> I am getting rather frequent errors with "unknown result" from get_slice and
> multiget_slice calls from the index ColumnFamily.  I am using Pandra for the
> calls.  I can see that this is a generic exception thrown by the Cassandra
> Thrift package when it doesn't know what else to say.  Is there a way to
> actually see what the result was in a more raw form from the Thrift
> protocol?
> One thought I had on why this is happening is that my results might be
> larger than the configuration settings.  Does anyone have any good ideas on
> how to calculate what the ideal values of SlicedBufferSizeInKB
> and ColumnIndexSizeInKB should be?  If these are too low, would i get a more
> descriptive error?
> Lee Parker