Posted to user@cassandra.apache.org by Jone Lura <jo...@ecc.no> on 2010/08/18 20:31:44 UTC

Help with getting Key range with some column limitations

Hi,

We are trying to use Cassandra to replace one of our biggest SQL tables, and so far we have it working.

For testing I'm using Cassandra 0.6.2, Java and Pelops (Pelops is not that important for my question), and I need suggestions on how to retrieve a key range based on the following.

<Keyspace Name="AIS">
    <ColumnFamily Name="Location"
                  ColumnType="Super"
                  CompareWith="LongType"
                  KeysCached="100%"
                  CompareSubcolumnsWith="UTF8Type" />
    ...
</Keyspace>

Each super column has columns for longitude and latitude.

 1. I need to get the max long number for each key.
 2. The key should also have super columns whose latitude and longitude columns fall inside a given bounding box.

Currently I'm doing it like this:


        KeyRange keyRange = new KeyRange();
        keyRange.setStart_key("");
        keyRange.setEnd_key("");
        keyRange.setCount(700);

I then check every row in the database to see whether it matches my bounding box.

But there are a lot more than 700 keys, and if I set a higher count, get_range_slice fails with a timeout exception.
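
For readers following along, the per-row check described above can be sketched as a simple containment test. This is only an illustration: the BoundingBox class and its field names are hypothetical (they are not part of Cassandra or Pelops), and it assumes coordinates in decimal degrees and a box that does not cross the 180-degree meridian.

```java
/** Hypothetical helper for illustration; not part of Cassandra or Pelops. */
final class BoundingBox {
    final double minLat;
    final double minLon;
    final double maxLat;
    final double maxLon;

    BoundingBox(double minLat, double minLon, double maxLat, double maxLon) {
        if (minLat > maxLat || minLon > maxLon) {
            throw new IllegalArgumentException("min corner must not exceed max corner");
        }
        this.minLat = minLat;
        this.minLon = minLon;
        this.maxLat = maxLat;
        this.maxLon = maxLon;
    }

    /** True if the point lies inside the box or exactly on its edge. */
    boolean contains(double lat, double lon) {
        return lat >= minLat && lat <= maxLat
            && lon >= minLon && lon <= maxLon;
    }
}
```

Each row's latitude and longitude would be parsed from the column values and passed to contains(); rows for which it returns false are discarded on the client.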

Any ideas?

Best Regards
Jone

Re: Help with getting Key range with some column limitations

Posted by Chen Xinli <ch...@gmail.com>.
Hi,

If the read latency is tolerable, you can fetch 700 rows at a time and use
the last key returned by one iteration as the start key of the next, until
you have retrieved all the data.
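
A minimal sketch of that paging loop, under stated assumptions: KeyPager and fetchPage are hypothetical names, and fetchPage only stands in for a real range query by reading from an in-memory sorted set. The point it illustrates is that the start key is inclusive, so every page after the first repeats one key, which has to be skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;

/** Hypothetical paging sketch; fetchPage stands in for a real range query. */
final class KeyPager {
    private final NavigableSet<String> storedKeys; // stand-in for the rows in the cluster

    KeyPager(NavigableSet<String> storedKeys) {
        this.storedKeys = storedKeys;
    }

    /** Returns up to count keys, starting at startKey (inclusive). */
    List<String> fetchPage(String startKey, int count) {
        List<String> page = new ArrayList<>();
        for (String key : storedKeys.tailSet(startKey, true)) {
            if (page.size() == count) {
                break;
            }
            page.add(key);
        }
        return page;
    }

    /** Collects all keys page by page, reusing the last key of a page as the next start key. */
    List<String> fetchAll(int pageSize) {
        if (pageSize < 2) {
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> all = new ArrayList<>();
        String start = "";
        boolean firstPage = true;
        while (true) {
            List<String> page = fetchPage(start, pageSize);
            // Skip the first key of every later page: it was already collected.
            int from = firstPage ? 0 : Math.min(1, page.size());
            all.addAll(page.subList(from, page.size()));
            if (page.size() < pageSize) {
                break; // a short page means there is nothing left
            }
            start = page.get(page.size() - 1);
            firstPage = false;
        }
        return all;
    }
}
```

The stand-in returns keys in sorted order. A real cluster returns them in the partitioner's order, which this sketch does not model, but the loop has the same shape.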

Or you can implement a Cassandra plugin that filters columns and returns
only the data you want.
The computation is then done locally on the Cassandra machine, and should be
fast enough to avoid the timeout exception.


-- 
Best Regards,
Chen Xinli

Re: SV: SV: SV: Help with getting Key range with some column limitations

Posted by "thelastpickle.com" <aa...@thelastpickle.com>.
If you're doing geo stuff, you may want to take a look at the geo extension for CouchDB:
http://github.com/vmx/couchdb

It sounds like it may give you many of the features you're thinking about out of the box.

Aaron


Re: SV: SV: SV: Help with getting Key range with some column limitations

Posted by Jone Lura <jo...@ecc.no>.
Thank you for your effort.

I'm pretty sure I will make it work.

Have a nice weekend!



SV: SV: SV: Help with getting Key range with some column limitations

Posted by Thorvaldsson Justus <ju...@svenskaspel.se>.
If you only want to check the last 5 minutes, make time part of your key
and use a custom sort that sorts by time. Remember that sorting happens when the data is inserted. http://www.sodeso.nl/?p=421
Or make a range check that understands the time limit; off the top of my head, I think that should work.

But you don't want a lot of small rows, and you don't want rows that are too fat either, so
perhaps there is some time interval that could be used as the row.
If the super columns hold long & lat and there aren't too many of them, you could perhaps find a way to do a range check on one of them.
/J
There are so many ways to model this that you probably want to build several and test them.
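
One way to read "make time part of your key" is one row per fixed time window. The sketch below assumes that reading; the TimeBuckets class, the "loc-" key prefix and the 5-minute bucket size are all made up for illustration, and timestamps are assumed to be non-negative milliseconds.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical row-key scheme: one row per fixed time window. */
final class TimeBuckets {
    /** Assumed bucket size: 5 minutes, in milliseconds. */
    static final long BUCKET_MILLIS = 5 * 60 * 1000L;

    /** Row key of the bucket that contains the given timestamp. */
    static String rowKey(long timestampMillis) {
        return "loc-" + (timestampMillis / BUCKET_MILLIS);
    }

    /** Row keys of every bucket that overlaps [now - window, now]. */
    static List<String> rowKeysForWindow(long nowMillis, long windowMillis) {
        List<String> keys = new ArrayList<>();
        long firstBucket = (nowMillis - windowMillis) / BUCKET_MILLIS;
        long lastBucket = nowMillis / BUCKET_MILLIS;
        for (long bucket = firstBucket; bucket <= lastBucket; bucket++) {
            keys.add("loc-" + bucket);
        }
        return keys;
    }
}
```

With this layout a "last 5 minutes" query reads at most two known rows by key instead of scanning a key range, and then drops columns older than the cutoff on the client. Whether the rows end up too small or too fat depends on the write rate, so the bucket size would need testing.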



Re: SV: SV: Help with getting Key range with some column limitations

Posted by Jone Lura <jo...@ecc.no>.
Thanks! I have read your blog a few times, but it's hard to get rid of SQL
thinking.

So if I create a new standard ColumnFamily with a rowId and geohash a
lat/lon into a UTF8Type, I could geohash the bounding box and query for
all matching columns. Or do I always need to know the rowId to do a
slice range? I also need to get only the columns that were modified within
the last 5 minutes.
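
For reference, a minimal sketch of the standard geohash encoding mentioned above. The GeohashSketch class name is hypothetical and nothing here is a Cassandra API; it only shows how a lat/lon pair becomes a string that could be stored as a UTF8Type column name.

```java
/** Hypothetical sketch of standard geohash encoding; not a Cassandra API. */
final class GeohashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a point as a geohash with the given number of characters. */
    static String encode(double lat, double lon, int precision) {
        double latLo = -90.0, latHi = 90.0;
        double lonLo = -180.0, lonHi = 180.0;
        StringBuilder hash = new StringBuilder();
        boolean longitudeBit = true; // bits alternate, starting with longitude
        int bitCount = 0;
        int value = 0;
        while (hash.length() < precision) {
            if (longitudeBit) {
                double mid = (lonLo + lonHi) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonLo = mid; }
                else            { value = value << 1;       lonHi = mid; }
            } else {
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latLo = mid; }
                else            { value = value << 1;       latHi = mid; }
            }
            longitudeBit = !longitudeBit;
            if (++bitCount == 5) { // every 5 bits become one base-32 character
                hash.append(BASE32.charAt(value));
                bitCount = 0;
                value = 0;
            }
        }
        return hash.toString();
    }
}
```

For example, encode(57.64911, 10.40744, 5) returns "u4pru". Points that share a prefix lie in the same cell, so column names built this way sort close together under UTF8Type. A bounding box can still span several cells, so more than one prefix may be needed to cover it.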

Jone




Re: SV: SV: Help with getting Key range with some column limitations

Posted by Mark <st...@gmail.com>.
Regarding Cassandra #3:

"Increasing the memtable thresholds so that you create less sstables, 
but larger ones, is also a good idea. The defaults are small so 
Cassandra can work on a 1GB heap which is much smaller than most 
production ones. Reasonable rule of thumb: if you have a heap of N GB, 
increase both the throughput and count thresholds by N times."

What throughput and count thresholds is he referring to? There are 
multiple throughput options and I am not sure what the count threshold is.

binary_memtable_throughput_in_mb
memtable_throughput_in_mb
memtable_operations_in_millions (count?)


SV: SV: Help with getting Key range with some column limitations

Posted by Thorvaldsson Justus <ju...@svenskaspel.se>.
I think you should try to do it some other way than iterating; that sounds very suboptimal to me. Also, I think the plugin option he was suggesting means changing the Cassandra source code, which is hard while Cassandra is changing so fast, but very possible. I think you should look at http://blip.tv/file/4015273 and perhaps my blog post on the same subject at www.Justus.st ("Cassandra post 4"), which has more on the data model.

Example code in Java using a start and end key. On the next iteration the start key should be the last key you collected; the details depend on how you built your model.
// KeyRange selects rows by key: you specify the start key, the end key and how many rows to return
KeyRange keyRange = new KeyRange(700);
keyRange.setStart_key(rowId);
keyRange.setEnd_key(rowId);

// SliceRange selects which super columns to return; an empty start and finish means all of them
SliceRange sliceRange = new SliceRange();
sliceRange.setStart(new byte[] {});
sliceRange.setFinish(new byte[] {});
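
As an aside on the "max long number for key" requirement: a slice read in reverse order and limited to one entry returns the highest super column name in a row. The sketch below models that idea on an in-memory sorted map. SliceSketch and slice are hypothetical names, and the sorted map only stands in for a row whose super columns are ordered by LongType; the mapping onto the reversed and count fields of SliceRange is an assumption to verify against your Thrift version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

/** Hypothetical in-memory model of slicing one row's super columns. */
final class SliceSketch {
    /**
     * Returns up to count super column names from the row,
     * in descending order when reversed is true.
     */
    static <V> List<Long> slice(NavigableMap<Long, V> row, boolean reversed, int count) {
        NavigableMap<Long, V> ordered = reversed ? row.descendingMap() : row;
        List<Long> names = new ArrayList<>();
        for (Long name : ordered.keySet()) {
            if (names.size() == count) {
                break;
            }
            names.add(name);
        }
        return names;
    }
}
```

Calling slice(row, true, 1) yields a one-element list holding the largest name, so only that super column's latitude and longitude need to be compared against the bounding box.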

/J



Re: SV: Help with getting Key range with some column limitations

Posted by Jone Lura <jo...@ecc.no>.
Thanks for your suggestions.

I tried to iterate them, but I could not get it to work (pretty sure
it's my code). I'm still not too familiar with Cassandra, so could you
provide a small example?

The key count could be at least 20k and maybe more, and users
should not wait more than 10 seconds for their map, so I also want
to investigate the plugin suggestion. Does the plugin exist, or do I
have to develop it myself? Is there any documentation on plugin
development for Cassandra?

Best regards

Jone




SV: Help with getting Key range with some column limitations

Posted by Thorvaldsson Justus <ju...@svenskaspel.se>.
You should iterate through them: get 200, then get the next 200, and so on.
Also, if you are checking one bounding box against another, perhaps try sorting them so you can start looking from both ends, and maybe shrink the iteration until you find a match somehow?
Just my two cents. You will also probably need to upgrade to iterate with the RandomPartitioner (RP) because of bugs, but upgrading to 0.6.4 should be simple enough.
/Justus
