
Posted to user@cassandra.apache.org by Kasun Weranga <ka...@wso2.com> on 2013/02/28 23:49:39 UTC

Query data in a CF within a timestamp range

Hi all,

I have a column family with some data + timestamp values and I want to
query the column family to fetch data within a timestamp range. AFAIK it is
not a good idea to use a secondary index on a timestamp due to its high
cardinality.

Is there a way to achieve this functionality?

Thanks,
Kasun.

Re: backing up and restoring from only 1 replica?

Posted by aaron morton <aa...@thelastpickle.com>.

Hinted Handoff works well. But it's an optimisation with safety valves, configuration limits and throttling, which means it is still not considered a way to ensure on-disk consistency.

In general, if a node restarts or drops mutations, HH should get the message there eventually. In specific cases it may not.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/03/2013, at 10:40 AM, Mike Koh <de...@gmail.com> wrote:

> Thanks for the response.  Could you elaborate more on the bad things that happen during a restart or message drops that would cause a 1 replica restore to fail?  I'm completely on board with not using a restore process that nobody else uses, but I need to convince somebody else who thinks that it will work that it is not a good idea.
> 
> 
> On 3/4/2013 7:54 AM, aaron morton wrote:
>> That would be OK only if you never had a node go down (e.g. a restart) or drop messages.
>> 
>> It's not something I would consider trying.
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> New Zealand
>> 
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 28/02/2013, at 3:21 PM, Mike Koh <de...@gmail.com> wrote:
>> 
>>> It has been suggested to me that we could save a fair amount of time and money by taking a snapshot of only 1 replica (so every third node for most column families).  Assuming that we are okay with not having the absolute latest data, does this have any possibility of working?  I feel like it shouldn't but don't really know the argument for why it wouldn't.
>

Re: backing up and restoring from only 1 replica?

Posted by Mike Koh <de...@gmail.com>.

Thanks for the response.  Could you elaborate more on the bad things 
that happen during a restart or message drops that would cause a 1 
replica restore to fail?  I'm completely on board with not using a 
restore process that nobody else uses, but I need to convince somebody 
else who thinks that it will work that it is not a good idea.

On 3/4/2013 7:54 AM, aaron morton wrote:
> That would be OK only if you never had a node go down (e.g. a restart) or drop messages.
>
> It's not something I would consider trying.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 28/02/2013, at 3:21 PM, Mike Koh <de...@gmail.com> wrote:
>
>> It has been suggested to me that we could save a fair amount of time and money by taking a snapshot of only 1 replica (so every third node for most column families).  Assuming that we are okay with not having the absolute latest data, does this have any possibility of working?  I feel like it shouldn't but don't really know the argument for why it wouldn't.

Re: backing up and restoring from only 1 replica?

Posted by aaron morton <aa...@thelastpickle.com>.

That would be OK only if you never had a node go down (e.g. a restart) or drop messages.

It's not something I would consider trying.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 28/02/2013, at 3:21 PM, Mike Koh <de...@gmail.com> wrote:

> It has been suggested to me that we could save a fair amount of time and money by taking a snapshot of only 1 replica (so every third node for most column families).  Assuming that we are okay with not having the absolute latest data, does this have any possibility of working?  I feel like it shouldn't but don't really know the argument for why it wouldn't.

backing up and restoring from only 1 replica?

Posted by Mike Koh <de...@gmail.com>.

It has been suggested to me that we could save a fair amount of time and 
money by taking a snapshot of only 1 replica (so every third node for 
most column families).  Assuming that we are okay with not having the 
absolute latest data, does this have any possibility of working?  I feel 
like it shouldn't but don't really know the argument for why it wouldn't.

Re: Query data in a CF within a timestamp range

Posted by Edward Capriolo <ed...@gmail.com>.

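Pseudo code:

// Java: build a minute-granularity bucket key from the current time.
// HH is the 24-hour field; hh would make AM and PM keys collide.
GregorianCalendar gc = new GregorianCalendar();
DateFormat df = new SimpleDateFormat("yyyyMMddHHmm");
String reversekey = df.format(gc.getTime());

// cassandra-cli: write the data, then index the row key under the time bucket.
set mycolumnfamily['myrow']['mycolumn'] = 'myvalue';
set myreverseindex['$reversekey']['myrow'] = '';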

Under rapid insertion this creates hot-spots, because every write in the
same minute lands on the same index row. There is no easy way around that
other than sharding the reverse index.
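
One way to picture sharding the reverse index is to append a small shard
number to each time bucket key, so that writes for the same minute are spread
over several index rows instead of one. The sketch below is only an
illustration under assumptions: the class name ReverseIndexKeys, the shard
count of 4, the "bucket:shard" key layout and the use of UTC are all
hypothetical and not part of the original suggestion. A range query then has
to read every shard of every minute bucket in the range.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.TimeZone;

final class ReverseIndexKeys {
    // Hypothetical shard count: more shards spread writes further,
    // but every range read has to visit more index rows.
    static final int SHARDS = 4;
    static final long MINUTE_MS = 60_000L;

    // SimpleDateFormat is not thread-safe, so build a fresh one per call.
    private static SimpleDateFormat minuteFormat() {
        SimpleDateFormat df = new SimpleDateFormat("yyyyMMddHHmm");
        df.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: keys are in UTC
        return df;
    }

    // Minute bucket for a timestamp (the "$reversekey" part of the index row key).
    static String bucket(long epochMillis) {
        return minuteFormat().format(new Date(epochMillis));
    }

    // Index row key to write to: minute bucket plus a shard derived from the data row key.
    static String indexKeyForWrite(long epochMillis, String dataRowKey) {
        int shard = Math.floorMod(dataRowKey.hashCode(), SHARDS);
        return bucket(epochMillis) + ":" + shard;
    }

    // All index row keys to read for the inclusive range [startMillis, endMillis].
    static List<String> indexKeysForRange(long startMillis, long endMillis) {
        List<String> keys = new ArrayList<>();
        // Align the start down to the beginning of its minute.
        long first = startMillis - Math.floorMod(startMillis, MINUTE_MS);
        for (long t = first; t <= endMillis; t += MINUTE_MS) {
            String b = bucket(t);
            for (int s = 0; s < SHARDS; s++) {
                keys.add(b + ":" + s);
            }
        }
        return keys;
    }
}
```

The trade-off is that reading one minute now touches SHARDS index rows
instead of one, so the shard count is best kept small and fixed.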


On Thu, Feb 28, 2013 at 5:49 PM, Kasun Weranga <ka...@wso2.com> wrote:
> Hi all,
>
> I have a column family with some data + timestamp values and I want to query
> the column family to fetch data within a timestamp range. AFAIK it is not
> a good idea to use a secondary index on a timestamp due to its high
> cardinality.
>
> Is there a way to achieve this functionality?
>
> Thanks,
> Kasun.