
Posted to user@cassandra.apache.org by Jason Tang <ar...@gmail.com> on 2013/01/17 02:56:54 UTC

Cassandra Consistency problem with NTP

Hi

I am using Cassandra in a message bus solution; Cassandra's main
responsibility is recording the incoming requests for later consumption.

One strategy is first in, first out (FIFO), so I need to get the stored
requests in reversed order.

I use NTP to synchronize the system time of the nodes in the cluster (4
nodes).

But the local times of the nodes still differ by around 40 ms.

The consistency level is write ALL and read ONE, and the replication factor
is 3.

But here is the problem:
Request A comes to node One at local time PM 10:00:01.000
Request B comes to node Two at local time PM 10:00:00.980

The correct order is A --> B
But the timestamp order is B --> A
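
To make the scenario concrete, here is a minimal Python sketch of how clock
skew can flip the order. The arrival times and per-node clock errors are
made-up values chosen to reproduce the two local timestamps above (in
milliseconds after 10:00:00); they are assumptions, not measurements from
the cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    name: str
    true_arrival_ms: int  # arrival on an ideal global clock (hypothetical)
    node_offset_ms: int   # error of the receiving node's clock (hypothetical)

    @property
    def local_timestamp_ms(self) -> int:
        # What the receiving node believes the time is.
        return self.true_arrival_ms + self.node_offset_ms


# A really arrives first, but node Two's clock runs 30 ms behind.
requests = [
    Request("A", true_arrival_ms=1000, node_offset_ms=0),    # local 10:00:01.000
    Request("B", true_arrival_ms=1010, node_offset_ms=-30),  # local 10:00:00.980
]

true_order = [r.name for r in sorted(requests, key=lambda r: r.true_arrival_ms)]
timestamp_order = [r.name for r in sorted(requests, key=lambda r: r.local_timestamp_ms)]

print(true_order)       # ['A', 'B']
print(timestamp_order)  # ['B', 'A']
```

Any ordering key derived from two different machines' wall clocks can be
wrong by up to the skew between them.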

So is there any way for Cassandra to keep the correct order for read
operations (e.g. a logical timestamp)?

Or does Cassandra depend strongly on the time synchronization solution?

BRs
//Tang

Re: Cassandra Consistency problem with NTP

Posted by Jason Tang <ar...@gmail.com>.

Delaying the read is acceptable, but the problem is still there:
Request A comes to node One at local time PM 10:00:01.000
Request B comes to node Two at local time PM 10:00:00.980

The correct order is A --> B
I am not sure how node C will handle the data: although A came before B,
B's timestamp is earlier than A's.




Re: Cassandra Consistency problem with NTP

Posted by Russell Haering <ru...@gmail.com>.

One solution is to only read up to (now - 1 second). If this is a public
API where you want to guarantee full consistency (i.e., if you have added a
message to the queue, it will definitely appear to be there) you can
instead delay requests for 1 second before reading up to the moment that
the request was received.

In either of these approaches you can tune the time offset based on how
closely synchronized you believe you can keep your clocks. The tradeoff, of
course, is increased latency.
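
The "only read up to (now - 1 second)" idea could be sketched in Python
roughly as follows. This is an illustration, not code from the thread: the
function name, the `(timestamp_s, payload)` tuple layout and the
`safety_margin_s` parameter are all assumptions.

```python
import time


def readable_messages(messages, safety_margin_s=1.0, now=None):
    """Return the messages that are old enough to be read safely.

    `messages` is an iterable of (timestamp_s, payload) tuples (hypothetical
    layout). Anything newer than `now - safety_margin_s` is held back, so a
    write stamped by a slightly slow clock still has time to show up before
    its time range is consumed.
    """
    if now is None:
        now = time.time()
    cutoff = now - safety_margin_s
    old_enough = (m for m in messages if m[0] <= cutoff)
    return sorted(old_enough, key=lambda m: m[0])


# Example with a fixed "now" so the result is reproducible.
pending = [(99.5, "too recent"), (98.0, "first"), (98.9, "second")]
print(readable_messages(pending, safety_margin_s=1.0, now=100.0))
# [(98.0, 'first'), (98.9, 'second')]
```

The margin only has to exceed the worst clock skew you expect, so with
skew around 40 ms a much smaller value than one second may be enough.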


Re: Cassandra Consistency problem with NTP

Posted by Vijay <vi...@gmail.com>.

Just FYI,
For one of the projects, I got around the NTP drift problem by always
reading more than I need.
For example, if I want to read all the messages before x seconds, I query
Cassandra for (x seconds + 500 ms) and then filter out the duplicates in
the client.

Yes, it uses more network, and yes, the client needs more logic to handle it.
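
One possible reading of this approach, sketched in Python: each poll
re-reads an extra 500 ms that overlaps the previous window, and the client
drops messages it has already seen. The `fetch` callable, the
`(msg_id, timestamp_s, payload)` tuple layout and the class name are
hypothetical; they stand in for whatever query the real client issues.

```python
class OverlappingReader:
    """Polls in overlapping time windows and filters duplicates client side."""

    def __init__(self, fetch, overlap_s=0.5):
        # fetch(start_s, end_s) -> iterable of (msg_id, timestamp_s, payload).
        # Hypothetical stand-in for the real Cassandra range query.
        self._fetch = fetch
        self._overlap_s = overlap_s
        self._seen = set()  # grows without bound here; a real client would prune it
        self._last_end_s = 0.0

    def poll(self, end_s):
        # Reach back past the previous window to catch late, skewed writes.
        start_s = self._last_end_s - self._overlap_s
        fresh = []
        for msg_id, ts, payload in self._fetch(start_s, end_s):
            if msg_id not in self._seen:
                self._seen.add(msg_id)
                fresh.append((msg_id, ts, payload))
        self._last_end_s = end_s
        return fresh
```

A message that lands with a timestamp slightly inside an already-consumed
window is still picked up by the next poll, at the cost of re-reading the
overlap every time.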

Regards,
</VJ>



Re: Cassandra Consistency problem with NTP

Posted by Edward Capriolo <ed...@gmail.com>.

If you have 40ms NTP drift something is VERY VERY wrong. You should have a
local NTP server on the same subnet, do not try to use one on the moon.


Re: Cassandra Consistency problem with NTP

Posted by Sylvain Lebresne <sy...@datastax.com>.

> So what I want is, Cassandra provide some information for client, to
> indicate A is stored before B, e.g. global unique timestamp, or  row order.
>

The row order is determined by 1) the comparator you use for the column
family and 2) the column names you, the client, choose for A and B. So what
are the column names you use for A and B?

Now what you could do is use a TimeUUID comparator for that column family
and use a time UUID for the A and B column names. In that case, provided A
and B are sent from the same client node and B is sent after A on that
client (which you said is the case), any non-buggy time UUID generator will
guarantee that the UUID generated for A is smaller than the one for B, and
thus that in Cassandra, A will be sorted before B.
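
As an illustration of what a "non-buggy" generator has to do, here is a
minimal Python sketch of a per-client version 1 UUID generator that never
goes backwards, even if the local clock does. The class and method names
are hypothetical, and this is not the generator of any particular
Cassandra client library.

```python
import time
import uuid

# 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


class MonotonicTimeUUIDGenerator:
    """Generates version 1 UUIDs whose embedded timestamps strictly increase."""

    def __init__(self, node=None, clock_seq=0):
        self._node = uuid.getnode() if node is None else node
        self._clock_seq = clock_seq & 0x3FFF
        self._last_ts = 0

    def next_uuid(self, now_ns=None):
        if now_ns is None:
            now_ns = time.time_ns()
        ts = now_ns // 100 + UUID_EPOCH_OFFSET
        # If the clock stalled or stepped back, move one tick past the last value.
        if ts <= self._last_ts:
            ts = self._last_ts + 1
        self._last_ts = ts
        return uuid.UUID(fields=(
            ts & 0xFFFFFFFF,                          # time_low
            (ts >> 32) & 0xFFFF,                      # time_mid
            ((ts >> 48) & 0x0FFF) | 0x1000,           # time_hi + version 1
            ((self._clock_seq >> 8) & 0x3F) | 0x80,   # clock_seq_hi + variant
            self._clock_seq & 0xFF,                   # clock_seq_low
            self._node,
        ))


gen = MonotonicTimeUUIDGenerator(node=1)
a = gen.next_uuid(now_ns=1_000_000_000)
b = gen.next_uuid(now_ns=999_000_000)  # clock stepped back by 1 ms
print(a.time < b.time)  # True
```

Note that Python's own `<` on `uuid.UUID` compares the raw 128-bit value,
not the timestamp, so the sketch compares `.time`. The guarantee holds only
within one generator instance, which matches the "same client node"
condition above.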

In any case, the point I want to make is that Cassandra itself cannot do
anything for your problem, because by design the row ordering is something
entirely controlled client side (and just so there is no misunderstanding,
I want to make that point not because I'm trying to suggest you were
wrong asking this mailing list, but because we can't suggest a proper
solution unless we clearly understand what the problem is).

--
Sylvain



Re: Cassandra Consistency problem with NTP

Posted by Jason Tang <ar...@gmail.com>.

Yes, Sylvain, you are correct.
When I say "A comes before B", I mean that the client guarantees the order:
B is sent only after the response to request A has been received.

And yes, A and B do not update the same record, so it is not a typical
Cassandra consistency problem.

And yes, the column name is provided by the client. Right now I use the
local timestamp, and the local times for A and B are not well synchronized,
so I have a problem.

So what I want is for Cassandra to provide some information to the client
that indicates A was stored before B, e.g. a globally unique timestamp or
the row order.





Re: Cassandra Consistency problem with NTP

Posted by Sylvain Lebresne <sy...@datastax.com>.

I'm not sure I fully understand your problem. You seem to be talking about
ordering the requests in the order they are generated. But in that case,
you will rely on the ordering of columns within whatever row you store
requests A and B in, and that order depends on the column names, which in
turn are client-provided and don't depend at all on the time
synchronization of the cluster nodes. And since you are able to say that
request A comes before B, I suppose this means said requests are generated
from the same source. In which case you just need to make sure that the
column names storing each request respect the correct ordering.

The column timestamps Cassandra uses are there to determine which update
*to the same column* is the more recent one. So they only come into play if
your requests A and B update the same column and you're interested in
knowing which of the updates will "win" when you read. But even if that's
your case (which doesn't sound like it at all from your description), the
column timestamp is only generated server side if you use CQL. And even in
that latter case, it's a convenience and you can force a timestamp client
side if you really wish. In other words, Cassandra's dependency on time
synchronization is not a strong one even in that case. But again, that
doesn't seem at all to be the problem you are trying to solve.
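
The role of column timestamps described above can be pictured with a toy
Python model: for a single column, the write carrying the highest timestamp
wins, regardless of the order in which the writes arrive. This is only an
illustration, not Cassandra's implementation; the class name is
hypothetical and ties between equal timestamps are deliberately not
modelled.

```python
class Column:
    """Toy single column where the write with the highest timestamp wins."""

    def __init__(self):
        self.value = None
        self.timestamp_us = None

    def write(self, value, timestamp_us):
        # Keep the incoming write only if it is strictly newer than what is stored.
        if self.timestamp_us is None or timestamp_us > self.timestamp_us:
            self.value = value
            self.timestamp_us = timestamp_us


# Two updates to the SAME column, with client-supplied timestamps.
col = Column()
col.write("from A", timestamp_us=1_000_000)
col.write("from B", timestamp_us=980_000)  # arrives later, but is stamped earlier
print(col.value)  # from A

# Updates to DIFFERENT columns never compete, whatever their timestamps.
col_a, col_b = Column(), Column()
col_a.write("A", timestamp_us=1_000_000)
col_b.write("B", timestamp_us=980_000)
print(col_a.value, col_b.value)  # A B
```

Since requests A and B in this thread go to different columns, both are
kept, and their relative order on read comes from the column names rather
than from these timestamps.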

--
Sylvain


Re: Cassandra Consistency problem with NTP

Posted by Michael Kjellman <mk...@barracuda.com>.

Personally, I would recommend Kafka instead of Cassandra for your particular problem.
