Posted to user@cassandra.apache.org by Jan Algermissen <ja...@nordsc.com> on 2014/05/18 10:30:52 UTC

Ordering of schema updates and data modifications

Hi,

In our project, we apparently have a problem with, or a misunderstanding of, the relationship between schema changes and data updates.

One team runs automated tests during build and deployment that execute data migrations on a development cluster. Those migrations contain schema changes (adding columns) followed by data insertions involving these columns.

It seems that, at unpredictable times, the data update reaches the cluster *before* the schema change, causing the tests to fail.

What can we do to ensure that the schema update has propagated sufficiently before the data modification hits the database?

Alternatively, what do others do to handle schema migrations in continuous delivery processes?

Jan

Re: Ordering of schema updates and data modifications

Posted by Jan Algermissen <ja...@nordsc.com>.
Surprisingly difficult to find, but here is an example of how to use ‘describe cluster’ to check schema propagation:

http://www.datastax.com/support-forums/topic/schema-versions-disagree#post-7055
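For an automated check, the 'describe cluster' output could be parsed in the build script. The following is only a rough sketch: it assumes the output contains a "Schema versions" section with one "<version>: [node, node, ...]" line per schema version (plus an UNREACHABLE line for nodes that did not answer). The function names and the sample text are made up for illustration and are not taken from the linked post.

```python
import re

# One line of the "Schema versions" section: "<uuid or UNREACHABLE>: [ip, ip, ...]"
_VERSION_LINE = re.compile(r"^\s*([0-9a-fA-F-]{36}|UNREACHABLE):\s*\[(.*)\]\s*$")


def schema_versions(describe_output):
    """Map each reported schema version (or UNREACHABLE) to its node addresses."""
    versions = {}
    for line in describe_output.splitlines():
        match = _VERSION_LINE.match(line)
        if match:
            nodes = [n.strip() for n in match.group(2).split(",") if n.strip()]
            versions[match.group(1)] = nodes
    return versions


def schema_in_agreement(describe_output):
    """True only if exactly one version is reported and no node is unreachable."""
    versions = schema_versions(describe_output)
    return len(versions) == 1 and "UNREACHABLE" not in versions


# Hypothetical sample output, typed in by hand for illustration (assumed format).
SAMPLE = """\
Cluster Information:
   Schema versions:
      86afa796-d883-3932-aa73-6b017cef0d19: [10.0.0.1, 10.0.0.2]
      f2a9d1c0-5b7e-3c44-9a10-0d2c6e8b1a77: [10.0.0.3]
"""

# Two different versions are listed, so the nodes disagree.
print(schema_in_agreement(SAMPLE))  # prints False
```

A migration step would capture the real command output and only continue with the inserts once schema_in_agreement returns True.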

Jan


Re: Ordering of schema updates and data modifications

Posted by Jan Algermissen <ja...@nordsc.com>.
Colin,

On 18 May 2014, at 15:29, Colin <co...@clark.ws> wrote:

> Hi Jan,
> 
> Try waiting a period of time, say 60 seconds, after modifying the schema so the changes propagate throughout the cluster.
> 
> Also, you could add a step to your automation where you verify the schema change by attempting to insert/delete from the schema with a higher consistency level to make sure a good number of nodes are in agreement before proceeding.
> 
> Does this make sense?

Thanks, yes it does.

However, I would prefer a direct source of information.

Is it possible to check for complete propagation of the change using nodetool or by querying the schema_cf or migrations_cf tables?

http://wiki.apache.org/cassandra/LiveSchemaUpdates
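To make the question concrete, here is roughly what I have in mind. It is a sketch under assumptions, not a confirmed answer: it assumes a Cassandra version (1.2 or later) in which the system.local and system.peers tables carry a schema_version column, and a session object whose execute() returns rows with attribute access, as the DataStax Python driver does by default. FakeSession and the function names are hypothetical stand-ins so the sketch runs without a cluster. Note that this only shows the view of the one node that answers the query.

```python
from collections import namedtuple


def schema_versions_seen_by_coordinator(session):
    """Collect the distinct schema versions from system.local and system.peers."""
    local = session.execute("SELECT schema_version FROM system.local")
    peers = session.execute("SELECT peer, schema_version FROM system.peers")
    versions = {row.schema_version for row in local}
    versions.update(row.schema_version for row in peers)
    return versions


def coordinator_sees_agreement(session):
    """True if the queried node knows exactly one schema version for all nodes."""
    versions = schema_versions_seen_by_coordinator(session)
    return len(versions) == 1 and None not in versions


# Hypothetical stand-in for a driver session (illustration only).
Row = namedtuple("Row", ["peer", "schema_version"])


class FakeSession:
    def __init__(self, local_version, peer_versions):
        self._local = [Row(None, local_version)]
        self._peers = [
            Row("10.0.0.%d" % i, version)
            for i, version in enumerate(peer_versions, start=2)
        ]

    def execute(self, query):
        # Return peer rows for the system.peers query, the local row otherwise.
        return self._peers if "system.peers" in query else self._local
```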

Jan


Re: Ordering of schema updates and data modifications

Posted by Colin <co...@clark.ws>.
Hi Jan,

Try waiting a period of time, say 60 seconds, after modifying the schema so the changes propagate throughout the cluster.

Also, you could add a step to your automation where you verify the schema change by attempting to insert/delete from the schema with a higher consistency level to make sure a good number of nodes are in agreement before proceeding.

Does this make sense?
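As a variation on the fixed wait, the automation could poll a verification step until it succeeds or a timeout expires. This is only a sketch: wait_until and its parameters are hypothetical names, and check stands for whatever verification you choose (for example a test insert, or a comparison of schema versions).

```python
import time


def wait_until(check, timeout=60.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or timeout seconds have passed.

    Returns True if check() succeeded, False if the timeout was reached.
    sleep and clock can be replaced in tests to avoid real waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A migration script would call wait_until(my_check, timeout=60) after the schema change and fail the build if it returns False, so the 60 seconds become an upper bound rather than a fixed delay.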

--
Colin Clark 
+1-320-221-9531
 
