Posted to user@cassandra.apache.org by Mauri Moreno Cesare <mo...@italtel.com> on 2015/06/30 14:34:54 UTC

Insert (and delete) data loss?

Hi all,
I configured a Cassandra cluster (3 nodes).

Then I created a keyspace:

cql> CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 2 };

and a table:

cql> CREATE TABLE chiamate_stabili (chiave TEXT PRIMARY KEY, valore BLOB);

I synchronously inserted 10,000 rows (with a Java client that connects to one of the 3 nodes during the connection phase).

Each row has a "valore" column that contains an array of 100K bytes.

When the Java client finishes, I wait a few seconds and then run this command in the CQL shell:

cql> SELECT COUNT(*) FROM test.chiamate_stabili LIMIT 10000;

The result is often 10000, but sometimes 6666!

The same 6666 records are found with my Java client.

Here is the Java query code:

for(int i = 1;  i <= 10000; i++) {
  String key = "key-" + i;
  Clause eqClause = QueryBuilder.eq("chiave", key);
  Statement statement = QueryBuilder.select().all().from("test", tableName).where(eqClause);
  session.execute(statement);
}

Deletes show the same behaviour.

When I try to delete all the records, sometimes the table ends up empty,
but sometimes 3333 records are still present.

I think that something is wrong in my code or (more probably) in the cluster configuration,
but I don't know which tuning or configuration options to explore.

Any help is much appreciated.
Many thanks in advance.

Moreno

Internet Email Confidentiality Footer ******************************************************************************************************************************************** La presente comunicazione, con le informazioni in essa contenute e ogni documento o file allegato, e' rivolta unicamente alla/e persona/e cui e' indirizzata ed alle altre da questa autorizzata/e a riceverla. Se non siete i destinatari/autorizzati siete avvisati che qualsiasi azione, copia, comunicazione, divulgazione o simili basate sul contenuto di tali informazioni e' vietata e potrebbe essere contro la legge (art. 616 C.P., D.Lgs n. 196/2003 Codice in materia di protezione dei dati personali). Se avete ricevuto questa comunicazione per errore, vi preghiamo di darne immediata notizia al mittente e di distruggere il messaggio originale e ogni file allegato senza farne copia alcuna o riprodurne in alcun modo il contenuto. ***************** This e-mail and its attachments are intended for the addressee(s) only and are confidential and/or may contain legally privileged information. If you have received this message by mistake or are not one of the addressees above, you may take no action based on it, and you may not copy or show it to anyone; please reply to this e-mail and point out the error which has occurred. ********************************************************************************************************************************************

RE: Insert (and delete) data loss?

Posted by Mauri Moreno Cesare <mo...@italtel.com>.
The problem is still present ☹

First I changed the ConsistencyLevel in the Java client code (from ONE to QUORUM).

Having changed the ConsistencyLevel, I also had to reconfigure the Cassandra cluster accordingly
(following “Cassandra Calculator for Dummies”, I changed the keyspace’s replication factor from 2 to 3:
without this change, the calculator told me that my cluster would not survive the loss of a node).

Inserts were OK
(5 or 6 attempts, all of them with 10,000 records inserted).

Deletes failed
(I tried to delete 10,000 records but, after the delete, SELECT COUNT(*) returned 3334 records: no exception was caught on the client side ☹).

Even after “nodetool repair” (run on all 3 nodes), SELECT COUNT(*) still returned 3334 records.

Is there something else I can change in the cluster configuration (or in the Java client code)?

Thanks again
Moreno
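
[Editor's note] The replication factor change described in this message can be checked with a little arithmetic. The sketch below is only an illustration: the class and method names are hypothetical and it does not talk to Cassandra. It assumes the usual definition of QUORUM as floor(RF / 2) + 1 replicas and shows why QUORUM with a replication factor of 2 leaves no room for a replica being down, while a replication factor of 3 tolerates one.

```java
// Illustrative arithmetic only; names are hypothetical, not part of any driver API.
class QuorumTolerance {

    // Replicas that must acknowledge a QUORUM operation: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Replicas that may be unavailable while QUORUM operations can still succeed.
    static int tolerableReplicaLoss(int replicationFactor) {
        return replicationFactor - quorum(replicationFactor);
    }

    public static void main(String[] args) {
        // Compare the two replication factors discussed in the thread.
        for (int rf = 2; rf <= 3; rf++) {
            System.out.printf("RF=%d quorum=%d tolerable replica loss=%d%n",
                    rf, quorum(rf), tolerableReplicaLoss(rf));
        }
    }
}
```

With a replication factor of 2 the quorum is 2 and no replica may be lost; with 3 the quorum is still 2, so one replica may be down. This says nothing about the remaining 3334 rows after the delete, which stays an open question in the thread.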

From: Jason Kushmaul [mailto:jkushmaul@rocketfuelinc.com]
Sent: Tuesday, 30 June 2015 15:25
To: user@cassandra.apache.org
Subject: Re: Insert (and delete) data loss?

Can you try two more tests:

1) Write the way you are, perform a repair on all nodes, then read the way you are.
<wipe data>
2) Write with CL quorum, read with CL quorum.



RE: Insert (and delete) data loss?

Posted by Mauri Moreno Cesare <mo...@italtel.com>.
Thank you Jason!


OK, I will try with QUORUM and, if the problem persists, I’ll run the “nodetool repair” command.

(But is using “nodetool repair” good practice in a production environment?)

Moreno


Re: Insert (and delete) data loss?

Posted by Jason Kushmaul <jk...@rocketfuelinc.com>.
Can you try two more tests:

1) Write the way you are, perform a repair on all nodes, then read the way
you are.
<wipe data>
2) Write with CL quorum, read with CL quorum.


-- 
Jason Kushmaul | 517.899.7852
Engineering Manager

RE: Insert (and delete) data loss?

Posted by Mauri Moreno Cesare <mo...@italtel.com>.
Hi Carlos,
first I used consistency level ONE, then ALL

(I retried with ALL to confirm that the problem does not disappear).

Thanks
Moreno


From: Carlos Alonso [mailto:info@mrcalonso.com]
Sent: Tuesday, 30 June 2015 15:24
To: user@cassandra.apache.org
Subject: Re: Insert (and delete) data loss?

Hi Moreno,

Which consistency level are you using? If you're using ONE, that may make sense, as, depending on the partitioning and the cluster coordinating the query, different values may be received.

Hope it helps.

Regards

Carlos Alonso | Software Engineer | @calonso<https://twitter.com/calonso>


Re: Insert (and delete) data loss?

Posted by Carlos Alonso <in...@mrcalonso.com>.
Hi Moreno,

Which consistency level are you using? If you're using ONE, that may make
sense, as, depending on the partitioning and the cluster coordinating the
query, different values may be received.

Hope it helps.

Regards

Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>
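
[Editor's note] The point about consistency level ONE can be made concrete with the usual overlap rule: a read is guaranteed to touch at least one replica that acknowledged the latest write only when (replicas written) + (replicas read) > replication factor. The sketch below is a minimal illustration under that assumption. The class, enum and method names are hypothetical and are not the driver's ConsistencyLevel type.

```java
// Illustrative check of the read/write overlap rule; all names are hypothetical.
class ConsistencyOverlap {

    enum Level { ONE, QUORUM, ALL }

    // Number of replicas that must respond for the given level and replication factor.
    static int replicasRequired(Level level, int replicationFactor) {
        switch (level) {
            case ONE:
                return 1;
            case QUORUM:
                return replicationFactor / 2 + 1;
            default: // ALL
                return replicationFactor;
        }
    }

    // True when any set of read replicas must intersect any set of write replicas.
    static boolean readsSeeLatestWrite(Level write, Level read, int replicationFactor) {
        return replicasRequired(write, replicationFactor)
                + replicasRequired(read, replicationFactor) > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 2; // replication factor of the keyspace in the original question
        System.out.println("ONE/ONE overlap: "
                + readsSeeLatestWrite(Level.ONE, Level.ONE, rf));
        System.out.println("QUORUM/QUORUM overlap: "
                + readsSeeLatestWrite(Level.QUORUM, Level.QUORUM, rf));
    }
}
```

With a replication factor of 2, writing and reading at ONE gives 1 + 1 = 2, which is not greater than 2, so a read may land on a replica that has not yet received the write. QUORUM on both sides gives 2 + 2 = 4 and the overlap holds. This describes the general guarantee only; it does not by itself account for the counts reported in this thread.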
