Posted to user@cassandra.apache.org by Robert Wille <rw...@fold3.com> on 2015/08/21 00:25:06 UTC

Written data is lost and no exception thrown back to the client

I wrote a data migration application and, while testing it, pushed the cluster too hard: the FlushWriter thread pool blocked, and I ended up with dropped mutation messages. I compared the source data against what is in my cluster and, as expected, found missing records. The strange thing is that my application didn’t error out. I’ve been doing some forensics, and a lot about this makes no sense and makes me very uneasy.

I use a lot of asynchronous queries, and I thought it was possible that I had bad error handling, so I checked for errors in other, independent ways.

I have a retry policy that on the first failure logs the error and then requests a retry. On the second failure it logs the error and then rethrows. A few retryable errors appeared in my logs, but no fatal errors. In theory, I should have a fatal error in my logs for any error that gets reported back to the client.
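
For illustration only, the retry behavior described above (log and retry on the first failure, log and rethrow on the second) could be sketched roughly as follows. This is not the poster's code and it does not use the DataStax driver: `RetryOnceSketch`, `Decision`, and `onFailure` are hypothetical stand-ins, and the `nbRetry` argument is assumed to count the retries already performed for the query.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a "retry once, then rethrow" policy; not the driver API. */
final class RetryOnceSketch {

    /** Stand-in for the decision a retry policy hands back to the driver. */
    enum Decision { RETRY, RETHROW }

    // In-memory log so the behavior can be inspected; a real policy would use a logger.
    private final List<String> log = new ArrayList<>();

    /**
     * Called once per failed attempt.
     * nbRetry is assumed to be the number of retries already performed.
     */
    Decision onFailure(String error, int nbRetry) {
        if (nbRetry == 0) {
            log.add("retryable: " + error);   // first failure: log and ask for a retry
            return Decision.RETRY;
        }
        log.add("fatal: " + error);           // second failure: log and give up
        return Decision.RETHROW;
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        RetryOnceSketch policy = new RetryOnceSketch();
        System.out.println(policy.onFailure("write timeout", 0));
        System.out.println(policy.onFailure("write timeout", 1));
        System.out.println(policy.log());
    }
}
```

With a policy shaped like this, every error that reaches the client on the second attempt should leave a "fatal" entry behind, which is the property the logs were being checked for.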

I wrap my Session object, and all queries go through this wrapper. This wrapper logs all query errors. Synchronous queries are wrapped in a try/catch which logs and rethrows. Asynchronous queries use a FutureCallback to log any onFailure invocations.
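
A rough sketch of the wrapper idea described above, under stated assumptions: it uses only the JDK, with `CompletableFuture` standing in for the driver's future type and a completion hook standing in for the `FutureCallback`. `LoggingSessionSketch` and its methods are hypothetical names, not the poster's wrapper or the driver API.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;

/** Hypothetical session wrapper that records every query error it sees. */
final class LoggingSessionSketch {

    // Thread-safe list, since asynchronous completions may arrive on other threads.
    private final List<String> errors = new CopyOnWriteArrayList<>();

    /** Synchronous path: log the error, then rethrow so the caller still sees it. */
    <T> T execute(String query, Supplier<T> call) {
        try {
            return call.get();
        } catch (RuntimeException e) {
            errors.add(query + ": " + e.getMessage());
            throw e;
        }
    }

    /** Asynchronous path: attach a completion hook that logs failures. */
    <T> CompletableFuture<T> executeAsync(String query, CompletableFuture<T> future) {
        return future.whenComplete((result, error) -> {
            if (error != null) {
                errors.add(query + ": " + error.getMessage());
            }
        });
    }

    List<String> errors() {
        return errors;
    }

    public static void main(String[] args) {
        LoggingSessionSketch session = new LoggingSessionSketch();
        CompletableFuture<String> pending = new CompletableFuture<>();
        session.executeAsync("INSERT (async)", pending);
        pending.completeExceptionally(new RuntimeException("write timeout"));
        System.out.println(session.errors());
    }
}
```

The point of routing both paths through one wrapper is that a failure reported to the client cannot bypass the log, regardless of how the calling code handles the returned future.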

My logs indicate that no errors whatsoever were reported back to me. I do not understand how I can get dropped mutation messages and not know about it. I am running Cassandra 2.0.16 with the DataStax Java driver 2.0.8 on a three-node cluster with RF=1. If someone could help me understand how this can occur, I would greatly appreciate it. A database that errors out is one thing. A database that errors out and makes you think everything was fine is quite another.

Thanks

Robert

Re: Written data is lost and no exception thrown back to the client

Posted by Sebastian Estevez <se...@datastax.com>.

Please let us know the Jira number.

All the best,

Sebastián Estévez


Re: Written data is lost and no exception thrown back to the client

Posted by Jean Tremblay <je...@zen-innovations.com>.

I have the same problem.

When I bulk load my data, I have a problem with the DataStax Java driver for Cassandra.

<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>2.1.4</version> <!-- Drivers 2.1.6 and 2.1.7.1 give problems: some data is lost. -->
</dependency>

With versions 2.1.6 and 2.1.7.1 I have lost records with no error message whatsoever.
With version 2.1.4 I have no missing records.

I use CL.ONE to write my records, with RF 3.


Re: Written data is lost and no exception thrown back to the client

Posted by Robert Wille <rw...@fold3.com>.

But it shouldn’t matter. I have missing data, and no errors, which shouldn’t be possible except with CL=ANY.

FWIW, I’m working on some sample code so I can post a Jira.

Robert


Re: Written data is lost and no exception thrown back to the client

Posted by Robert Wille <rw...@fold3.com>.

RF=1 with QUORUM consistency. I know QUORUM is weird with RF=1, but it should be the same as ONE. It’s QUORUM instead of ONE because production has RF=3, and I was running this against my test cluster with RF=1.
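
To make the arithmetic behind that claim concrete: a quorum is a majority of the replicas, that is, floor(RF / 2) + 1. The small sketch below only evaluates that formula; `QuorumSketch` and `quorum` are illustrative names, not part of Cassandra or the driver.

```java
/** Illustrative helper: how many replicas must acknowledge at QUORUM. */
final class QuorumSketch {

    /** Majority of the replicas: floor(RF / 2) + 1 (integer division floors for positive RF). */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println("RF=1 -> " + quorum(1)); // prints "RF=1 -> 1", same as ONE
        System.out.println("RF=3 -> " + quorum(3)); // prints "RF=3 -> 2"
    }
}
```

So on the RF=1 test cluster QUORUM requires a single acknowledgement, just as ONE does, while on the RF=3 production cluster it requires two.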


RE: Written data is lost and no exception thrown back to the client

Posted by Jason <jk...@rocketfuelinc.com>.

What consistency level were the writes?
