Posted to user@cassandra.apache.org by Michał Łowicki <ml...@gmail.com> on 2015/01/15 18:09:10 UTC

Inconsistencies between two tables if BATCH used

Hi,

We have two tables:
* The first one, *entity*, has a log-like structure: whenever an entity is
modified we create a new version of it and put it into the table with a new
mtime, which is part of the compound key. The old one is removed.
* The second one, *entity_by_id*, is a manually managed index for *entity*.
Given only an id, you can get the basic entity attributes from *entity_by_id*.

When adding an entity we do two inserts, to *entity* and then to
*entity_by_id* (in this order).
When deleting we use the same order, so we first remove the record from the
*entity* table.

It turned out that these two tables were inconsistent. We had ~260 records
in *entity_by_id* for which there is no corresponding record in *entity*.
For the *entity* table it's much worse: ~7000 of its records have no
corresponding record in *entity_by_id*, and that number was growing much
faster.
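
A minimal sketch of how such counts can be obtained, assuming the keys of
both tables have already been read into Python and that rows are matched on
the tuple (user_id, data_type_id, version, id). The function name
find_orphans and the sample keys are hypothetical and only illustrate the
comparison; how the keys are fetched from Cassandra is not shown.

```python
from typing import Hashable, Iterable, Set, Tuple


def find_orphans(
    entity_keys: Iterable[Hashable],
    index_keys: Iterable[Hashable],
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Return (keys present only in entity, keys present only in entity_by_id)."""
    entity_set = set(entity_keys)
    index_set = set(index_keys)
    # Set difference in each direction gives the rows with no counterpart.
    return entity_set - index_set, index_set - entity_set


# Hypothetical sample keys: (user_id, data_type_id, version, id)
entity_keys = [(1, 10, 3, "a"), (1, 10, 5, "b"), (2, 10, 1, "c")]
index_keys = [(1, 10, 3, "a"), (2, 10, 1, "c"), (2, 10, 2, "d")]

missing_in_index, missing_in_entity = find_orphans(entity_keys, index_keys)
print(len(missing_in_index), len(missing_in_entity))
```

Running the comparison periodically and tracking both counts shows which
direction of the inconsistency is growing.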

We were using LOCAL_QUORUM on C* 2.1.2 with two datacenters. We didn't get
any exceptions during inserts or deletes. BatchQuery from cqlengine (0.20.0)
was used.

If BatchQuery is not used:

-    with BatchQuery() as b:
-        entity.batch(b).save()
-        entity_by_id = EntityById.copy_fields_from(entity)
-        entity_by_id.batch(b).save()
+    entity.save()
+    entity_by_id = EntityById.copy_fields_from(entity)
+    entity_by_id.save()


Everything is fine. We don't have any more inconsistencies. I've checked what
cqlengine generates, and it seems to work as expected:

('BEGIN  BATCH\n  UPDATE sync.entity SET "name" = %(4)s WHERE "user_id" =
%(0)s AND "data_type_id" = %(1)s AND "version" = %(2)s AND "id" = %(3)s\n
INSERT INTO sync.entity_by_id ("user_id", "id", "parent_id", "deleted",
"folder", "data_type_id", "version") VALUES (%(5)s, %(6)s, %(7)s, %(8)s,
%(9)s, %(10)s, %(11)s)\nAPPLY BATCH;',)

We suspect that it's a problem in C* itself. Any ideas how to debug what is
going on, as BATCH is needed in this case?

-- 
BR,
Michał Łowicki

Re: Inconsistencies between two tables if BATCH used

Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Jan 16, 2015 at 12:58 AM, Michał Łowicki <ml...@gmail.com> wrote:

> Done. https://issues.apache.org/jira/browse/CASSANDRA-8636
>

Thank you for closing the loop and including the link in your reply. Future
searchers will appreciate it.

=Rob

Re: Inconsistencies between two tables if BATCH used

Posted by Michał Łowicki <ml...@gmail.com>.
Done. https://issues.apache.org/jira/browse/CASSANDRA-8636

On Thu, Jan 15, 2015 at 7:46 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Jan 15, 2015 at 9:09 AM, Michał Łowicki <ml...@gmail.com>
> wrote:
>
>> We were using LOCAL_QUORUM. C* 2.1.2. Two datacenters. We didn't get any
>> exceptions while inserts or deletes. BatchQuery from cqlengine (0.20.0) has
>> been used.
>>
>> If BatchQuery is not used:
>>
>
>
>> Everything is fine. We don't have more inconsistencies. I've checked what
>> cqlengine generates and it seems to work as expected:
>>
>
> I would file a JIRA including your STR (steps to reproduce). Sounds
> serious enough to investigate.
>
> =Rob
>



-- 
BR,
Michał Łowicki

Re: Inconsistencies between two tables if BATCH used

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Jan 15, 2015 at 9:09 AM, Michał Łowicki <ml...@gmail.com> wrote:

> We were using LOCAL_QUORUM. C* 2.1.2. Two datacenters. We didn't get any
> exceptions while inserts or deletes. BatchQuery from cqlengine (0.20.0) has
> been used.
>
> If BatchQuery is not used:
>


> Everything is fine. We don't have more inconsistencies. I've checked what
> cqlengine generates and it seems to work as expected:
>

I would file a JIRA including your STR (steps to reproduce). Sounds serious
enough to investigate.

=Rob