Posted to commits@cassandra.apache.org by "Sandeep Tata (JIRA)" <ji...@apache.org> on 2009/03/16 20:45:50 UTC

[jira] Updated: (CASSANDRA-7) Cassandra silently loses data when a single row gets large

     [ https://issues.apache.org/jira/browse/CASSANDRA-7?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandeep Tata updated CASSANDRA-7:
---------------------------------

    Attachment: BigReadWriteTest.java

This program writes a bunch of data first and then tries to read it all back.
If the write phase spans multiple SSTables, you will notice that the read phase fails with missing values.
The --numColumns argument needs to be large enough -- try something like 6000. The test might take a couple of minutes to run.





> Cassandra silently loses data when a single row gets large
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-7
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: code in trunk, Red Hat 4.1.2-33,  Linux version 2.6.23.1-42.fc8, java version "1.7.0-nio2"
>            Reporter: Sandeep Tata
>            Priority: Critical
>         Attachments: BigReadWriteTest.java
>
>
> When you insert a large number of columns in a single row, Cassandra silently loses some of these inserts.
> This does not happen until the cumulative size of the columns in a single row exceeds several megabytes.
> Say each value is 1MB in size:
> insert("row", "col0", value, timestamp)
> insert("row", "col1", value, timestamp)
> insert("row", "col2", value, timestamp)
> ...
> ...
> insert("row", "col100", value, timestamp)
> Running: 
> get_column("row", "col0")
> get_column("row", "col1")
> ...
> ..
> get_column("row", "col100")
> The sequence of get_column calls will fail at some point before col100. This was also a problem with the old code on code.google.
> I will attach a small program that will help you reproduce this. 
> 1. This only happens when the cumulative size of the row exceeds several megabytes. 
> 2. In fact, the single row must be large enough to trigger an SSTable flush for this error to occur.
> 3. No OutOfMemory errors are thrown, and there is nothing relevant in the logs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.