Posted to commits@cassandra.apache.org by "Neophytos Demetriou (JIRA)" <ji...@apache.org> on 2009/03/22 12:35:50 UTC

[jira] Created: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Cassandra silently loses data when a single row gets large (under "heavy load")
-------------------------------------------------------------------------------

                 Key: CASSANDRA-9
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9
             Project: Cassandra
          Issue Type: Bug
         Environment: code in trunk, linux-2.6.27-gentoo-r1/, java version "1.7.0-nio2", 4GB, Intel Core 2 Duo
            Reporter: Neophytos Demetriou


When you insert a large number of columns into a single row, Cassandra silently loses some or all of these inserts while flushing the memtable to disk (potentially leaving you with zero-sized data files). This happens when the memtable threshold is violated, i.e. when currentSize_ >= threshold_ (MemtableSizeInMB) OR currentObjectCount_ >= thresholdCount_ (MemtableObjectCountInMillions). This was a problem with the old code on code.google and also with the code that has the jdk7 dependencies. No OutOfMemory errors are thrown, and there is nothing relevant in the logs. It is not clear why this happens under heavy load (when no throttle is used), as it works fine when you pace requests. I have confirmed this with another member of the community.
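
To make the trigger condition concrete, here is a minimal, self-contained sketch of the threshold check described above. It is illustrative only: the field names mirror this report, not the actual db/Memtable.java code, and switchAndFlush() is a stand-in for the real flush path.

    // Illustrative sketch of the memtable threshold check described above
    // (not the actual Cassandra code; names mirror the fields in this report).
    class MemtableSketch
    {
        private int currentSize_ = 0;                     // bytes resident in this memtable
        private int currentObjectCount_ = 0;              // columns resident in this memtable
        private final int threshold_ = 32 * 1024 * 1024;  // MemtableSizeInMB = 32
        private final int thresholdCount_ = 512 * 1024;   // see the thresholdCount_ note below

        boolean isThresholdViolated()
        {
            return currentSize_ >= threshold_ || currentObjectCount_ >= thresholdCount_;
        }

        void put(String key, byte[] value)
        {
            if (isThresholdViolated())
            {
                // hand this memtable off to be flushed and start a fresh one;
                // the report is that writes racing with this hand-off are
                // dropped silently, leaving zero-sized data files
                switchAndFlush();
            }
            currentSize_ += key.length() + value.length;
            currentObjectCount_++;
        }

        void switchAndFlush() { /* flush to an SSTable on disk (elided) */ }
    }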


In storage-conf.xml:

   <HashingStrategy>RANDOM</HashingStrategy>
   <MemtableSizeInMB>32</MemtableSizeInMB>
   <MemtableObjectCountInMillions>1</MemtableObjectCountInMillions>
   <Tables>
      <Table Name="MyTable">
         <ColumnFamily ColumnType="Super" ColumnSort="Name" Name="MySuper"></ColumnFamily>
      </Table>
   </Tables>

You can also test it with different values for thresholdCount_ in db/Memtable.java, say:
    private int thresholdCount_ = 512*1024;


Here is a small program that will help you reproduce this (hopefully):

    private static void doWrite() throws Throwable
    {
        int numRequests = 0;
        int numRequestsPerSecond = 3;         // only used by the commented-out throttle below
        Table table = Table.open("MyTable");
        Random random = new Random();
        byte[] bytes = new byte[8];
        String key = "MyKey";                 // every insert goes into this single row
        int totalUsed = 0;
        int total = 0;
        for (int i = 0; i < 1500; ++i) {
            RowMutation rm = new RowMutation("MyTable", key);
            random.nextBytes(bytes);
            int[] used = new int[500*1024];   // marks which supercolumn indexes were already picked this round
            for (int z = 0; z < 500*1024; z++) {
                used[z] = 0;
            }
            int n = random.nextInt(16*1024);  // up to ~16K column inserts per mutation
            for (int k = 0; k < n; ++k) {
                int j = random.nextInt(500*1024);
                if (used[j] == 0) {
                    used[j] = 1;
                    ++totalUsed;
                    //int w = random.nextInt(4);
                    int w = 0;
                    rm.add("MySuper:SuperColumn-" + j + ":Column-" + i, bytes, w);
                }
            }
            rm.apply();
            total += n;
            System.out.println("n=" + n + " total=" + total + " totalUsed=" + totalUsed);
            //Thread.sleep(1000*numRequests/numRequestsPerSecond);   // uncomment to pace requests
            numRequests++;
        }
        System.out.println("Write done");
    }


PS. Please note that (a) I'm no Java guru and (b) I initially tried this with a C++ Thrift client. The outcome is always the same: zero-sized data files under heavy load; it works fine when you pace requests.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697846#action_12697846 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

Forgot to mention one more benefit of executor-per-memtable: it lets us easily call forceFlush in tests and then wait for the flush to finish before running tests against the flushed SSTable.  (That is why #59 blocks on this.)
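
To make the test-side benefit concrete, here is a rough sketch of the shape this takes (hypothetical names, not the actual patch API): because the memtable owns its executor, "the flush task finished" is the same thing as "the flush finished", so a test can simply block on the returned Future.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Hypothetical illustration of the executor-per-memtable idea above;
    // forceFlush()/flushToSSTable() are stand-ins, not the patch's API.
    class FlushableMemtableSketch
    {
        // each memtable owns a single-threaded executor
        private final ExecutorService flusher_ = Executors.newSingleThreadExecutor();

        Future<?> forceFlush()
        {
            return flusher_.submit(new Runnable()
            {
                public void run() { flushToSSTable(); }
            });
        }

        void flushToSSTable() { /* write the memtable contents to disk (elided) */ }
    }

    // in a test: memtable.forceFlush().get();  // wait for the flush, then inspect the SSTable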


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697528#action_12697528 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

(Todd pointed out that a per-Memtable executor is also more efficient, since it avoids hashing CF names to look up the executor.)
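
A hypothetical before/after sketch of that lookup difference (illustrative names, not taken from the patch):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Hypothetical contrast of the two lookup styles (names are illustrative).
    class ApplyPathSketch
    {
        // shared style: resolve the executor through a map keyed by CF name on every write
        static final Map<String, ExecutorService> flushersByCf =
                new ConcurrentHashMap<String, ExecutorService>();

        // per-memtable style: the memtable simply owns its executor
        final ExecutorService myExecutor = Executors.newSingleThreadExecutor();

        void applyShared(String cfName, Runnable write)
        {
            flushersByCf.get(cfName).execute(write);   // hash + map lookup per call
        }

        void applyOwned(Runnable write)
        {
            myExecutor.execute(write);                 // plain field access
        }
    }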


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Neophytos Demetriou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688143#action_12688143 ] 

Neophytos Demetriou commented on CASSANDRA-9:
---------------------------------------------

Last line should read: I've not tried it by forcing flushes (FlushKey), no. 


[jira] Updated: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-9:
-----------------------------------

    Attachment: shutdown-before-flush-v2.patch

Here is a cleaner solution that moves the flush into the ExecutorService terminated() method, rather than having the flush itself (running in the Manager service) reach back to the Memtable service and block while waiting for shutdown.  (In a busy system we don't want to block the Manager service.)

Note that this also handles waiting for gets() to finish before flushing -- any logic purely in put() will not be able to do that, because get never checks threshold or acquires a lock.
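
For context, java.util.concurrent.ThreadPoolExecutor exposes a protected terminated() hook that runs once the executor has been shut down and every queued task has completed. The sketch below shows the shape of the idea only (it is not the patch itself): shut the memtable's executor down, and the flush runs from the hook after all pending tasks have drained.

    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    // Illustrative sketch of "flush from terminated()" (not the actual patch):
    // after shutdown() is called and the queue drains, ThreadPoolExecutor
    // invokes terminated(), which is where the flush happens.
    class MemtableExecutorSketch extends ThreadPoolExecutor
    {
        private final Runnable flush_;

        MemtableExecutorSketch(Runnable flush)
        {
            super(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
            flush_ = flush;
        }

        @Override
        protected void terminated()
        {
            super.terminated();
            flush_.run();   // runs only after every pending task has completed
        }
    }

    // usage sketch: memtableExecutor.shutdown();  // freeze the memtable; the flush follows automatically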


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688932#action_12688932 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

Gets won't cause the CME; flush will. :)  Calling columnFamily.clear() as the flush proceeds can cause a CME while a get is looking through the column family.  Of course, even if it did not, you would get back invalid results from operating on a half-cleared-out Memtable.

In general it is just difficult to reason about concurrency when an object is being accessed from multiple threads at once; even if it were okay to "cheat" a bit today, it will probably bite us down the road.
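
For anyone who has not hit this failure mode before: the ConcurrentModificationException comes from one thread iterating a plain collection while another mutates it, which is exactly the flush-clear vs. get race described above. A stripped-down, self-contained illustration (plain HashMap, not Cassandra code):

    import java.util.HashMap;
    import java.util.Map;

    // One thread ("flusher") clears the map while another ("getter") iterates it.
    // HashMap's iterator is fail-fast, so this typically dies with a
    // ConcurrentModificationException; even when it does not, the getter is
    // reading a half-cleared map, i.e. invalid results.
    public class CmeDemo
    {
        public static void main(String[] args) throws InterruptedException
        {
            final Map<String, byte[]> columns = new HashMap<String, byte[]>();
            for (int i = 0; i < 100000; i++)
                columns.put("Column-" + i, new byte[8]);

            Thread flusher = new Thread(new Runnable()
            {
                public void run() { columns.clear(); }   // what flush does to the memtable
            });
            flusher.start();

            long total = 0;
            for (String name : columns.keySet())         // the "get" path iterating columns
                total += name.length();

            flusher.join();
            System.out.println("survived; saw " + total + " bytes of column names");
        }
    }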


[jira] Updated: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-9:
-----------------------------------

    Attachment: 0001-better-fix-for-9.patch

Fixes potential CME with GETs.


[jira] Reopened: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reopened CASSANDRA-9:
------------------------------------

      Assignee: Jonathan Ellis


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688983#action_12688983 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

Worst case, the flush clear() happens while the getter is iterating columns, and you get CME.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Avinash Lakshman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689652#action_12689652 ] 

Avinash Lakshman commented on CASSANDRA-9:
------------------------------------------

The put thread does not run the flush: you submit to the put thread, and it submits the flush to the manager service. Maybe I am missing something here?


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Sandeep Tata (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688960#action_12688960 ] 

Sandeep Tata commented on CASSANDRA-9:
--------------------------------------

Ah, there's a whole bunch of worker threads talking to CFStore (and therefore the memtable) -- I see how we can end up with Getters after adding a Flusher.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Avinash Lakshman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697850#action_12697850 ] 

Avinash Lakshman commented on CASSANDRA-9:
------------------------------------------

Why would you want to wait to do tests? In the real world that is not what happens. You should be able to do reads even before the flush is complete; it should be seamless. Even while a new memtable is being served out, the old one is maintained until the flush is complete, so this should not really matter. If you just want to test the writes into the SSTable, then write into it and then test. I think this should not be a reason for the proposed change. Maybe I am missing something here.
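
For what it's worth, the behavior being described is roughly the following (a hypothetical sketch with made-up names, not Cassandra's actual classes): when the memtable is switched, the full one is parked as "flushing" and stays readable until the flush completes, so reads never have to wait.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical sketch of "the old memtable is maintained till the flush
    // is complete" (made-up names, not Cassandra's actual classes).
    class ColumnFamilyStoreSketch
    {
        private volatile Map<String, byte[]> current_ = new ConcurrentHashMap<String, byte[]>();
        private volatile Map<String, byte[]> flushing_ = null;   // frozen memtable being written out

        void switchMemtable()
        {
            flushing_ = current_;                                // freeze; keep serving reads from it
            current_ = new ConcurrentHashMap<String, byte[]>();  // new writes land here
            // ... write flushing_ to an SSTable asynchronously, then set flushing_ = null
        }

        byte[] get(String column)
        {
            byte[] v = current_.get(column);
            if (v == null && flushing_ != null)
                v = flushing_.get(column);   // reads stay seamless during the flush
            return v;
        }
    }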


[jira] Updated: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-9:
-----------------------------------

    Attachment: executor.patch


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697582#action_12697582 ] 

Johan Oskarsson commented on CASSANDRA-9:
-----------------------------------------

From my limited understanding of that code the latest patch gets a +1; looks clean. But I'd recommend that someone with a bit more experience look at it.


[jira] Updated: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-9:
-----------------------------------

    Attachment: shutdown-before-flush-v3-trunk.patch

v3 applies cleanly against trunk.


[jira] Resolved: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Avinash Lakshman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avinash Lakshman resolved CASSANDRA-9.
--------------------------------------

    Resolution: Fixed


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688903#action_12688903 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

The problem with r758044 is that it does not address GETs -- you can still have Getter ops added to the service after the Flusher, so they will execute during or after the flush.  That is why I split the apartments_ into a per-memtable instance variable and run the flush on terminate.  It's the cleanest way to be correct with gets without introducing explicit locks.
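
To make that concrete, here is a minimal sketch of the idea, not the actual patch: each memtable gets its own single-threaded executor, and the flush runs from the executor's terminated() hook, so every get/put submitted before shutdown() drains first and nothing can touch the memtable afterwards. The Memtable interface and its flushToDisk() method below are hypothetical stand-ins.

{code}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

interface Memtable {
    void flushToDisk();   // hypothetical stand-in for the real flush
}

class MemtableFlushingExecutor extends ThreadPoolExecutor {
    private final Memtable memtable;

    MemtableFlushingExecutor(Memtable memtable) {
        // a single worker thread: at most one task touches the memtable at a time
        super(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        this.memtable = memtable;
    }

    @Override
    protected void terminated() {
        super.terminated();
        // called exactly once, after shutdown() and after all queued tasks have run
        memtable.flushToDisk();
    }
}
{code}

Switching memtables then amounts to creating the replacement (with its own executor) and calling shutdown() on the old one; pending reads and writes finish first, and the flush runs last.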


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698600#action_12698600 ] 

Todd Lipcon commented on CASSANDRA-9:
-------------------------------------

Here's a review against the newest patch:

First, some style nits in Memtable.java:
  - runningExecutorServices member variable should have a trailing _ for style consistency
  - same with executor_

Regarding the actual contents of the patch, I sort of dislike subclassing the executor to do work on terminate, but it's the cleanest solution I can think of, so +1


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Avinash Lakshman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688953#action_12688953 ] 

Avinash Lakshman commented on CASSANDRA-9:
------------------------------------------

Ahh, I see. For some reason I was seeing it from inside the apartment. This is no good. I will fix it tonight.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697835#action_12697835 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

Right.  Get iterates over the columns, depending on the filter used.  Flush still clears out each CF as it is flushed:

                ssTable.append(partitioner.decorateKey(key), buffer);
                bf.add(key);
                columnFamily.clear();

This is behavior I want to keep, since the memory overhead can be relatively high when column values are small.  And the new code is simpler to reason about, since you only ever have one thread accessing things rather than executing gets during the flush.  (If we took the clear() out, we would be OK for now, but what if someone changes flush in six months?  One thread at a time is safer, especially compared to _almost_ always one thread at a time, where it becomes easy to forget the exceptions.)
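
As a tiny standalone illustration of the hazard (plain JDK code, not Cassandra's): clearing a map while an iteration over it is in progress fails fast with a ConcurrentModificationException -- which is what a GET walking a column family's columns would hit if flush called clear() underneath it.

{code}
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class ClearDuringIteration {
    public static void main(String[] args) {
        Map<String, byte[]> columns = new HashMap<String, byte[]>();
        columns.put("Column-1", new byte[8]);
        columns.put("Column-2", new byte[8]);
        columns.put("Column-3", new byte[8]);

        try {
            for (Map.Entry<String, byte[]> entry : columns.entrySet()) {
                System.out.println("get sees " + entry.getKey());
                // stands in for flush's columnFamily.clear() running underneath the reader
                columns.clear();
            }
        } catch (ConcurrentModificationException cme) {
            System.out.println("reader failed mid-iteration: " + cme);
        }
    }
}
{code}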


[jira] Updated: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-9:
-----------------------------------

    Attachment: 0001-better-fix-for-9-v2.patch

rebased to HEAD


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Avinash Lakshman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688980#action_12688980 ] 

Avinash Lakshman commented on CASSANDRA-9:
------------------------------------------

It is an issue that is actually a non-issue. In the worst case the Getter will return NULL because it read an empty memtable (maybe the memtable got cleared). But that is fine, because the disk read will then be served from the buffer cache. It is not incorrect. No harm will be done.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Avinash Lakshman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688986#action_12688986 ] 

Avinash Lakshman commented on CASSANDRA-9:
------------------------------------------

Get rid of clear() :). It is a useless call anyway.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689079#action_12689079 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

Ah, I see, we were talking about different clear() calls. Yes, the one at the end that you removed is always irrelevant (and not going to cause a CME).

It is the columnFamily.clear() in the middle, which is still there, that both frees up memory (yes, of course by "free up memory" I mean "make it available to be GC'd") and can cause a CME during the iterations that GET does.


[jira] Updated: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-9:
-----------------------------------

    Attachment: shutdown-before-flush.patch

Here is the patch, following my proposed fix above.  Works like a champ.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698653#action_12698653 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

I will make the style changes; thanks for the review, Todd.

Any further discussion needed before commit?


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688991#action_12688991 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

I thought the point of clear was to free up memory as the flush progresses.  Isn't that worth a dozen more lines of code?


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688955#action_12688955 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

Sorry, that's the right result but the wrong explanation.  (Son was howling at me -- very distracting.)

It is the Getter creation / submit that is problematic, not the apartment get/submit.

That is:

{code}
thread A                                                              thread B
new Getter(key, cfName, filter);
                                                                             new Flusher(cLogCtx);
                                                                             submit(flusher);
submit(getter);
{code}
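Here is the same interleaving as runnable toy code (not Cassandra's classes): thread A builds its getter first, but thread B can still get its flusher onto the shared queue ahead of it.

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class SubmitOrderingSketch
{
    public static void main(String[] args) throws InterruptedException
    {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();

        Thread a = new Thread() {
            public void run() {
                String getter = "Getter(key, cfName, filter)";   // created first ...
                Thread.yield();                                   // give thread B a chance to run
                queue.add(getter);                                // submit(getter)
            }
        };
        Thread b = new Thread() {
            public void run() {
                queue.add("Flusher(cLogCtx)");                    // submit(flusher)
            }
        };

        a.start(); b.start();
        a.join(); b.join();

        // Depending on scheduling, the flusher can end up queued ahead of a
        // getter that was constructed before it -- the interleaving shown above.
        System.out.println(queue);
    }
}
{code}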


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688436#action_12688436 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

Okay, I see the problem.

Doing an insert (-> CFS.apply -> MT.put) checks for threshold violation; if it's okay, it schedules a Putter with the mutation on the Memtable.apartments thread pool.

If it is NOT okay, it schedules a flush -- _on another thread pool_ (MemtableManager's flusher).

So what happens is, you have a bunch of Putter objects, each with a reference to the old Memtable, in the first thread pool when the flush starts in the second. These Putters cause the CME when they get to resolve() while the flush is computing the column index. (This is why it is easier to make this happen on large rows: index computation takes longer.)

I think the easiest fix will be to make the apartments thread pool (ExecutorService) non-static and just have one per memtable; then flush could wait for the service to finish before doing its thing. Memory and thread churn will be nominal since memtable flushes are, relatively speaking, rare; creating a new thread after a flush is no big deal.

I'll get on the patch, just wanted to post an update in the meantime so nobody else needs to bang his head on this.
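A minimal sketch of that approach, assuming one single-threaded executor per memtable that is drained before the flush proper (class and method names here are hypothetical, not the actual patch):

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PerMemtableExecutorSketch
{
    // One single-threaded "apartment" per memtable instance (non-static).
    private final ExecutorService apartment_ = Executors.newSingleThreadExecutor();

    public void put(Runnable putter)
    {
        apartment_.submit(putter);
    }

    public void flush() throws InterruptedException
    {
        // Stop accepting new work and wait for queued Putters to drain,
        // so the flush never races with writes to this memtable.
        apartment_.shutdown();
        apartment_.awaitTermination(1, TimeUnit.MINUTES);

        writeSortedDataFile();   // hypothetical: the actual flush-to-disk step
    }

    private void writeSortedDataFile()
    {
        // Omitted: serialize columns, compute the index, write the data file.
    }
}
{code}

Since flushes are comparatively rare, paying for a fresh executor (and thread) per memtable generation should be negligible, which is the trade-off described above.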


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689648#action_12689648 ] 

Jun Rao commented on CASSANDRA-9:
---------------------------------

Looking at the latest code: both flush and put on a CF are submitted to the same ExecutorPool for that CF. Since that ExecutorPool has one thread, the flushing of an old memtable will not run concurrently with updates on a new memtable in the same CF. This limits concurrency.
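A standalone toy showing that serialization, assuming the one-thread-per-CF executor described above (nothing here is Cassandra code):

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SingleThreadSerializationSketch
{
    public static void main(String[] args) throws Exception
    {
        // One thread per CF, as described above: flush and puts share it.
        ExecutorService cfExecutor = Executors.newSingleThreadExecutor();

        cfExecutor.submit(new Runnable() {
            public void run() {
                System.out.println("flushing old memtable (slow)...");
                try { Thread.sleep(2000); } catch (InterruptedException e) { }
            }
        });

        // These writes target the *new* memtable but still wait behind the flush.
        for (int i = 0; i < 3; i++) {
            final int n = i;
            cfExecutor.submit(new Runnable() {
                public void run() { System.out.println("put " + n); }
            });
        }

        cfExecutor.shutdown();
    }
}
{code}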
 


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Avinash Lakshman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688988#action_12688988 ] 

Avinash Lakshman commented on CASSANDRA-9:
------------------------------------------

Actually, I take that back: there is no CME unless iterators are involved. Nevertheless, the safest thing would be not to do the clear(), and I think everything will be good.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Avinash Lakshman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688919#action_12688919 ] 

Avinash Lakshman commented on CASSANDRA-9:
------------------------------------------

Hmm. Should that matter? Gets do not modify the collection. I was under the impression that CME occurs when one thread tries to modify a collection while another is iterating over it. I will continue to look. Of course, the whole apartment concept was introduced to eliminate locks.
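That is the ConcurrentModificationException contract: plain get() calls never throw it, but iterating while another thread mutates the map usually does. A small self-contained demo (nothing Cassandra-specific):

{code}
import java.util.HashMap;
import java.util.Map;

public class CmeDemo
{
    public static void main(String[] args) throws Exception
    {
        final Map<Integer, Integer> map = new HashMap<Integer, Integer>();
        for (int i = 0; i < 100000; i++)
            map.put(i, i);

        // Writer thread keeps modifying the map.
        Thread writer = new Thread() {
            public void run() {
                for (int i = 0; i < 100000; i++)
                    map.remove(i);
            }
        };
        writer.start();

        // Iterating while the writer runs usually throws
        // ConcurrentModificationException; map.get() alone never does,
        // although it can still return stale or inconsistent results.
        try {
            long sum = 0;
            for (Integer v : map.values())
                sum += v;
            System.out.println("sum=" + sum);
        } catch (java.util.ConcurrentModificationException e) {
            System.out.println("CME: " + e);
        }
        writer.join();
    }
}
{code}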


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697851#action_12697851 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

It's a side benefit of the change, not a motivation.

Certainly, testing a flush and making sure the resulting sstable has the same data the memtable did is a good test to have.
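As a rough illustration of the kind of round-trip check meant here, with a plain map and a flat file standing in for the memtable and sstable (this is only an illustration, not a proposal for the actual test harness):

{code}
import java.io.*;
import java.util.*;

public class FlushRoundTripTest
{
    public static void main(String[] args) throws IOException
    {
        // Stand-ins: a sorted map for the memtable, a flat file for the sstable.
        Map<String, String> memtable = new TreeMap<String, String>();
        for (int i = 0; i < 1000; i++)
            memtable.put("key-" + i, "value-" + i);

        // "Flush" the memtable to disk.
        File sstable = File.createTempFile("flush-test", ".db");
        PrintWriter out = new PrintWriter(new FileWriter(sstable));
        for (Map.Entry<String, String> e : memtable.entrySet())
            out.println(e.getKey() + "\t" + e.getValue());
        out.close();

        // Read it back and verify nothing was lost in the flush.
        Map<String, String> onDisk = new TreeMap<String, String>();
        BufferedReader in = new BufferedReader(new FileReader(sstable));
        String line;
        while ((line = in.readLine()) != null) {
            String[] parts = line.split("\t", 2);
            onDisk.put(parts[0], parts[1]);
        }
        in.close();

        if (!onDisk.equals(memtable))
            throw new AssertionError("flush lost " + (memtable.size() - onDisk.size()) + " rows");
        System.out.println("flush round-trip ok: " + onDisk.size() + " rows");
    }
}
{code}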


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689654#action_12689654 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

Wrong.  The Flusher that goes on the Memtable executor is just a stub that kicks off the real flush in the MemtableManager's executor.

So when you have a Getter queued after that flusher, which can happen as I described above, the getter can get a CME while it is iterating through the columns at the same time that the real flush calls cf.clear().
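A self-contained toy that reproduces that shape, assuming two single-threaded executors standing in for the memtable apartment and the MemtableManager's flusher (all names here are made up):

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class StubFlusherRaceSketch
{
    static final Map<String, String> columns = new HashMap<String, String>();
    static final ExecutorService apartment = Executors.newSingleThreadExecutor();
    static final ExecutorService flushExecutor = Executors.newSingleThreadExecutor();

    public static void main(String[] args)
    {
        for (int i = 0; i < 200000; i++)
            columns.put("Column-" + i, "v");

        // The "flusher" on the apartment is only a stub: it hands the real
        // work (including clear()) to the flush executor and returns immediately.
        apartment.submit(new Runnable() {
            public void run() {
                flushExecutor.submit(new Runnable() {
                    public void run() { columns.clear(); }
                });
            }
        });

        // A getter queued right behind the stub now iterates the columns
        // while the real flush may be clearing them on the other thread.
        apartment.submit(new Runnable() {
            public void run() {
                try {
                    int n = 0;
                    for (String name : columns.keySet())
                        n++;
                    System.out.println("read " + n + " columns");
                } catch (java.util.ConcurrentModificationException e) {
                    System.out.println("CME while reading: " + e);
                }
            }
        });

        apartment.shutdown();
        flushExecutor.shutdown();
    }
}
{code}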


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688138#action_12688138 ] 

Jun Rao commented on CASSANDRA-9:
---------------------------------

Neo,

Is the problem specific to super CFs, or does it show up in regular CFs too? Also, have you tried flushing smaller amounts of data to disk? In Cassandra, if you insert a row with key "FlushKey", it forces a flush on the CF referenced in the insertion.

Jun
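From the reproduce program quoted in this issue, forcing such a flush would look roughly like the following (untested sketch, using the same RowMutation calls as the test program):

{code}
// Untested sketch based on the reproduce program's API usage: if the
// "FlushKey" behaviour applies to this revision, this insert should force
// a flush of the CF referenced in the mutation.
RowMutation flushTrigger = new RowMutation("MyTable", "FlushKey");
flushTrigger.add("MySuper:SuperColumn-0:Column-0", new byte[8], 0);
flushTrigger.apply();
{code}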


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-9:
-----------------------------------

    Fix Version/s: 0.3


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Avinash Lakshman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688898#action_12688898 ] 

Avinash Lakshman commented on CASSANDRA-9:
------------------------------------------

The problem was identified by Jonathan Ellis. I have a fix checked in that requires a change only in the Memtable class. Neophytos has verified that my change actually works, but the credit goes to Jonathan for identifying the problem, which was the harder part of this whole exercise. I am deeming this case closed.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688410#action_12688410 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

This is your old friend the ConcurrentModificationException, Neophytos. Only the ThreadPoolExecutor is eating the exception. Took me hours to figure out where the hell the exception was disappearing to... Here's a patch that exposes it. Not sure what the fix for the CME is yet, but at least it's out in the open and reproducible. :)
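The usual way to surface exceptions that a ThreadPoolExecutor swallows is to override afterExecute and unwrap the Future; the sketch below shows that pattern in isolation (only an illustration, not necessarily what the attached patch does):

{code}
import java.util.concurrent.*;

// Illustrative only -- not the attached executor.patch. A subclass that logs
// exceptions which submitted tasks would otherwise swallow silently.
public class LoggingThreadPoolExecutor extends ThreadPoolExecutor
{
    public LoggingThreadPoolExecutor(int threads)
    {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t)
    {
        super.afterExecute(r, t);
        if (t == null && r instanceof Future<?>)
        {
            // Tasks submitted via submit() wrap their exception in the Future.
            try { ((Future<?>) r).get(); }
            catch (CancellationException e) { t = e; }
            catch (ExecutionException e) { t = e.getCause(); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        if (t != null)
            t.printStackTrace();   // e.g. the hidden ConcurrentModificationException
    }
}
{code}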


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Neophytos Demetriou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688456#action_12688456 ] 

Neophytos Demetriou commented on CASSANDRA-9:
---------------------------------------------

Confirmed. Thank you, Jonathan.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Neophytos Demetriou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688142#action_12688142 ] 

Neophytos Demetriou commented on CASSANDRA-9:
---------------------------------------------

Hi Jun, I've done most of the tests using super CFs. Having said that, I just tried it with a name-sorted regular CF and the outcome seems to be the same (zero-sized data files when no throttle is used). Please don't take my word on regular CFs, though; try it out. One of the reasons it took me so long to report this in public was that I was not sure whether (a) it was a problem specific to the hardware I'm using or (b) a misuse of Cassandra's constructs.

For the case of super CFs, I did extensive testing with the code.google codebase, and I did try lowering the thresholds (i.e. threshold_ and thresholdCount_ in Memtable.java). The result was that it would write some of the files fine while others were zero-sized. I also tried forcing flushes (FlushKey); that did not help.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Avinash Lakshman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688722#action_12688722 ] 

Avinash Lakshman commented on CASSANDRA-9:
------------------------------------------

I am going to look at this once I get into work. I will apply/fix this today.



[jira] Resolved: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-9.
------------------------------------

    Resolution: Fixed

committed



[jira] Updated: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Neophytos Demetriou (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neophytos Demetriou updated CASSANDRA-9:
----------------------------------------

    Attachment: shutdown-before-flush-against-trunk.patch

Just a quick diff against the code in trunk, for your convenience. Please verify it against Jonathan's patch before committing.



[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Avinash Lakshman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697834#action_12697834 ] 

Avinash Lakshman commented on CASSANDRA-9:
------------------------------------------

I am just confused about one thing here. Why is there a chance of a CME (ConcurrentModificationException) on a get? As far as I know, a CME occurs when one thread is iterating a collection (using an iterator) and another tries to modify it. That is not something that can happen here on a get, I think. If that is the case, there is no need for this change. The hash function / cf name lookup is a non-issue here.
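
For reference, a minimal sketch of the situation being described: a CME can be raised when one thread iterates a collection while another structurally modifies it, whereas a plain get() does not iterate. The class and key names below are illustrative only, not taken from the Cassandra code:

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative only: iteration racing with a structural modification may
    // throw ConcurrentModificationException; fail-fast behavior is best-effort,
    // so the exception is not guaranteed on every run.
    public class CmeSketch
    {
        public static void main(String[] args) throws InterruptedException
        {
            final Map<String, Integer> map = new HashMap<String, Integer>();
            for (int i = 0; i < 100000; i++)
                map.put("key-" + i, i);

            // Writer thread performs a structural modification.
            Thread writer = new Thread(new Runnable()
            {
                public void run()
                {
                    map.put("new-key", -1);
                }
            });
            writer.start();

            long sum = 0;
            for (int v : map.values())   // iterating here is what can fail, not a get()
                sum += v;

            writer.join();
            System.out.println("sum=" + sum);
        }
    }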



[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Sandeep Tata (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688956#action_12688956 ] 

Sandeep Tata commented on CASSANDRA-9:
--------------------------------------

I think that, the way the code stands today, you won't enqueue Getters on the table after you enqueue a Flusher. But I don't see how simply adding a Flusher to the apartment's DebuggableThreadPool (instead of running the flusher in a separate thread) guarantees that there are no concurrent Putters/Getters still in the threadpool. Am I missing something?

I agree that running the flush in terminated(), by overriding the method, is the cleanest approach. You don't have to rely on the rest of the code (today) happening to be such that you won't end up queuing a Getter after a Flusher (I'm guessing this is what Jonathan meant earlier by "cheat" a bit). That guarantee is precisely the reason ThreadPoolExecutor provides this hook.
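
A minimal sketch of the terminated() approach, assuming the flush is handed to the executor as a plain Runnable; the class and field names below are illustrative, not the apartment executor from the patches:

    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    // Illustrative only: terminated() is invoked by the JDK exactly once, after
    // shutdown() has been called and every queued task (Putter/Getter) has
    // completed, so the flush cannot interleave with work already on the queue.
    public class FlushOnTerminateExecutor extends ThreadPoolExecutor
    {
        private final Runnable flushMemtable_;

        public FlushOnTerminateExecutor(Runnable flushMemtable)
        {
            super(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
            flushMemtable_ = flushMemtable;
        }

        @Override
        protected void terminated()
        {
            super.terminated();
            flushMemtable_.run();
        }
    }

Presumably the table would switch reads and writes over to the new memtable's executor first and then call shutdown() on the old one; anything submitted to the old executor after that point is rejected rather than silently ordered behind the flush.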







[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689668#action_12689668 ] 

Jun Rao commented on CASSANDRA-9:
---------------------------------

OK. I see it now. The real work of Flush is done in a separate thread. Sorry for the false alarm.



[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Avinash Lakshman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688937#action_12688937 ] 

Avinash Lakshman commented on CASSANDRA-9:
------------------------------------------

I am not sure there can ever be a Getter on the queue after a Flusher has been enqueued. Once you are in the state where a Flusher() has been enqueued, there can be no Getter() for the same Memtable. Anyway, I will look into it again tonight.



[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688952#action_12688952 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

> Not sure if there can ever be a Getter on the queue after a Flusher has been enqueued. Once you are in the state where a Flusher() has been enqueued there can be no Getter() for the same Memtable.

That is how easy it is to be fooled by these things -- that is what we want, but we are not enforcing it.

In particular, note that the line

    cf = apartments_.get(cfName_).submit(call).get();

is not atomic.

The GET thread can execute apartments_.get(cfName_),

then the PUT thread gets the CPU (or executes concurrently on another core), switches the memtable, and queues a Flusher.

The GET thread then gets the CPU back and calls submit. The Getter is now on the queue after the Flusher.
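
Here is a minimal, self-contained sketch of that interleaving; the apartments_ map and the getter/flusher tasks below are stand-ins for the discussion above, not the actual Cassandra code:

    import java.util.Map;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Illustrative only: the read path is two separate steps, so a memtable
    // switch can slip in between them.
    public class LookupThenSubmit
    {
        private final Map<String, ExecutorService> apartments_ =
                new ConcurrentHashMap<String, ExecutorService>();

        public LookupThenSubmit(String cfName)
        {
            apartments_.put(cfName, Executors.newSingleThreadExecutor());
        }

        // GET thread
        public String get(String cfName, Callable<String> getter) throws Exception
        {
            ExecutorService apartment = apartments_.get(cfName); // step 1: look up the executor
            // <-- the PUT thread can run switchMemtable(...) right here
            return apartment.submit(getter).get();               // step 2: the Getter now queues after the Flusher
        }

        // PUT thread, on threshold violation: enqueue the flush on the same executor
        public void switchMemtable(String cfName, Runnable flusher)
        {
            apartments_.get(cfName).submit(flusher);
        }
    }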



[jira] Commented: (CASSANDRA-9) Cassandra silently loses data when a single row gets large (under "heavy load")

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689655#action_12689655 ] 

Jonathan Ellis commented on CASSANDRA-9:
----------------------------------------

(I was writing my comment at the same time as Avinash, so my "Wrong" was referring to Jun's assertion that "the flushing of an old memtable will not run concurrently with the updates on a new memtable".)
