Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2014/08/25 09:26:01 UTC

[jira] [Updated] (CASSANDRA-7360) CQLSSTableWriter consumes all memory for table with compound primary key

     [ https://issues.apache.org/jira/browse/CASSANDRA-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-7360:
----------------------------------------

    Attachment: 7360.txt

Attaching a patch for this. I first modified {{SSTableSimpleUnsortedWriter}} so that it checks whether it needs to "sync" after each column, only to realize that {{CQLSSTableWriter}} was bypassing this and adding cells directly to the CF that {{SSTableSimpleUnsortedWriter}} exposes. So {{CQLSSTableWriter}} now uses a modified CF object that calls back the proper method after adding each cell: not the cleanest thing ever, but simple enough to fix this.
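
For readers without the patch at hand, here is a rough, self-contained sketch of the callback idea. It is not the code in 7360.txt: {{CellBuffer}}, {{BufferingWriter}}, {{currentBuffer()}} and {{maybeSync()}} are hypothetical stand-ins for the CF object and the writer, and the "sync" only resets a byte counter instead of writing an sstable. The caller adds cells directly to the exposed container, as {{CQLSSTableWriter}} did, so the size check only happens when the container itself calls back into the writer.

```java
public class SyncCallbackSketch {

    /** Hypothetical cell container, standing in for the CF object the writer exposes. */
    static class CellBuffer {
        long bytes = 0;      // bytes currently held in memory
        long peakBytes = 0;  // largest value 'bytes' has ever reached

        void add(byte[] cell) {
            bytes += cell.length;
            peakBytes = Math.max(peakBytes, bytes);
        }

        void clear() {
            bytes = 0;
        }
    }

    /** Hypothetical buffering writer; not Cassandra's SSTableSimpleUnsortedWriter. */
    static class BufferingWriter {
        private final long thresholdBytes;
        private final CellBuffer buffer;
        int syncCount = 0;

        BufferingWriter(long thresholdBytes, boolean callBackAfterEachCell) {
            this.thresholdBytes = thresholdBytes;
            // With the callback enabled, the container notifies the writer after every add,
            // so callers that add cells directly can no longer skip the size check.
            this.buffer = callBackAfterEachCell
                ? new CellBuffer() {
                      @Override
                      void add(byte[] cell) {
                          super.add(cell);
                          maybeSync();
                      }
                  }
                : new CellBuffer();
        }

        /** Exposes the container, which is what lets callers bypass the writer. */
        CellBuffer currentBuffer() {
            return buffer;
        }

        /** "Syncs" once the threshold is reached; a real writer would flush to disk here. */
        void maybeSync() {
            if (buffer.bytes >= thresholdBytes) {
                buffer.clear();
                syncCount++;
            }
        }
    }

    public static void main(String[] args) {
        byte[] cell = new byte[16];
        BufferingWriter plain = new BufferingWriter(1024, false);
        BufferingWriter withCallback = new BufferingWriter(1024, true);

        // Same partition throughout: nothing but the container ever sees the cells.
        for (int i = 0; i < 10_000; i++) {
            plain.currentBuffer().add(cell);
            withCallback.currentBuffer().add(cell);
        }

        System.out.println("plain:    peak=" + plain.currentBuffer().peakBytes
                + " syncs=" + plain.syncCount);
        System.out.println("callback: peak=" + withCallback.currentBuffer().peakBytes
                + " syncs=" + withCallback.syncCount);
    }
}
```

In this toy version the plain container grows without bound for a single partition, while the one with the callback never holds more than the threshold.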

> CQLSSTableWriter consumes all memory for table with compound primary key
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7360
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7360
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Xu Zhongxing
>            Assignee: Sylvain Lebresne
>         Attachments: 7360.txt
>
>
> When using CQLSSTableWriter to write a table with a compound primary key, if the partition key is identical for a huge number of records, the sync() method is never called, and memory usage keeps growing until memory is exhausted.
> Could the code be improved to do sync() even when no new row is created? The relevant code is in SSTableSimpleUnsortedWriter.java and AbstractSSTableSimpleWriter.java. I am new to the code and cannot produce a reasonable patch for now.
> The problem can be reproduced by the following test case:
> {code}
> import org.apache.cassandra.io.sstable.CQLSSTableWriter;
> import org.apache.cassandra.exceptions.InvalidRequestException;
> import java.io.IOException;
> import java.util.UUID;
> class SS {
>     public static void main(String[] args) {
>         String schema = "create table test.t (x uuid, y uuid, primary key (x, y))";
>         String insert = "insert into test.t (x, y) values (?, ?)";
>         CQLSSTableWriter writer = CQLSSTableWriter.builder()
>             .inDirectory("/tmp/test/t")
>             .forTable(schema).withBufferSizeInMB(32)
>             .using(insert).build();
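>         // Reuse one partition key for every row, so no new partition is ever started.
>         UUID id = UUID.randomUUID();
>         try {
>             for (int i = 0; i < 50000000; i++) {
>                 // Only the clustering column changes.
>                 UUID id2 = UUID.randomUUID();
>                 writer.addRow(id, id2);
>             }
>             writer.close();
>         } catch (Exception e) {
>             // Report the failure instead of hiding it.
>             e.printStackTrace();
>         }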
>     }
> }
> {code}


