Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2014/08/25 09:26:01 UTC
[jira] [Updated] (CASSANDRA-7360) CQLSSTableWriter consumes all memory for table with compound primary key
[ https://issues.apache.org/jira/browse/CASSANDRA-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne updated CASSANDRA-7360:
----------------------------------------
Attachment: 7360.txt
Attaching a patch for this. I first modified {{SSTableSimpleUnsortedWriter}} so that it checks whether it needs to "sync" after each column, only to realize that {{CQLSSTableWriter}} was bypassing this and adding cells directly to the CF that {{SSTableSimpleUnsortedWriter}} exposes. So {{CQLSSTableWriter}} now uses a modified CF object that calls back the proper method after adding each cell: not the cleanest thing ever, but simple enough to fix this.
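For readers without the patch at hand, the callback idea can be sketched roughly as follows. This is only an illustration of the pattern, not the contents of 7360.txt: {{PerCellSyncSketch}}, {{NotifyingBuffer}}, {{BufferingWriter}} and {{CellAddedListener}} are hypothetical names, the buffer is a plain list of byte arrays rather than a real CF object, and "sync" here just clears the buffer instead of writing an sstable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these classes exist in Cassandra.
class PerCellSyncSketch {

    /** Callback invoked after every single cell is added to the buffer. */
    interface CellAddedListener {
        void onCellAdded(int cellSizeInBytes);
    }

    /** Stand-in for the exposed CF object: it notifies a listener on each add. */
    static class NotifyingBuffer {
        private final List<byte[]> cells = new ArrayList<>();
        private final CellAddedListener listener;

        NotifyingBuffer(CellAddedListener listener) {
            this.listener = listener;
        }

        void addCell(byte[] cell) {
            cells.add(cell);
            // Callers that add cells directly still trigger the size check.
            listener.onCellAdded(cell.length);
        }

        void clear() {
            cells.clear();
        }
    }

    /** Stand-in for the unsorted writer: decides when to "sync". */
    static class BufferingWriter {
        private final long bufferLimitInBytes;
        private final NotifyingBuffer buffer;
        private long bufferedBytes = 0;
        private int syncCount = 0;

        BufferingWriter(long bufferLimitInBytes) {
            this.bufferLimitInBytes = bufferLimitInBytes;
            this.buffer = new NotifyingBuffer(this::afterCell);
        }

        /** Exposed so a caller can add cells directly, as in the scenario above. */
        NotifyingBuffer buffer() {
            return buffer;
        }

        // Checked per cell, not per partition, so one huge partition cannot grow unbounded.
        private void afterCell(int cellSizeInBytes) {
            bufferedBytes += cellSizeInBytes;
            if (bufferedBytes >= bufferLimitInBytes) {
                sync();
            }
        }

        private void sync() {
            // A real writer would flush the buffered data to disk here.
            buffer.clear();
            bufferedBytes = 0;
            syncCount++;
        }

        int syncCount() {
            return syncCount;
        }

        long bufferedBytes() {
            return bufferedBytes;
        }
    }

    public static void main(String[] args) {
        BufferingWriter writer = new BufferingWriter(64);
        // All cells go to the same logical partition; the buffer is still flushed periodically.
        for (int i = 0; i < 100; i++) {
            writer.buffer().addCell(new byte[16]);
        }
        System.out.println("syncs=" + writer.syncCount() + " buffered=" + writer.bufferedBytes());
    }
}
```

The point of routing the check through the buffer itself is that it no longer matters which code path adds the cell, which is why a caller that bypasses the writer's own add methods still gets bounded memory use.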
> CQLSSTableWriter consumes all memory for table with compound primary key
> ------------------------------------------------------------------------
>
> Key: CASSANDRA-7360
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7360
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Xu Zhongxing
> Assignee: Sylvain Lebresne
> Attachments: 7360.txt
>
>
> When using CQLSSTableWriter to write a table with a compound primary key, if the partition key is identical for a huge number of records, the sync() method is never called, and memory usage keeps growing until memory is exhausted.
> Could the code be improved to call sync() even when no new row is created? The relevant code is in SSTableSimpleUnsortedWriter.java and AbstractSSTableSimpleWriter.java. I am new to the code and cannot produce a reasonable patch for now.
> The problem can be reproduced by the following test case:
> {code}
> import org.apache.cassandra.io.sstable.CQLSSTableWriter;
> import org.apache.cassandra.exceptions.InvalidRequestException;
> import java.io.IOException;
> import java.util.UUID;
>
> class SS {
>     public static void main(String[] args) {
>         String schema = "create table test.t (x uuid, y uuid, primary key (x, y))";
>         String insert = "insert into test.t (x, y) values (?, ?)";
>         CQLSSTableWriter writer = CQLSSTableWriter.builder()
>             .inDirectory("/tmp/test/t")
>             .forTable(schema).withBufferSizeInMB(32)
>             .using(insert).build();
>         // One partition key shared by every row, so no new partition is ever started.
>         UUID id = UUID.randomUUID();
>         try {
>             for (int i = 0; i < 50000000; i++) {
>                 // Only the clustering column changes from row to row.
>                 UUID id2 = UUID.randomUUID();
>                 writer.addRow(id, id2);
>             }
>             writer.close();
>         } catch (Exception e) {
>             // Report the actual failure instead of discarding it.
>             e.printStackTrace();
>         }
>     }
> }
> {code}
--
This message was sent by Atlassian JIRA
(v6.2#6252)