Posted to commits@cassandra.apache.org by "Jacek Furmankiewicz (JIRA)" <ji...@apache.org> on 2014/09/24 18:14:33 UTC

[jira] [Commented] (CASSANDRA-7303) OutOfMemoryError during prolonged batch processing

    [ https://issues.apache.org/jira/browse/CASSANDRA-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146469#comment-14146469 ] 

Jacek Furmankiewicz commented on CASSANDRA-7303:
------------------------------------------------

I am really surprised. 

We can bring down an entire server with a single query, and it won't be fixed?

The worst part about this bug is that it is customer-specific.
On one installation, with somewhat less data, the exact same query works great.

On a different site, with just a bit more data, the entire server crashes.
In fact, in one case the entire cluster (3 nodes) crashed, because we were rebooting app servers at the same time.

I would expect a QueryTooLargeException or something similar, not for the entire server/cluster to crash.
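
To illustrate, what I would expect is a guard roughly like the sketch below; the limit and the exception type are hypothetical names of mine, not anything Cassandra actually has:

    // Hypothetical guard: reject an oversized result up front instead of letting
    // the Thrift serialization buffer grow until the JVM throws OutOfMemoryError.
    public final class ResponseSizeGuard {
        // Assumed limit; a real server would make this configurable.
        private static final long MAX_RESPONSE_BYTES = 256L * 1024 * 1024;

        public static void check(long estimatedBytes) throws QueryTooLargeException {
            if (estimatedBytes > MAX_RESPONSE_BYTES) {
                throw new QueryTooLargeException("Estimated response of " + estimatedBytes
                        + " bytes exceeds the " + MAX_RESPONSE_BYTES + " byte limit");
            }
        }
    }

    // The client would get this exception back instead of the node crashing.
    class QueryTooLargeException extends Exception {
        QueryTooLargeException(String message) { super(message); }
    }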



> OutOfMemoryError during prolonged batch processing
> --------------------------------------------------
>
>                 Key: CASSANDRA-7303
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7303
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Server: RedHat 6, 64-bit, Oracle JDK 7, Cassandra 2.0.6
> Client: Java 7, Astyanax
>            Reporter: Jacek Furmankiewicz
>              Labels: crash, outofmemory, qa-resolved
>
> We have a prolonged batch processing job.
> It writes a lot of records; each batch mutation creates on average 300-500 columns per row key (across many disparate row keys), roughly as in the sketch below.
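>
> For context, a minimal Astyanax sketch of this write pattern (the column family and names here are hypothetical, not our actual schema):
>
>     import com.netflix.astyanax.Keyspace;
>     import com.netflix.astyanax.MutationBatch;
>     import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
>     import com.netflix.astyanax.model.ColumnFamily;
>     import com.netflix.astyanax.serializers.StringSerializer;
>
>     public class BatchWriter {
>         // Hypothetical column family; the real schema is not part of this report.
>         private static final ColumnFamily<String, String> CF_EVENTS =
>             ColumnFamily.newColumnFamily("events", StringSerializer.get(), StringSerializer.get());
>
>         // Writes one batch of several hundred columns under a single row key.
>         static void writeBatch(Keyspace keyspace, String rowKey, int columns) throws ConnectionException {
>             MutationBatch batch = keyspace.prepareMutationBatch();
>             for (int i = 0; i < columns; i++) {
>                 batch.withRow(CF_EVENTS, rowKey).putColumn("col-" + i, "value-" + i);
>             }
>             batch.execute(); // one batch_mutate call over Thrift per batch
>         }
>     }
>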
> It works fine, but within a few hours we get an error like this:
> ERROR [Thrift:15] 2014-05-24 14:16:20,192 CassandraDaemon.java (line 196) Exception in thread Thread[Thrift:15,5,main]
> java.lang.OutOfMemoryError: Requested array size exceeds VM limit
>     at java.util.Arrays.copyOf(Arrays.java:2271)
>     at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
>     at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>     at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
>     at org.apache.thrift.transport.TFramedTransport.write(TFramedTransport.java:146)
>     at org.apache.thrift.protocol.TBinaryProtocol.writeBinary(TBinaryProtocol.java:183)
>     at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:678)
>     at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:611)
>     at org.apache.cassandra.thrift.Column.write(Column.java:538)
>     at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:673)
>     at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:607)
>     at org.apache.cassandra.thrift.ColumnOrSuperColumn.write(ColumnOrSuperColumn.java:517)
>     at org.apache.cassandra.thrift.Cassandra$get_slice_result$get_slice_resultStandardScheme.write(Cassandra.java:11682)
>     at org.apache.cassandra.thrift.Cassandra$get_slice_result$get_slice_resultStandardScheme.write(Cassandra.java:11603)
>     at org.apache.cassandra.thrift.Cassandra
> The server already has a 16 GB heap, which we hear is the maximum Cassandra can run with. The writes are heavily multi-threaded from a single server.
> The gist of the issue is that Cassandra should not crash with an OOM when under heavy load. It is OK for it to slow down, or even to start throwing operation timeout exceptions, etc.
> But it should not be allowed to just crash in the middle of processing.
> Is there any internal monitoring of heap usage in Cassandra that could detect that the heap is close to its limit and start throttling incoming requests to avoid this type of error?
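>
> To illustrate what I mean, a rough sketch using the standard JMX MemoryMXBean; the threshold and the idea of a throttle hook are my assumptions, not anything Cassandra currently provides:
>
>     import java.lang.management.ManagementFactory;
>     import java.lang.management.MemoryMXBean;
>     import java.lang.management.MemoryUsage;
>
>     public final class HeapPressureMonitor {
>         // Assumed threshold: treat >85% of max heap as "under pressure".
>         private static final double PRESSURE_THRESHOLD = 0.85;
>
>         private final MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
>
>         // True when used heap exceeds the threshold fraction of the max heap.
>         public boolean underPressure() {
>             MemoryUsage heap = memory.getHeapMemoryUsage();
>             return (double) heap.getUsed() / heap.getMax() > PRESSURE_THRESHOLD;
>         }
>     }
>
> A server could consult a check like this before accepting each request and delay or shed load, rather than letting one huge response allocation exceed the VM limit.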
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)