Posted to user@cassandra.apache.org by Spacejatsi <sp...@gmail.com> on 2010/04/28 15:35:00 UTC

Inserting files to Cassandra timeouts

Hi all,

I'm trying to run a scenario of adding files from a specific folder to Cassandra. I currently have 64 files (about 15-20 MB per file), roughly 1 GB of data in total.
I'm able to insert around 40 files, but after that Cassandra gets stuck in some GC loop and the client eventually gets a timeout.
It does not go OOM; it just jams.

Here are the last entries in the log file:
 INFO [GC inspection] 2010-04-28 10:07:55,297 GCInspector.java (line 110) GC for ParNew: 232 ms, 25731128 reclaimed leaving 553241120 used; max is 4108386304
 INFO [GC inspection] 2010-04-28 10:09:02,331 GCInspector.java (line 110) GC for ParNew: 2844 ms, 238909856 reclaimed leaving 1435582832 used; max is 4108386304
 INFO [GC inspection] 2010-04-28 10:09:49,421 GCInspector.java (line 110) GC for ParNew: 30666 ms, 11185824 reclaimed leaving 1679795336 used; max is 4108386304
 INFO [GC inspection] 2010-04-28 10:11:18,090 GCInspector.java (line 110) GC for ParNew: 895 ms, 17921680 reclaimed leaving 1589308456 used; max is 4108386304



I think I must have something wrong in my configuration or in how I use Cassandra, because people here are inserting 10 times more data and it works.

The column family I'm using:
<ColumnFamily CompareWith="BytesType" Name="Standard1"/>
Basically I insert with the folder name as the row key, the file name as the column name, and the file content as the value.
I tried with Hector (mainly) and directly with Thrift (insert and batch_mutate).
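For reference, a simplified sketch of the insert loop described above (the class and helper names are illustrative, not my actual code; the commented-out Thrift call assumes the 0.6 client API with Keyspace1/Standard1):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class FolderInsert {
    // Row key = folder name, column name = file name (illustrative helpers)
    static String keyFor(File f) { return f.getParentFile().getName(); }
    static String columnFor(File f) { return f.getName(); }

    // Read the whole file into a byte[] to use as the column value
    static byte[] readFully(File f) throws IOException {
        byte[] buf = new byte[(int) f.length()];
        FileInputStream in = new FileInputStream(f);
        try {
            int off = 0;
            while (off < buf.length) {
                int n = in.read(buf, off, buf.length - off);
                if (n < 0) throw new IOException("unexpected EOF: " + f);
                off += n;
            }
        } finally {
            in.close();
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        File folder = new File(args[0]);
        for (File f : folder.listFiles()) {
            byte[] value = readFully(f);
            // With a raw 0.6 Thrift client this is roughly:
            // client.insert("Keyspace1", keyFor(f),
            //     new ColumnPath("Standard1").setColumn(columnFor(f).getBytes("UTF-8")),
            //     value, System.currentTimeMillis(), ConsistencyLevel.ONE);
            System.out.println(keyFor(f) + "/" + columnFor(f) + ": " + value.length + " bytes");
        }
    }
}
```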

In my case the data does not need to be readable immediately after insert, but I don't know if that helps in any way.


My environment:
mac and/or linux, tested in both
java 1.6.0_17 
Cassandra 0.6.1



  <RpcTimeoutInMillis>60000</RpcTimeoutInMillis>
  <CommitLogRotationThresholdInMB>32</CommitLogRotationThresholdInMB>
  <RowWarningThresholdInMB>512</RowWarningThresholdInMB>
  <SlicedBufferSizeInKB>32</SlicedBufferSizeInKB>
  <FlushDataBufferSizeInMB>32</FlushDataBufferSizeInMB>
  <FlushIndexBufferSizeInMB>8</FlushIndexBufferSizeInMB>
  <ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>
  <MemtableThroughputInMB>64</MemtableThroughputInMB>
  <BinaryMemtableThroughputInMB>256</BinaryMemtableThroughputInMB>
  <MemtableOperationsInMillions>0.1</MemtableOperationsInMillions>
  <MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>
  <ConcurrentReads>8</ConcurrentReads>
  <ConcurrentWrites>32</ConcurrentWrites>
  <CommitLogSync>batch</CommitLogSync>
  <!-- CommitLogSyncPeriodInMS>10000</CommitLogSyncPeriodInMS -->
  <CommitLogSyncBatchWindowInMS>1.0</CommitLogSyncBatchWindowInMS>
  <GCGraceSeconds>500</GCGraceSeconds>

JVM_OPTS=" \
        -server \
        -Xms3G \
        -Xmx3G \
        -XX:PermSize=512m \
        -XX:MaxPermSize=800m \
        -XX:MaxNewSize=256m \
        -XX:NewSize=128m \
        -XX:TargetSurvivorRatio=90 \
        -XX:+AggressiveOpts \
        -XX:+UseParNewGC \
        -XX:+UseConcMarkSweepGC \
        -XX:+CMSParallelRemarkEnabled \
        -XX:+HeapDumpOnOutOfMemoryError \
        -XX:SurvivorRatio=128 \
        -XX:MaxTenuringThreshold=0 \
        -XX:+DisableExplicitGC \
        -Dcom.sun.management.jmxremote.port=8080 \
        -Dcom.sun.management.jmxremote.ssl=false \
        -Dcom.sun.management.jmxremote.authenticate=false"