Posted to commits@cassandra.apache.org by "Himansh verma (JIRA)" <ji...@apache.org> on 2016/05/16 14:11:15 UTC

[jira] [Commented] (CASSANDRA-9923) stress write and counter_write hangs

    [ https://issues.apache.org/jira/browse/CASSANDRA-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284625#comment-15284625 ] 

Himansh verma commented on CASSANDRA-9923:
------------------------------------------

I am getting the following exception while running load against my table.
It appears only when the thread count exceeds 20; with fewer threads the run works fine.
I also tried increasing the heap size by setting -Xmx3G in the cassandra-stress shell script.

	  
java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck. Check GC/Heap size
        at org.apache.cassandra.stress.util.Timing.snap(Timing.java:98)
        at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:157)
        at org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:38)
        at org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:105)
        at java.lang.Thread.run(Unknown Source)

		
I am using the following command to run the stress test:

cassandra-stress user profile=myTable.yaml ops\(insert=5, get1=500 , get2=500 , update1=50 , update2=50 , delete1=5\) cl=QUORUM no-warmup duration=7200s -rate threads=24 -log file=myTableReport.txt -node xxx.xxx.xxx.xxx -mode thrift user=xxx password=xxx
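
To make the workload in the command above easier to reason about, here is a minimal Python sketch that turns the ops(...) values into percentage shares. It assumes the values act as relative weights between the named operations; the helper name op_mix is hypothetical and not part of cassandra-stress.

```python
import re

# The ops(...) argument copied from the command above.
OPS_SPEC = "insert=5, get1=500 , get2=500 , update1=50 , update2=50 , delete1=5"


def op_mix(spec):
    """Return each operation's share of the total, assuming relative weights."""
    weights = {name: int(value)
               for name, value in re.findall(r"(\w+)\s*=\s*(\d+)", spec)}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


for name, share in op_mix(OPS_SPEC).items():
    print(f"{name:8s} {share:7.2%}")
```

Under that assumption the weights sum to 1110, so the two reads (get1 and get2) make up roughly 90% of the operations and inserts and deletes well under 1% each.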

Here is the YAML profile (myTable.yaml):
-----------------------
keyspace: loadtest

# The CQL for creating a keyspace (optional if it already exists)
keyspace_definition: |
  CREATE KEYSPACE  loadtest WITH replication = {'class':'NetworkTopologyStrategy','load':3};

# Table name
table: myTable

# The CQL for creating a table you wish to stress (optional if it already exists)
table_definition: |
  CREATE TABLE myTable (
    col1 text PRIMARY KEY,
    col2 text,
    col3 text,
    col4 text,
    col5 text
  ) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class':'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression':'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

columnspec:
  - name: col1            
    size: gaussian(36..36)
    population: uniform(1..10M)  

  - name: col2
    size: gaussian(1..320)       

  - name: col3
    size: gaussian(1..320)         

  - name: col4
    size: gaussian(1..15)       

  - name: col5
    size: gaussian(1..15)

   
### Batch Ratio Distribution Specifications ###

insert:
  partitions: fixed(100)            
  batchtype: UNLOGGED             # Unlogged batches


#
# A list of queries you wish to run against the schema
#
queries:

   get1: 
      cql: select * from myTable where col1 = ? 
      fields: samerow

   get2:
      cql: select col1, col2, col3 from myTable where col1 = ? 
      fields: samerow

   update1:
      cql: update myTable set col3 = ?  where col1 = ?
      fields: samerow

   update2:
      cql: update myTable set col2 = ? , col3 = ? , col4 = ? , col5 = ? where col1 = ?
      fields: samerow

   delete1:
      cql: delete from myTable where col1 = ?
      fields: samerow

----------------------------------------

Currently I'm running it on a 2 vCPU machine.
Is that OK, or how many vCPUs are required?



> stress write and counter_write hangs
> ------------------------------------
>
>                 Key: CASSANDRA-9923
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9923
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Robert Stupp
>            Assignee: T Jake Luciani
>            Priority: Minor
>              Labels: stress
>
> (Sorry for the vague description)
> I tried some cstar tests against counter columns, but all of these tests against 2.1 and 2.2 hung during the run with the following output:
> {noformat}
> Created keyspaces. Sleeping 3s for propagation.
> Sleeping 2s...
> Warming up COUNTER_WRITE with 150000 iterations...
> Running COUNTER_WRITE with 300 threads for 15000000 iteration
> type,      total ops,    op/s,    pk/s,   row/s,    mean,     med,     .95,     .99,    .999,     max,   time,   stderr, errors,  gc: #,  max ms,  sum ms,  sdv ms,      mb
> total,         98073,   98054,   98054,   98054,     3.1,     1.7,     8.9,    23.2,    89.9,   107.7,    1.0,  0.00000,      0,      0,       0,       0,       0,       0
> total,        188586,   72492,   72492,   72492,     4.1,     1.5,    10.0,    61.4,   202.8,   214.7,    2.2,  0.13101,      0,      3,     564,     564,       6,    3447
> total,        363880,  137986,  137986,  137986,     2.2,     1.4,     4.1,     9.6,   207.1,   253.3,    3.5,  0.18684,      0,      0,       0,       0,       0,       0
> total,        460122,  105062,  105062,  105062,     2.8,     1.4,     4.6,    14.7,   225.6,   236.2,    4.4,  0.13969,      0,      1,     214,     214,       0,    1199
> total,        600625,  111453,  111453,  111453,     2.7,     1.4,     3.8,    10.4,   231.5,   241.6,    5.7,  0.11366,      0,      2,     442,     442,       1,    2389
> total,        745680,  149583,  149583,  149583,     2.0,     1.4,     3.6,     6.7,   155.8,   159.7,    6.7,  0.11318,      0,      0,       0,       0,       0,       0
> total,        828453,   63632,   63632,   63632,     4.7,     1.4,     4.8,   261.9,   274.5,   279.3,    8.0,  0.12645,      0,      3,     782,     782,       1,    3542
> total,       1009560,  172429,  172429,  172429,     1.7,     1.4,     3.7,     6.1,    16.2,    29.7,    9.0,  0.11629,      0,      0,       0,       0,       0,       0
> total,       1062409,   53860,   53860,   53860,     5.5,     1.3,     8.6,   270.3,   293.4,   324.3,   10.0,  0.12738,      0,      2,     542,     542,       7,    2354
> total,       1186672,   96540,   96540,   96540,     3.1,     1.5,     5.9,    14.5,   266.4,   277.6,   11.3,  0.11451,      0,      1,     260,     260,       0,    1183
> {noformat}
> ...
> {noformat}
> total,       4977251,     238,     238,     238,     0.7,     0.6,     0.7,     1.3,     3.4,   158.5,  352.3,  0.11749,      0,      0,       0,       0,       0,       0
> total,       4979839,     214,     214,     214,     0.6,     0.6,     0.7,     1.3,     2.5,     2.8,  364.4,  0.11761,      0,      0,       0,       0,       0,       0
> total,       4981729,     191,     191,     191,     0.6,     0.6,     0.7,     1.3,     3.2,     3.3,  374.3,  0.11774,      0,      0,       0,       0,       0,       0
> total,       4983362,     167,     167,     167,     0.8,     0.7,     1.8,     2.7,     3.9,     5.8,  384.0,  0.11787,      0,      0,       0,       0,       0,       0
> total,       4985171,     153,     153,     153,     0.7,     0.6,     1.2,     1.4,     2.0,     3.3,  395.9,  0.11799,      0,      0,       0,       0,       0,       0
> total,       4986684,     137,     137,     137,     0.7,     0.6,     0.8,     1.3,     2.0,     2.0,  406.9,  0.11812,      0,      0,       0,       0,       0,       0
> total,       4988410,     121,     121,     121,     0.7,     0.7,     0.8,     1.3,     2.0,     2.8,  421.1,  0.11824,      0,      0,       0,       0,       0,       0
> total,       4990216,      99,      99,      99,     0.7,     0.7,     0.8,     1.4,     2.6,     2.8,  439.5,  0.11836,      0,      0,       0,       0,       0,       0
> total,       4991765,      81,      81,      81,     0.8,     0.7,     0.8,     1.4,    30.1,    81.6,  458.7,  0.11848,      0,      1,     159,     159,       0,    1179
> total,       4993731,      67,      67,      67,     0.7,     0.7,     0.8,     1.4,     3.2,     3.2,  488.1,  0.11860,      0,      0,       0,       0,       0,       0
> total,       4996565,      45,      45,      45,     0.9,     0.7,     0.9,     1.5,    84.7,   218.3,  551.5,  0.11872,      0,      1,     248,     248,       0,    1180
> java.lang.RuntimeException: Timed out waiting for a timer thread - seems one got stuck. Check GC/Heap size
> 	at org.apache.cassandra.stress.util.Timing.snap(Timing.java:98)
> 	at org.apache.cassandra.stress.StressMetrics.update(StressMetrics.java:156)
> 	at org.apache.cassandra.stress.StressMetrics.access$300(StressMetrics.java:37)
> 	at org.apache.cassandra.stress.StressMetrics$2.run(StressMetrics.java:104)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}
> I don't know whether it's a stress or a counter issue - but wanted to record it for further investigation.
> [Console log file|http://cstar.datastax.com/tests/artifacts/6cd54260-35d6-11e5-a6ff-42010af0688f/console]
> cstar test steps:
> # counter_write n=15M -pop dist=gaussian\(1..5M\) -rate threads=300
> # counter_write n=15M -pop dist=gaussian\(1..500k\) -rate threads=300
> # counter_write n=15M -pop dist=gaussian\(1..500k\) -rate threads=50
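
The interval rows quoted above are easier to compare when a few columns are pulled out side by side. The following Python sketch parses "total," rows using the column names from the quoted header line. The function name parse_rows is hypothetical, and the sample holds three rows copied from the output above rather than the full log.

```python
# Column names taken from the header row of the quoted stress output.
COLUMNS = ["type", "total ops", "op/s", "pk/s", "row/s", "mean", "med",
           ".95", ".99", ".999", "max", "time", "stderr", "errors",
           "gc: #", "max ms", "sum ms", "sdv ms", "mb"]

# Three rows copied verbatim from the quoted output (first, middle, last).
SAMPLE = """\
> total,         98073,   98054,   98054,   98054,     3.1,     1.7,     8.9,    23.2,    89.9,   107.7,    1.0,  0.00000,      0,      0,       0,       0,       0,       0
> total,       4977251,     238,     238,     238,     0.7,     0.6,     0.7,     1.3,     3.4,   158.5,  352.3,  0.11749,      0,      0,       0,       0,       0,       0
> total,       4996565,      45,      45,      45,     0.9,     0.7,     0.9,     1.5,    84.7,   218.3,  551.5,  0.11872,      0,      1,     248,     248,       0,    1180
"""


def parse_rows(text):
    """Return one dict per 'total,' interval row; all other lines are skipped."""
    rows = []
    for line in text.splitlines():
        line = line.lstrip("> ").strip()  # drop e-mail quote markers
        if not line.startswith("total,"):
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != len(COLUMNS):
            continue  # not a complete interval row
        row = {"type": fields[0]}
        for name, value in zip(COLUMNS[1:], fields[1:]):
            row[name] = float(value)
        rows.append(row)
    return rows


rows = parse_rows(SAMPLE)
for row in rows:
    print(f"t={row['time']:7.1f}s  op/s={row['op/s']:8.0f}  "
          f"gc count={row['gc: #']:.0f}  gc max ms={row['max ms']:.0f}")
```

In the quoted rows, throughput drops from 98054 op/s at 1.0 s to 45 op/s at 551.5 s, while the GC columns stay at or near zero in the final rows. Viewing these columns together may help when judging whether the "Check GC/Heap size" hint in the exception applies to a given run.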


