Posted to commits@cassandra.apache.org by "Rocco Varela (JIRA)" <ji...@apache.org> on 2014/09/03 23:13:52 UTC
[jira] [Commented] (CASSANDRA-7444) Performance drops when creating
large amount of tables
[ https://issues.apache.org/jira/browse/CASSANDRA-7444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120465#comment-14120465 ]
Rocco Varela commented on CASSANDRA-7444:
-----------------------------------------
How much of a speed improvement should one expect with this new patch? I've created several thousand tables in batches of different sizes, interleaved with wait periods for schema agreement, and I'm still seeing creation times on the order of hours. Is this to be expected?
> Performance drops when creating large amount of tables
> -------------------------------------------------------
>
> Key: CASSANDRA-7444
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7444
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Environment: [cqlsh 3.1.8 | Cassandra 1.2.15.1 | CQL spec 3.0.0 | Thrift protocol 19.36.2][cqlsh 4.1.1 | Cassandra 2.0.7.31 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
> Reporter: Jose Martinez Poblete
> Assignee: Aleksey Yeschenko
> Priority: Minor
> Labels: cassandra
> Fix For: 2.1.1
>
> Attachments: 7444-2.0.txt, 7444.txt
>
>
> We are creating 4000 tables from a script, using cqlsh to create the tables. As the tables are being created, the time taken grows exponentially and creation becomes very slow.
> We read a file, get the keyspace name, append a random number, and then create a keyspace with the new name (for example Airplane_12345678, Airplane_123575849, ...). The statements are then fed into cqlsh via a script.
> Similarly, each table is created via the script: use Airplane_12345678; create table1 ... table25; then use Airplane_123575849; create table1 ... table25.
> It is all done in singleton fashion, one statement after the other in a loop.
> We tested using the following bash script:
> {noformat}
> #!/bin/bash
> SEED=0
> ITERATIONS=20
> while [ ${SEED} -lt ${ITERATIONS} ]; do
>     COUNT=0
>     KEYSPACE=t10789_${SEED}
>     echo "CREATE KEYSPACE ${KEYSPACE} WITH replication = { 'class': 'NetworkTopologyStrategy', 'Cassandra': '1' };" > ${KEYSPACE}.ddl
>     echo "USE ${KEYSPACE};" >> ${KEYSPACE}.ddl
>     while [ ${COUNT} -lt 25 ]; do
>         echo "CREATE TABLE user_colors${COUNT} (user_id int PRIMARY KEY, colors list<ascii> );" >> ${KEYSPACE}.ddl
>         ((COUNT++))
>     done
>     ((SEED++))
>     time cat ${KEYSPACE}.ddl | cqlsh
>     if [ "$?" -gt 0 ]; then
>         echo "[ERROR] Failure at ${KEYSPACE}"
>         exit 1
>     else
>         echo "[OK] Created ${KEYSPACE}"
>     fi
>     echo "==============================="
>     sleep 3
> done
> #EOF
> {noformat}
> The timings we got on an otherwise idle system were inconsistent:
> {noformat}
> real 0m42.649s
> user 0m0.332s
> sys 0m0.092s
> [OK] Created t10789_0
> ===============================
> real 1m22.211s
> user 0m0.332s
> sys 0m0.096s
> [OK] Created t10789_1
> ===============================
> real 2m45.907s
> user 0m0.304s
> sys 0m0.124s
> [OK] Created t10789_2
> ===============================
> real 3m24.098s
> user 0m0.340s
> sys 0m0.108s
> [OK] Created t10789_3
> ===============================
> real 2m38.930s
> user 0m0.324s
> sys 0m0.116s
> [OK] Created t10789_4
> ===============================
> real 3m4.186s
> user 0m0.336s
> sys 0m0.104s
> [OK] Created t10789_5
> ===============================
> real 2m55.391s
> user 0m0.344s
> sys 0m0.092s
> [OK] Created t10789_6
> ===============================
> real 2m14.290s
> user 0m0.328s
> sys 0m0.108s
> [OK] Created t10789_7
> ===============================
> real 2m44.880s
> user 0m0.344s
> sys 0m0.092s
> [OK] Created t10789_8
> ===============================
> real 1m52.785s
> user 0m0.336s
> sys 0m0.128s
> [OK] Created t10789_9
> ===============================
> real 1m18.404s
> user 0m0.344s
> sys 0m0.108s
> [OK] Created t10789_10
> ===============================
> real 2m20.681s
> user 0m0.348s
> sys 0m0.104s
> [OK] Created t10789_11
> ===============================
> real 1m11.860s
> user 0m0.332s
> sys 0m0.096s
> [OK] Created t10789_12
> ===============================
> real 1m37.887s
> user 0m0.324s
> sys 0m0.100s
> [OK] Created t10789_13
> ===============================
> real 1m31.616s
> user 0m0.316s
> sys 0m0.132s
> [OK] Created t10789_14
> ===============================
> real 1m12.103s
> user 0m0.360s
> sys 0m0.088s
> [OK] Created t10789_15
> ===============================
> real 0m36.378s
> user 0m0.340s
> sys 0m0.092s
> [OK] Created t10789_16
> ===============================
> real 0m40.883s
> user 0m0.352s
> sys 0m0.096s
> [OK] Created t10789_17
> ===============================
> real 0m40.661s
> user 0m0.332s
> sys 0m0.096s
> [OK] Created t10789_18
> ===============================
> real 0m44.943s
> user 0m0.324s
> sys 0m0.104s
> [OK] Created t10789_19
> ===============================
> {noformat}
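To compare runs like the one above, it may help to reduce the "real" lines to a few summary numbers. The following is only a rough sketch and not part of the original report: the function name summarize_real is hypothetical, and it assumes the timings have the XmY.YYYs form that bash's time keyword prints (as in the listing above). It also accepts lines pasted from this email with the leading "> " quote marker.

```shell
#!/bin/bash
# summarize_real (hypothetical helper): read test-script output on stdin and
# print the count, minimum, maximum, and mean of the "real" timings in seconds.
summarize_real() {
    awk '
    { sub(/^> ?/, "") }              # drop an email quote marker, if present
    $1 == "real" {
        split($2, parts, "m")        # "1m22.211s" -> parts[1]="1", parts[2]="22.211s"
        sub(/s$/, "", parts[2])      # strip the trailing "s"
        secs = parts[1] * 60 + parts[2]
        if (n == 0 || secs < min) min = secs
        if (n == 0 || secs > max) max = secs
        sum += secs
        n++
    }
    END {
        if (n > 0)
            printf "n=%d min=%.3f max=%.3f mean=%.3f\n", n, min, max, sum / n
    }'
}
```

Assuming the test script were saved as create_tables.sh (a placeholder name), the output could be piped through the helper with `./create_tables.sh 2>&1 | summarize_real`; the `2>&1` is needed because bash writes the time report to stderr.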
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)