Posted to commits@cassandra.apache.org by "Rocco Varela (JIRA)" <ji...@apache.org> on 2014/09/03 23:13:52 UTC

[jira] [Commented] (CASSANDRA-7444) Performance drops when creating large amount of tables

    [ https://issues.apache.org/jira/browse/CASSANDRA-7444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120465#comment-14120465 ] 

Rocco Varela commented on CASSANDRA-7444:
-----------------------------------------

How much of a speed improvement should one expect with this new patch? I've created several thousand tables in batches of different sizes, interleaved with wait periods for schema agreement, and I'm still seeing creation times on the order of hours. Is this to be expected?
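
For reference, the batched approach described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the script that was actually run: the helper names (make_batch_ddl, count_schema_versions, wait_for_schema_agreement), the keyspace name, and the batch size are hypothetical, and the agreement check assumes that `nodetool describecluster` prints one "<uuid>: [hosts]" line per distinct schema version.

```shell
#!/bin/bash

# Hypothetical helper: print DDL for SIZE tables, numbered from START,
# in keyspace KS. The table definition is borrowed from the issue's script.
make_batch_ddl() {
   local ks=$1 start=$2 size=$3 i
   echo "USE ${ks};"
   for ((i = start; i < start + size; i++)); do
      echo "CREATE TABLE user_colors${i} (user_id int PRIMARY KEY, colors list<ascii>);"
   done
}

# Hypothetical helper: count schema-version lines in
# `nodetool describecluster` output read from stdin.
# Assumption: each distinct version appears as "<uuid>: [host, ...]".
count_schema_versions() {
   grep -cE '^[[:space:]]*[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}: \['
}

# Poll until every reachable node reports the same schema version.
wait_for_schema_agreement() {
   until [ "$(nodetool describecluster | count_schema_versions)" -eq 1 ]; do
      sleep 1
   done
}

# Example driver (needs a running cluster, so it is left commented out):
#   for start in 0 100 200; do
#      make_batch_ddl my_keyspace "${start}" 100 | cqlsh
#      wait_for_schema_agreement
#   done
```

Wrapping each batch in `time` would show whether the per-batch cost or the wait for agreement dominates.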

> Performance drops when creating large amount of tables 
> -------------------------------------------------------
>
>                 Key: CASSANDRA-7444
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7444
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: [cqlsh 3.1.8 | Cassandra 1.2.15.1 | CQL spec 3.0.0 | Thrift protocol 19.36.2][cqlsh 4.1.1 | Cassandra 2.0.7.31 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
>            Reporter: Jose Martinez Poblete
>            Assignee: Aleksey Yeschenko
>            Priority: Minor
>              Labels: cassandra
>             Fix For: 2.1.1
>
>         Attachments: 7444-2.0.txt, 7444.txt
>
>
> We are creating 4000 tables from a script, using cqlsh to run the statements. As the tables are created, the time taken grows exponentially and creation becomes very slow.
> We read a file, get the keyspace name, append a random number, and create a keyspace with the new name (for example Airplane_12345678, Airplane_123575849...). The statements are then fed into cqlsh via a script.
> Each table is created the same way: use Airplane_12345678; create table1...table25, then use Airplane_123575849; create table1...table25.
> Everything is done serially, one statement after another in a loop.
> We tested using the following bash script:
> {noformat}
> #!/bin/bash
> SEED=0
> ITERATIONS=20
> while [ ${SEED} -lt ${ITERATIONS} ]; do
>    COUNT=0
>    KEYSPACE=t10789_${SEED}
>    echo "CREATE KEYSPACE ${KEYSPACE} WITH replication = { 'class': 'NetworkTopologyStrategy', 'Cassandra': '1' };"  > ${KEYSPACE}.ddl
>    echo "USE ${KEYSPACE};" >> ${KEYSPACE}.ddl
>    while [ ${COUNT} -lt 25 ]; do
>       echo "CREATE TABLE user_colors${COUNT} (user_id int PRIMARY KEY, colors list<ascii> );" >> ${KEYSPACE}.ddl
>       ((COUNT++))
>    done 
>    ((SEED++))
>    time cat ${KEYSPACE}.ddl | cqlsh
>    if [ "$?" -gt 0 ]; then
>       echo "[ERROR] Failure at ${KEYSPACE}"
>       exit 1
>    else
>       echo "[OK]    Created ${KEYSPACE}"
>    fi
>    echo "==============================="
>    sleep 3
> done
> #EOF
> {noformat}
> The timings we got on an otherwise idle system were inconsistent:
> {noformat}
> real    0m42.649s
> user    0m0.332s
> sys     0m0.092s
> [OK]    Created t10789_0
> ===============================
> real    1m22.211s
> user    0m0.332s
> sys     0m0.096s
> [OK]    Created t10789_1
> ===============================
> real    2m45.907s
> user    0m0.304s
> sys     0m0.124s
> [OK]    Created t10789_2
> ===============================
> real    3m24.098s
> user    0m0.340s
> sys     0m0.108s
> [OK]    Created t10789_3
> ===============================
> real    2m38.930s
> user    0m0.324s
> sys     0m0.116s
> [OK]    Created t10789_4
> ===============================
> real    3m4.186s
> user    0m0.336s
> sys     0m0.104s
> [OK]    Created t10789_5
> ===============================
> real    2m55.391s
> user    0m0.344s
> sys     0m0.092s
> [OK]    Created t10789_6
> ===============================
> real    2m14.290s
> user    0m0.328s
> sys     0m0.108s
> [OK]    Created t10789_7
> ===============================
> real    2m44.880s
> user    0m0.344s
> sys     0m0.092s
> [OK]    Created t10789_8
> ===============================
> real    1m52.785s
> user    0m0.336s
> sys     0m0.128s
> [OK]    Created t10789_9
> ===============================
> real    1m18.404s
> user    0m0.344s
> sys     0m0.108s
> [OK]    Created t10789_10
> ===============================
> real    2m20.681s
> user    0m0.348s
> sys     0m0.104s
> [OK]    Created t10789_11
> ===============================
> real    1m11.860s
> user    0m0.332s
> sys     0m0.096s
> [OK]    Created t10789_12
> ===============================
> real    1m37.887s
> user    0m0.324s
> sys     0m0.100s
> [OK]    Created t10789_13
> ===============================
> real    1m31.616s
> user    0m0.316s
> sys     0m0.132s
> [OK]    Created t10789_14
> ===============================
> real    1m12.103s
> user    0m0.360s
> sys     0m0.088s
> [OK]    Created t10789_15
> ===============================
> real    0m36.378s
> user    0m0.340s
> sys     0m0.092s
> [OK]    Created t10789_16
> ===============================
> real    0m40.883s
> user    0m0.352s
> sys     0m0.096s
> [OK]    Created t10789_17
> ===============================
> real    0m40.661s
> user    0m0.332s
> sys     0m0.096s
> [OK]    Created t10789_18
> ===============================
> real    0m44.943s
> user    0m0.324s
> sys     0m0.104s
> [OK]    Created t10789_19
> ===============================
> {noformat}
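
To compare the runs above numerically, the `real` values can be converted to seconds. A minimal sketch; the helper name to_seconds is hypothetical and it assumes the "<minutes>m<seconds>s" format printed by bash's `time`.

```shell
#!/bin/bash

# Hypothetical helper: convert a duration such as "2m45.907s" to seconds.
to_seconds() {
   LC_ALL=C awk -v t="$1" 'BEGIN {
      split(t, p, "m")        # p[1] = minutes, p[2] = seconds plus trailing "s"
      sub(/s$/, "", p[2])     # drop the trailing "s"
      printf "%.3f\n", p[1] * 60 + p[2]
   }'
}

to_seconds 0m42.649s   # first keyspace
to_seconds 3m24.098s   # slowest keyspace
to_seconds 0m44.943s   # last keyspace
```

In seconds, the first keyspace took 42.649, the slowest (t10789_3) took 204.098, and the last took 44.943.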


