Posted to commits@cassandra.apache.org by "Philip Thompson (JIRA)" <ji...@apache.org> on 2015/06/04 23:17:39 UTC
[jira] [Commented] (CASSANDRA-9552) COPY FROM times out after 110000 inserts
[ https://issues.apache.org/jira/browse/CASSANDRA-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573600#comment-14573600 ]
Philip Thompson commented on CASSANDRA-9552:
--------------------------------------------
What version of C* are you using?
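(A general workaround sketch, not taken from this thread: splitting the CSV into smaller files and issuing one COPY FROM per chunk means a single heartbeat failure aborts only one chunk rather than the whole 120,000-row import. The chunk size, file names, and the stand-in CSV below are illustrative assumptions; the cqlsh call is shown but commented out.)

```shell
# Workaround sketch (assumption, not from this issue): chunk the input CSV
# and load each piece with its own COPY FROM invocation.

# Stand-in for data120K.csv: 100 rows of "n,n,0" (replace with the real file).
seq 1 100 | awk '{print $1 "," $1 ",0"}' > demo.csv

# Split into 25-row pieces: demo_chunk_aa, demo_chunk_ab, ...
split -l 25 demo.csv demo_chunk_

# One COPY invocation per chunk; the real cqlsh call is commented out so
# this loop runs without a live cluster.
for f in demo_chunk_*; do
  echo "would run: cqlsh -e \"COPY test.test100 (pkey, ccol, ...) FROM '$f'\""
  # cqlsh -e "COPY test.test100 (pkey, ccol, col0, ...) FROM '$f'"
done
```

If one chunk fails, only that file needs to be re-imported, and the failing row range is narrowed to the chunk size.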
> COPY FROM times out after 110000 inserts
> ----------------------------------------
>
> Key: CASSANDRA-9552
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9552
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Brian Hess
> Labels: cqlsh
>
> I am trying to test out performance of COPY FROM on various schemas. I have a 100-BIGINT-column table defined as:
> {code}
> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
> CREATE TABLE test.test100 (
> pkey bigint, ccol bigint, col0 bigint, col1 bigint, col10 bigint,
> col11 bigint, col12 bigint, col13 bigint, col14 bigint, col15 bigint,
> col16 bigint, col17 bigint, col18 bigint, col19 bigint, col2 bigint,
> col20 bigint, col21 bigint, col22 bigint, col23 bigint, col24 bigint,
> col25 bigint, col26 bigint, col27 bigint, col28 bigint, col29 bigint,
> col3 bigint, col30 bigint, col31 bigint, col32 bigint, col33 bigint,
> col34 bigint, col35 bigint, col36 bigint, col37 bigint, col38 bigint,
> col39 bigint, col4 bigint, col40 bigint, col41 bigint, col42 bigint,
> col43 bigint, col44 bigint, col45 bigint, col46 bigint, col47 bigint,
> col48 bigint, col49 bigint, col5 bigint, col50 bigint, col51 bigint,
> col52 bigint, col53 bigint, col54 bigint, col55 bigint, col56 bigint,
> col57 bigint, col58 bigint, col59 bigint, col6 bigint, col60 bigint,
> col61 bigint, col62 bigint, col63 bigint, col64 bigint, col65 bigint,
> col66 bigint, col67 bigint, col68 bigint, col69 bigint, col7 bigint,
> col70 bigint, col71 bigint, col72 bigint, col73 bigint, col74 bigint,
> col75 bigint, col76 bigint, col77 bigint, col78 bigint, col79 bigint,
> col8 bigint, col80 bigint, col81 bigint, col82 bigint, col83 bigint,
> col84 bigint, col85 bigint, col86 bigint, col87 bigint, col88 bigint,
> col89 bigint, col9 bigint, col90 bigint, col91 bigint, col92 bigint,
> col93 bigint, col94 bigint, col95 bigint, col96 bigint, col97 bigint,
> PRIMARY KEY (pkey, ccol)
> ) WITH CLUSTERING ORDER BY (ccol ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
> AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> {code}
> I then try to load the linked file of 120,000 rows of 100 BIGINT columns via:
> {code}
> cqlsh -e "COPY test.test100(pkey,ccol,col0,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,col14,col15,col16,col17,col18,col19,col20,col21,col22,col23,col24,col25,col26,col27,col28,col29,col30,col31,col32,col33,col34,col35,col36,col37,col38,col39,col40,col41,col42,col43,col44,col45,col46,col47,col48,col49,col50,col51,col52,col53,col54,col55,col56,col57,col58,col59,col60,col61,col62,col63,col64,col65,col66,col67,col68,col69,col70,col71,col72,col73,col74,col75,col76,col77,col78,col79,col80,col81,col82,col83,col84,col85,col86,col87,col88,col89,col90,col91,col92,col93,col94,col95,col96,col97) FROM 'data120K.csv'"
> {code}
> Data file here: https://drive.google.com/file/d/0B87-Pevy14fuUVcxemFRcFFtRjQ/view?usp=sharing
> After 110000 rows, it errors and hangs:
> {code}
> <stdin>:1:110000 rows; Write: 19848.21 rows/s
> Connection heartbeat failure
> <stdin>:1:Aborting import at record #1196. Previously inserted records are still present, and some records after that may be present as well.
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)