Posted to commits@cassandra.apache.org by "Stefania (JIRA)" <ji...@apache.org> on 2016/05/02 10:09:13 UTC

[jira] [Updated] (CASSANDRA-11542) Create a benchmark to compare HDFS and Cassandra bulk read times

     [ https://issues.apache.org/jira/browse/CASSANDRA-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefania updated CASSANDRA-11542:
---------------------------------
    Attachment: jfr_recordings.zip

> Create a benchmark to compare HDFS and Cassandra bulk read times
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-11542
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11542
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Testing
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 3.x
>
>         Attachments: jfr_recordings.zip, spark-load-perf-results-001.zip, spark-load-perf-results-002.zip
>
>
> I propose creating a benchmark for comparing Cassandra and HDFS bulk reading performance. Simple Spark queries, such as the max or min of a column or a count(*), will be run on data stored in HDFS or Cassandra, and the total query duration will be measured (a sketch of such a timed query appears after the list below).
> This benchmark should allow determining the impact of:
> * partition size
> * number of clustering columns
> * number of value columns (cells)
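
The attached results and recordings come from the actual benchmark; the following is only a minimal, hypothetical sketch of the kind of timed comparison described above, assuming Spark 1.x with the DataStax Spark Cassandra Connector on the classpath. The contact point, HDFS path, keyspace and table names are invented for illustration and are not taken from the ticket.

    // Hypothetical sketch: time the same full-scan aggregate against HDFS and Cassandra.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext
    import com.datastax.spark.connector._  // adds sc.cassandraTable(...)

    object BulkReadBenchmarkSketch {

      // Run a block of work and print its wall-clock duration in seconds.
      def timed[T](label: String)(work: => T): T = {
        val start  = System.nanoTime()
        val result = work
        println(s"$label took ${(System.nanoTime() - start) / 1e9} s")
        result
      }

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("bulk-read-benchmark-sketch")
          .set("spark.cassandra.connection.host", "127.0.0.1") // assumed contact point
        val sc = new SparkContext(conf)
        val sqlContext = new SQLContext(sc)

        // HDFS side: full scan of a Parquet copy of the data set (path is hypothetical).
        timed("HDFS count(*)") {
          sqlContext.read.parquet("hdfs:///benchmark/data.parquet").count()
        }

        // Cassandra side: full scan of the equivalent table (keyspace/table are hypothetical).
        timed("Cassandra count(*)") {
          sc.cassandraTable("ks", "benchmark_table").count()
        }

        sc.stop()
      }
    }

Using count() on both sides forces a full scan through Spark, so the measured time reflects bulk read throughput rather than server-side aggregation work.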



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)