Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2013/10/04 17:11:42 UTC

[jira] [Commented] (CASSANDRA-6146) CQL-native stress

    [ https://issues.apache.org/jira/browse/CASSANDRA-6146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786212#comment-13786212 ] 

Jonathan Ellis commented on CASSANDRA-6146:
-------------------------------------------

What I'd like to see is a drastic reduction in the number of flags we support, in favor of allowing the user to pre-create a table for stress-ng (stress-cql?) to take its cues from.

So here's what our new Config might look like:

{code}
        availableOptions.addOption("h", "help", false, "Show this help message and exit");
        // NB only SELECT makes sense for compound PK unless we add some kind of scan-for-PK support
        availableOptions.addOption("cql", "cql", true, "CQL to execute for each operation. Use ? for partition key bind placeholder");
        availableOptions.addOption("d", "distribution", true, "Partition key distribution: uniform or gaussian.  Default: uniform");
        availableOptions.addOption("ks", "keyspace", true, "Keyspace. Default: stress");
        availableOptions.addOption("n", "nodes", true, "Nodes to connect to (comma-delimited list). Default: 127.0.0.1");
        availableOptions.addOption("p", "partitions", true, "Number of distinct partitions to use.  Default: 1,000,000");
        availableOptions.addOption("pop", "populate", false, "Populate mode. Enable to generate random inserts for the given table");
        availableOptions.addOption("r", "requests", true, "Number of requests to execute.  Default: 1,000,000");
        availableOptions.addOption("std", "stdev", true, "Standard deviation from mean, for gaussian distribution only. Default: 0.1");
        availableOptions.addOption("t", "table", true, "Table. Default: data");
{code}

So, you'd have command lines like this:

# {{stress -cql "SELECT * FROM data WHERE key = ?"}}
# {{stress -cql "SELECT username, password FROM users WHERE user_id = ?"}}
# {{stress -cql "SELECT collected_at, value FROM timeseries WHERE sensor_id = ? LIMIT 100"}}
# {{stress -cql "SELECT * FROM timeseries WHERE sensor_id = ? AND collected_at = ?"}}
# {{stress --populate}}
# {{stress --populate --table timeseries}}

There's some asymmetry between inserts and reads; I'm not sure it makes sense to customize INSERT all that much, and I want people to be able to get a quick smoke test up with a minimum of ceremony, i.e., creating a default {{data}} table for them rather than requiring explicit {{CREATE TABLE}} first.  But, if you want to create a custom table, we should be able to introspect it and populate it for you.
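
As a rough illustration of the introspection idea, once the column names of a table are known, a parameterized INSERT can be derived from them mechanically. The sketch below uses plain strings only; the class name {{InsertSketch}} and the method {{buildInsert}} are hypothetical and not part of any driver API, and the column list is assumed to have been read from the table metadata already.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of stress or the Java driver.
final class InsertSketch
{
    // Builds "INSERT INTO t (a, b) VALUES (?, ?)" from an ordered list of column names.
    static String buildInsert(String table, List<String> columns)
    {
        String names = String.join(", ", columns);
        // one bind marker per column, in the same order as the names
        String markers = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + names + ") VALUES (" + markers + ")";
    }

    public static void main(String[] args)
    {
        // prints: INSERT INTO data (key, i1, i2) VALUES (?, ?, ?)
        System.out.println(buildInsert("data", Arrays.asList("key", "i1", "i2")));
    }
}
```

Keeping the bind markers in the same order as the column list means values can later be bound by position.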

The populate code might look something like this:

{code}
    private static void populate(Config config, Session session)
    {
        KeyspaceMetadata ks = session.getCluster().getMetadata().getKeyspace(config.keyspace);
        TableMetadata table = ks.getTable(config.table);
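        if (table == null)
        {
            System.out.println("NOTICE: Creating table with 6 int columns.  Create manually if you prefer otherwise.");
            session.execute("CREATE TABLE " + config.table + " (key int PRIMARY KEY, i1 int, i2 int, i3 int, i4 int, i5 int)");
            // re-read the metadata so the code below does not dereference null;
            // this assumes the driver has refreshed its schema metadata by now
            table = session.getCluster().getMetadata().getKeyspace(config.keyspace).getTable(config.table);
        }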
        List<ColumnMetadata> pkColumns = table.getPrimaryKey();
        List<ColumnMetadata> columns = table.getColumns();

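        // build "INSERT INTO t (c1, c2, ...) VALUES (?, ?, ...)"
        StringBuilder names = new StringBuilder();
        StringBuilder markers = new StringBuilder();
        for (int i = 0; i < columns.size(); i++)
        {
            ColumnMetadata c = columns.get(i);
            if (i > 0)
            {
                names.append(", ");
                markers.append(", ");
            }
            names.append(c.getName());
            markers.append("?");
        }
        String cql = "INSERT INTO " + config.table + " (" + names + ") VALUES (" + markers + ")";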
        PreparedStatement statement = session.prepare(cql);

        for (int n = 0; n < config.requests; n++)
        {
            BoundStatement bs = new BoundStatement(statement);

            // partition key gets treated by distribution
            if (config.distribution == Config.Distribution.UNIFORM)
            {
                if (config.partitions == config.requests)
                    bs.setInt(0, n);
                else
                    bs.setInt(0, random.nextInt(config.partitions));
            }
            else
            {
                int k;
                while (true)
                {
                    // loop until we get a result within the necessary bounds
                    k = (int) (config.mean + (random.nextGaussian() * config.sigma));
                    if (k >= 0 && k < config.partitions)
                        break;
                }
                bs.setInt(0, k);
            }

            // non-partition key columns get random data
            for (int i = 1; i < columns.size(); i++)
            {
                ColumnMetadata c = columns.get(i);
                if (DataType.cint().equals(c.getType()))
                    bs.setInt(i, random.nextInt());
                else
                    throw new UnsupportedOperationException("Flesh this out with support for more types");
            }

            executeLimitedAsync(session, bs);
        }
    }

    private static void executeLimitedAsync(Session session, BoundStatement statement)
    {
        while (executing.size() == MAX_EXECUTING)
        {
            for (Iterator<ResultSetFuture> iter = executing.iterator(); iter.hasNext(); )
            {
                ResultSetFuture future = iter.next();
                if (future.isDone())
                    iter.remove();
            }
            Uninterruptibles.sleepUninterruptibly(1, TimeUnit.MILLISECONDS);
        }

        executing.add(session.executeAsync(statement));
    }
{code} 
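
To make the gaussian branch above concrete, here is a self-contained sketch of the bounded sampling step, using only {{java.util.Random}}. The class and method names are hypothetical. It also assumes the mean sits at the middle of the partition range and that the {{stdev}} option is a fraction of the partition count; the comment does not spell out either, so treat both as guesses.

```java
import java.util.Random;

// Hypothetical helper for illustration; names and parameter choices are assumptions.
final class GaussianKeySketch
{
    // Draws a key in [0, partitions) from a normal distribution,
    // redrawing whenever the sample falls outside the valid range.
    static int nextKey(Random random, int partitions, double mean, double sigma)
    {
        while (true)
        {
            // floor rather than a plain cast, so samples in (-1, 0) are rejected instead of mapped to 0
            int k = (int) Math.floor(mean + random.nextGaussian() * sigma);
            if (k >= 0 && k < partitions)
                return k;
        }
    }

    public static void main(String[] args)
    {
        int partitions = 1000000;
        double mean = partitions / 2.0;   // assumption: center of the key range
        double sigma = 0.1 * partitions;  // assumption: stdev given as a fraction of the range
        Random random = new Random(42);
        for (int i = 0; i < 5; i++)
            System.out.println(nextKey(random, partitions, mean, sigma));
    }
}
```

Redrawing out-of-range samples keeps every key valid, but it truncates the tails of the distribution and can loop for a long time if the mean lies far outside the partition range.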

> CQL-native stress
> -----------------
>
>                 Key: CASSANDRA-6146
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6146
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Tools
>            Reporter: Jonathan Ellis
>
> The existing CQL "support" in stress is not worth discussing.  We need to start over, and we might as well kill two birds with one stone and move to the native protocol while we're at it.


