Posted to commits@cassandra.apache.org by "Eric Fenderbosch (JIRA)" <ji...@apache.org> on 2015/11/03 15:12:27 UTC

[jira] [Created] (CASSANDRA-10637) Extract LoaderOptions and refactor BulkLoader to be able to be used from within existing Java code instead of just through main()

Eric Fenderbosch created CASSANDRA-10637:
--------------------------------------------

             Summary: Extract LoaderOptions and refactor BulkLoader to be able to be used from within existing Java code instead of just through main()
                 Key: CASSANDRA-10637
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10637
             Project: Cassandra
          Issue Type: Improvement
          Components: Tools
            Reporter: Eric Fenderbosch
            Priority: Minor
             Fix For: 3.x


We are writing a service to migrate data from various RDBMS tables into Cassandra. We write out a CSV from the source system, use CQLSSTableWriter to write sstables to disk, then call sstableloader to stream them to the Cassandra cluster.

Right now, we either have to:

* return a CSV location from one Java process to a wrapper script which then kicks off sstableloader
* or call sstableloader via Runtime.getRuntime().exec
* or call BulkLoader.main from within our Java code, using a custom SecurityManager to trap the System.exit calls
* or subclass BulkLoader, putting the subclass in the org.apache.cassandra.tools package in order to access the package-scoped inner classes

None of these solutions is ideal. We should be able to use the functionality of BulkLoader.main directly. I've extracted LoaderOptions to a top-level class that uses the builder pattern so that it can be used directly as part of a Java migration service.

The options can now be created with a fluent builder interface:

```java
LoaderOptions options = LoaderOptions.builder(). //
                connectionsPerHost(2). //
                directory(directory). //
                hosts(hosts). //
                build();
```

The builder can also be used to parse command-line arguments:

```java
LoaderOptions options = LoaderOptions.builder().parseArgs(args).build();
```

A new load method takes a `LoaderOptions` parameter and throws `BulkLoadException` instead of calling `System.exit(1)`.
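To illustrate why throwing instead of exiting matters for embedding, here is a minimal, self-contained sketch of the pattern. It does not use the actual classes from the patch: the `LoaderOptions`, `Builder`, `BulkLoadException`, and `load` definitions below are simplified stand-ins written for this example, and the `LoaderSketch` and `tryLoad` names, the validation rules, and the default values are assumptions. Only the builder method names (`connectionsPerHost`, `directory`, `hosts`, `build`) are taken from the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LoaderSketch {

    /** Stand-in for the proposed BulkLoadException (hypothetical shape). */
    static class BulkLoadException extends Exception {
        BulkLoadException(String message) {
            super(message);
        }
    }

    /** Stand-in for the extracted LoaderOptions; only three options are modeled. */
    static final class LoaderOptions {
        final String directory;
        final List<String> hosts;
        final int connectionsPerHost;

        private LoaderOptions(Builder b) {
            this.directory = b.directory;
            this.hosts = new ArrayList<>(b.hosts);
            this.connectionsPerHost = b.connectionsPerHost;
        }

        static Builder builder() {
            return new Builder();
        }

        /** Fluent builder: every setter returns the builder so calls can be chained. */
        static final class Builder {
            private String directory;
            private List<String> hosts = new ArrayList<>();
            private int connectionsPerHost = 1; // assumed default for this sketch

            Builder directory(String directory) {
                this.directory = directory;
                return this;
            }

            Builder hosts(List<String> hosts) {
                this.hosts = new ArrayList<>(hosts);
                return this;
            }

            Builder connectionsPerHost(int connectionsPerHost) {
                this.connectionsPerHost = connectionsPerHost;
                return this;
            }

            LoaderOptions build() {
                return new LoaderOptions(this);
            }
        }
    }

    /**
     * Stand-in for the new load method. Problems are reported by throwing,
     * so the calling JVM keeps running instead of being terminated.
     */
    static void load(LoaderOptions options) throws BulkLoadException {
        if (options.directory == null || options.directory.isEmpty()) {
            throw new BulkLoadException("no sstable directory given");
        }
        if (options.hosts.isEmpty()) {
            throw new BulkLoadException("no initial hosts given");
        }
        // A real implementation would stream the sstables to the cluster here.
    }

    /** Caller-side handling: the failure becomes a value the service can act on. */
    static String tryLoad(LoaderOptions options) {
        try {
            load(options);
            return "ok";
        } catch (BulkLoadException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        LoaderOptions good = LoaderOptions.builder()
                .connectionsPerHost(2)
                .directory("/tmp/sstables")
                .hosts(Arrays.asList("127.0.0.1"))
                .build();
        System.out.println(tryLoad(good));

        // Missing hosts: the caller sees an exception rather than a JVM exit.
        LoaderOptions bad = LoaderOptions.builder().directory("/tmp/sstables").build();
        System.out.println(tryLoad(bad));
    }
}
```

With this shape, a migration service can catch the exception, log it, and retry or move on to the next table, with no SecurityManager needed to intercept `System.exit`.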

The fork on GitHub can be found here:

https://github.com/efenderbosch/cassandra


