Posted to commits@cassandra.apache.org by jb...@apache.org on 2013/06/20 20:56:08 UTC

[1/5] git commit: Never allow partition range queries in CQL3 without token()

Updated Branches:
  refs/heads/cassandra-1.2 3814af808 -> 8d17ccb7b
  refs/heads/trunk 515116972 -> b9de5de23


Never allow partition range queries in CQL3 without token()

patch by slebresne; reviewed by jbellis for CASSANDRA-5666


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/41f418a0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/41f418a0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/41f418a0

Branch: refs/heads/trunk
Commit: 41f418a09e03e36911b404ff01c96adefc75b988
Parents: 9ba0ff0
Author: Sylvain Lebresne <sy...@datastax.com>
Authored: Thu Jun 20 19:10:24 2013 +0200
Committer: Sylvain Lebresne <sy...@datastax.com>
Committed: Thu Jun 20 19:10:24 2013 +0200

----------------------------------------------------------------------
 CHANGES.txt                                                    | 1 +
 doc/cql3/CQL.textile                                           | 5 +++--
 .../org/apache/cassandra/cql3/statements/SelectStatement.java  | 6 +-----
 3 files changed, 5 insertions(+), 7 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cassandra/blob/41f418a0/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index e1282aa..bd52eab 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -29,6 +29,7 @@
  * Suppress custom exceptions thru jmx (CASSANDRA-5652)
  * Update CREATE CUSTOM INDEX syntax (CASSANDRA-5639)
  * Fix PermissionDetails.equals() method (CASSANDRA-5655)
+ * Never allow partition key ranges in CQL3 without token() (CASSANDRA-5666)
 Merged from 1.1:
  * Remove buggy thrift max message length option (CASSANDRA-5529)
  * Fix NPE in Pig's widerow mode (CASSANDRA-5488)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/41f418a0/doc/cql3/CQL.textile
----------------------------------------------------------------------
diff --git a/doc/cql3/CQL.textile b/doc/cql3/CQL.textile
index 5fa36ab..f7d2dda 100644
--- a/doc/cql3/CQL.textile
+++ b/doc/cql3/CQL.textile
@@ -626,7 +626,7 @@ h4(#selectWhere). @<where-clause>@
 
 The @<where-clause>@ specifies which rows must be queried. It is composed of relations on the columns that are part of the @PRIMARY KEY@ and/or have a "secondary index":#createIndexStmt defined on them.
 
-Not all relations are allowed in a query. For instance, non-equal relations (where @IN@ is considered as an equal relation) on a partition key is only supported if the partitioner for the keyspace is an ordered one. Moreover, for a given partition key, the clustering keys induce an ordering of rows and relations on them is restricted to the relations that allow to select a *contiguous* (for the ordering) set of rows. For instance, given
+Not all relations are allowed in a query. For instance, non-equal relations (where @IN@ is considered as an equal relation) on a partition key are not supported (but see the use of the @TOKEN@ method below to do non-equal queries on the partition key). Moreover, for a given partition key, the clustering keys induce an ordering of rows and relations on them is restricted to the relations that allow to select a *contiguous* (for the ordering) set of rows. For instance, given
 
 bc(sample). 
 CREATE TABLE posts (
@@ -650,7 +650,7 @@ bc(sample).
 // Needs a blog_title to be set to select ranges of posted_at
 SELECT entry_title, content FROM posts WHERE userid='john doe' AND posted_at >= 2012-01-01 AND posted_at < 2012-01-31
 
-When specifying relations, the @TOKEN@ function can be used on the @PARTITION KEY@ column to query. In that case, rows will be selected based on the token of their @PARTITION_KEY@ rather than on the value (note that the token of a key depends on the partitioner in use, and that in particular the RandomPartitioner won't yeld a meaningful order). Example:
+When specifying relations, the @TOKEN@ function can be used on the @PARTITION KEY@ column to query. In that case, rows will be selected based on the token of their @PARTITION_KEY@ rather than on the value. Note that the token of a key depends on the partitioner in use, and that in particular the RandomPartitioner won't yeld a meaningful order. Also note that ordering partitioners always order token values by bytes (so even if the partition key is of type int, @token(-1) > token(0)@ in particular). Example:
 
 bc(sample). 
 SELECT * FROM posts WHERE token(userid) > token('tom') AND token(userid) < token('bob')
@@ -1051,6 +1051,7 @@ The following describes the addition/changes brought for each version of CQL.
 h3. 3.0.4
 
 * Updated the syntax for custom "secondary indexes":#createIndexStmt.
+* Non-equal condition on the partition key are now never supported, even for ordering partitioner as this was not correct (the order was *not* the one of the type of the partition key). Instead, the @token@ method should always be used for range queries on the partition key (see "WHERE clauses":#selectWhere).
 
 h3. 3.0.3
 

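The TOKEN paragraph in the hunk above says that ordering partitioners compare keys by bytes, so token(-1) > token(0) even for an int partition key. A minimal sketch of why, assuming the int key is serialized as 4 big-endian two's-complement bytes and compared byte by byte as unsigned values. The class and method names are hypothetical and are not Cassandra code:

```java
import java.nio.ByteBuffer;

final class ByteOrderDemo {
    /** Serializes an int as 4 big-endian bytes (assumed stand-in for the key's serialized form). */
    static byte[] toBytes(int value) {
        // ByteBuffer uses big-endian byte order by default.
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    /** Compares two byte arrays lexicographically, treating each byte as unsigned. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0)
                return cmp;
        }
        // Equal on the common prefix: the shorter array sorts first.
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        // Signed comparison: -1 sorts before 0.
        System.out.println("as ints:  " + Integer.compare(-1, 0));
        // Byte comparison: 0xFFFFFFFF sorts after 0x00000000.
        System.out.println("as bytes: " + compareUnsigned(toBytes(-1), toBytes(0)));
    }
}
```

Under these assumptions the program prints -1 for the int comparison and 1 for the byte comparison, which is the mismatch that makes a plain range on the partition key misleading.
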
http://git-wip-us.apache.org/repos/asf/cassandra/blob/41f418a0/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
index 57913fe..03f222b 100644
--- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
@@ -1046,11 +1046,7 @@ public class SelectStatement implements CQLStatement
                 }
                 else
                 {
-                    if (!partitioner.preservesOrder())
-                        throw new InvalidRequestException("Only EQ and IN relation are supported on the partition key for random partitioners (unless you use the token() function)");
-
-                    stmt.isKeyRange = true;
-                    shouldBeDone = true;
+                    throw new InvalidRequestException("Only EQ and IN relation are supported on the partition key (you will need to use the token() function for non equality based relation)");
                 }
                 previous = cname;
             }
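
The rule this hunk enforces can be sketched in isolation. This is a simplified reading of the change, not the actual SelectStatement logic: the Op enum, the onToken flag and the exception type are hypothetical stand-ins, and the real code throws Cassandra's InvalidRequestException. Only the error message is taken from the patch:

```java
final class PartitionKeyRelationCheck {
    /** Hypothetical relation kinds; not Cassandra's own relation type. */
    enum Op { EQ, IN, LT, LTE, GT, GTE }

    /** Hypothetical unchecked stand-in for Cassandra's InvalidRequestException. */
    static final class InvalidRequest extends RuntimeException {
        InvalidRequest(String message) { super(message); }
    }

    /** True if the relation is acceptable on a partition key column. */
    static boolean isAllowed(Op op, boolean onToken) {
        // EQ and IN are always fine; anything else must go through token().
        return op == Op.EQ || op == Op.IN || onToken;
    }

    /** Rejects non-equality relations regardless of the partitioner in use. */
    static void validate(Op op, boolean onToken) {
        if (!isAllowed(op, onToken))
            throw new InvalidRequest("Only EQ and IN relation are supported on the partition key "
                                     + "(you will need to use the token() function for non equality based relation)");
    }

    public static void main(String[] args) {
        validate(Op.EQ, false);  // accepted
        validate(Op.GT, true);   // accepted: range expressed through token()
        try {
            validate(Op.GT, false);
        } catch (InvalidRequest e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that the partitioner no longer appears in the decision: before the patch, an order-preserving partitioner let the range through.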


[5/5] git commit: merge from 1.2

Posted by jb...@apache.org.
merge from 1.2


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b9de5de2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b9de5de2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b9de5de2

Branch: refs/heads/trunk
Commit: b9de5de235267154f6a6fea5f2ca6710c5efefc5
Parents: 5151169 8d17ccb
Author: Jonathan Ellis <jb...@apache.org>
Authored: Thu Jun 20 13:56:04 2013 -0500
Committer: Jonathan Ellis <jb...@apache.org>
Committed: Thu Jun 20 13:56:04 2013 -0500

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 NEWS.txt                                        |  71 +++---
 bin/sstableupgrade                              |  55 +++++
 debian/cassandra.install                        |   1 +
 doc/cql3/CQL.textile                            |   5 +-
 .../cql3/statements/SelectStatement.java        |  11 +-
 .../cassandra/db/compaction/Upgrader.java       | 167 ++++++++++++++
 .../cassandra/tools/StandaloneUpgrader.java     | 223 +++++++++++++++++++
 8 files changed, 494 insertions(+), 40 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b9de5de2/CHANGES.txt
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b9de5de2/NEWS.txt
----------------------------------------------------------------------
diff --cc NEWS.txt
index 7e06aa7,dbc9aab..c838d48
--- a/NEWS.txt
+++ b/NEWS.txt
@@@ -8,64 -8,11 +8,73 @@@ upgrade, just in case you need to roll 
  (Cassandra version X + 1 will always be able to read data files created
  by version X, but the inverse is not necessarily the case.)
  
++<<<<<<< HEAD
 +2.0.0
 +=====
 +
 +Upgrading
 +---------
 +    - CAS and new features in CQL such as DROP COLUMN assume that cell
 +      timestamps are microseconds-since-epoch.  Do not use these
 +      features if you are using client-specified timestamps with some
 +      other source.
 +    - Upgrading is ONLY supported from Cassandra 1.2.5 or later.  This
 +      goes for sstable compatibility as well as network.  When
 +      upgrading from an earlier release, upgrade to 1.2.5 first and
 +      run upgradesstables before proceeding to 2.0.
 +    - Replication and strategy options do not accept unknown options anymore.
 +      This was already the case for CQL3 in 1.2 but this is now the case for
 +      thrift too.
 +    - auto_bootstrap of a single-token node with no initial_token will
 +      now pick a random token instead of bisecting an existing token
 +      range.  We recommend upgrading to vnodes; failing that, we
 +      recommend specifying initial_token.
 +    - reduce_cache_sizes_at, reduce_cache_capacity_to, and
 +      flush_largest_memtables_at options have been removed from cassandra.yaml.
 +    - CacheServiceMBean.reduceCacheSizes() has been removed.
 +      Use CacheServiceMBean.set{Key,Row}CacheCapacityInMB() instead.
 +    - authority option in cassandra.yaml has been deprecated since 1.2.0,
 +      but it has been completely removed in 2.0. Please use 'authorizer' option.
 +    - ASSUME command has been removed from cqlsh. Use CQL3 blobAsType() and
 +      typeAsBlob() conversion functions instead.
 +      See https://cassandra.apache.org/doc/cql3/CQL.html#blobFun for details.
 +    - Inputing blobs as string constants is now fully deprecated in
 +      favor of blob constants. Make sure to update your applications to use
 +      the new syntax while you are still on 1.2 (which supports both string
 +      and blob constants for blob input) before upgrading to 2.0.
 +
 +Operations
 +----------
 +    - Major compactions, cleanup, scrub, and upgradesstables will interrupt 
 +      any in-progress compactions (but not repair validations) when invoked.
 +    - Disabling autocompactions by setting min/max compaction threshold to 0
 +      has been deprecated, instead, use the nodetool commands 'disableautocompaction'
 +      and 'enableautocompaction' or set the compaction strategy option enabled = false
 +    - ALTER TABLE DROP has been reenabled for CQL3 tables and has new semantics now.
 +      See https://cassandra.apache.org/doc/cql3/CQL.html#alterTableStmt and
 +      https://issues.apache.org/jira/browse/CASSANDRA-3919 for details.
 +    - CAS uses gc_grace_seconds to determine how long to keep unused paxos
 +      state around for, or a minimum of three hours.
 +
 +Features
 +--------
 +    - Alias support has been added to CQL3 SELECT statement. Refer to
 +      CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html) for details.
 +    - JEMalloc support (see memory_allocator in cassandra.yaml)
 +    - Experimental triggers support.  See examples/ for how to use.  "Experimental"
 +      means "tied closely to internal data structures; we plan to decouple this in
 +      the future, which will probably break triggers written against this initial
 +      API."
 +
 +
++||||||| merged common ancestors
++=======
+ When upgrading major versions of Cassandra, you will be unable to
+ restore snapshots created with the previous major version using the
+ 'sstableloader' tool. You can upgrade the file format of your snapshots
+ using the provided 'sstableupgrade' tool.
+ 
++>>>>>>> cassandra-1.2
  1.2.6
  =====
  

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b9de5de2/debian/cassandra.install
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b9de5de2/doc/cql3/CQL.textile
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b9de5de2/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
----------------------------------------------------------------------
diff --cc src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
index fd47ba8,03f222b..3815a9d
--- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
@@@ -1019,19 -1046,7 +1019,16 @@@ public class SelectStatement implement
                  }
                  else
                  {
-                     if (!partitioner.preservesOrder())
 -                    throw new InvalidRequestException("Only EQ and IN relation are supported on the partition key (you will need to use the token() function for non equality based relation)");
++                    if (hasQueriableIndex)
 +                    {
-                         if (hasQueriableIndex)
-                         {
-                             stmt.usesSecondaryIndexing = true;
-                             break;
-                         }
-                         throw new InvalidRequestException("Only EQ and IN relation are supported on the partition key for random partitioners (unless you use the token() function)");
++                        stmt.usesSecondaryIndexing = true;
++                        break;
 +                    }
++                    throw new InvalidRequestException("Only EQ and IN relation are supported on the partition key for random partitioners (unless you use the token() function)");
 +
 +                    stmt.isKeyRange = true;
 +                    lastRestrictedPartitionKey = i;
 +                    shouldBeDone = true;
                  }
                  previous = cname;
              }


[4/5] git commit: add license

Posted by jb...@apache.org.
add license


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8d17ccb7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8d17ccb7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8d17ccb7

Branch: refs/heads/cassandra-1.2
Commit: 8d17ccb7b26d705e815863554d7e15f4eca46c89
Parents: 3814af8
Author: Jonathan Ellis <jb...@apache.org>
Authored: Thu Jun 20 10:21:23 2013 -0500
Committer: Jonathan Ellis <jb...@apache.org>
Committed: Thu Jun 20 13:53:59 2013 -0500

----------------------------------------------------------------------
 .../cassandra/metrics/ReadRepairMetrics.java    | 21 ++++++++++++++++++++
 1 file changed, 21 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cassandra/blob/8d17ccb7/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java b/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java
index 3f48fee..5b61e42 100644
--- a/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java
+++ b/src/java/org/apache/cassandra/metrics/ReadRepairMetrics.java
@@ -1,4 +1,25 @@
 package org.apache.cassandra.metrics;
+/*
+ * 
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ * 
+ */
+
 
 import java.util.concurrent.TimeUnit;
 


[3/5] git commit: add license

Posted by jb...@apache.org.
add license


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8d17ccb7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8d17ccb7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8d17ccb7

Branch: refs/heads/trunk
Commit: 8d17ccb7b26d705e815863554d7e15f4eca46c89
Parents: 3814af8
Author: Jonathan Ellis <jb...@apache.org>
Authored: Thu Jun 20 10:21:23 2013 -0500
Committer: Jonathan Ellis <jb...@apache.org>
Committed: Thu Jun 20 13:53:59 2013 -0500

----------------------------------------------------------------------
 .../cassandra/metrics/ReadRepairMetrics.java    | 21 ++++++++++++++++++++
 1 file changed, 21 insertions(+)
----------------------------------------------------------------------




[2/5] git commit: Add standalone sstableupgrade utility. Patch by Nick Bailey, reviewed by brandonwilliams for CASSANDRA-5524

Posted by jb...@apache.org.
Add standalone sstableupgrade utility.
Patch by Nick Bailey, reviewed by brandonwilliams for CASSANDRA-5524


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3814af80
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3814af80
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3814af80

Branch: refs/heads/trunk
Commit: 3814af8087c8b5541bea563344afcc344f5efa2a
Parents: 41f418a
Author: Brandon Williams <br...@apache.org>
Authored: Thu Jun 20 13:15:54 2013 -0500
Committer: Brandon Williams <br...@apache.org>
Committed: Thu Jun 20 13:15:54 2013 -0500

----------------------------------------------------------------------
 NEWS.txt                                        |  67 +++---
 bin/sstableupgrade                              |  55 +++++
 debian/cassandra.install                        |   1 +
 .../cassandra/db/compaction/Upgrader.java       | 167 ++++++++++++++
 .../cassandra/tools/StandaloneUpgrader.java     | 223 +++++++++++++++++++
 5 files changed, 482 insertions(+), 31 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cassandra/blob/3814af80/NEWS.txt
----------------------------------------------------------------------
diff --git a/NEWS.txt b/NEWS.txt
index 5cb06da..dbc9aab 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -8,6 +8,11 @@ upgrade, just in case you need to roll back to the previous version.
 (Cassandra version X + 1 will always be able to read data files created
 by version X, but the inverse is not necessarily the case.)
 
+When upgrading major versions of Cassandra, you will be unable to
+restore snapshots created with the previous major version using the
+'sstableloader' tool. You can upgrade the file format of your snapshots
+using the provided 'sstableupgrade' tool.
+
 1.2.6
 =====
 
@@ -217,7 +222,7 @@ Features
     - num_tokens can now be specified in cassandra.yaml. This defines the
       number of tokens assigned to the host on the ring (default: 1).
       Also specifying initial_token will override any num_tokens setting.
-    - disk_failure_policy allows blacklisting failed disks in JBOD 
+    - disk_failure_policy allows blacklisting failed disks in JBOD
       configuration instead of erroring out indefinitely
     - event tracing can be configured per-connection ("trace_next_query")
       or globally/probabilistically ("nodetool settraceprobability")
@@ -314,7 +319,7 @@ Upgrading
       throw an InvalidRequestException when used for reads.  (Previous
       versions would silently perform a ONE read for range queries;
       single-row and multiget reads already rejected ANY.)
-    - The largest mutation batch accepted by the commitlog is now 128MB.  
+    - The largest mutation batch accepted by the commitlog is now 128MB.
       (In practice, batches larger than ~10MB always caused poor
       performance due to load volatility and GC promotion failures.)
       Larger batches will continue to be accepted but will not be
@@ -514,7 +519,7 @@ Upgrading
     - Upgrading from version 0.7.1+ or 0.8.2+ can be done with a rolling
       restart, one node at a time.  (0.8.0 or 0.8.1 are NOT network-compatible
       with 1.0: upgrade to the most recent 0.8 release first.)
-      You do not need to bring down the whole cluster at once. 
+      You do not need to bring down the whole cluster at once.
     - After upgrading, run nodetool scrub against each node before running
       repair, moving nodes, or adding new ones.
     - CQL inserts/updates now generate microsecond resolution timestamps
@@ -695,7 +700,7 @@ Upgrading
 ---------
     - Upgrading from version 0.7.1 or later can be done with a rolling
       restart, one node at a time.  You do not need to bring down the
-      whole cluster at once. 
+      whole cluster at once.
     - After upgrading, run nodetool scrub against each node before running
       repair, moving nodes, or adding new ones.
     - Running nodetool drain before shutting down the 0.7 node is
@@ -706,8 +711,8 @@ Upgrading
       to use your 0.7 clients.
     - Avro record classes used in map/reduce and Hadoop streaming code have
       been removed. Map/reduce can be switched to Thrift by changing
-      org.apache.cassandra.avro in import statements to 
-      org.apache.cassandra.thrift (no class names change). Streaming support 
+      org.apache.cassandra.avro in import statements to
+      org.apache.cassandra.thrift (no class names change). Streaming support
       has been removed for the time being.
     - The loadbalance command has been removed from nodetool.  For similar
       behavior, decommission then rebootstrap with empty initial_token.
@@ -721,15 +726,15 @@ Features
 --------
     - added CQL client API and JDBC/DBAPI2-compliant drivers for Java and
       Python, respectively (see: drivers/ subdirectory and doc/cql)
-    - added distributed Counters feature; 
+    - added distributed Counters feature;
       see http://wiki.apache.org/cassandra/Counters
     - optional intranode encryption; see comments around 'encryption_options'
       in cassandra.yaml
-    - compaction multithreading and rate-limiting; see 
+    - compaction multithreading and rate-limiting; see
       'concurrent_compactors' and 'compaction_throughput_mb_per_sec' in
       cassandra.yaml
     - cassandra will limit total memtable memory usage to 1/3 of the heap
-      by default.  This can be ajusted or disabled with the 
+      by default.  This can be ajusted or disabled with the
       memtable_total_space_in_mb option.  The old per-ColumnFamily
       throughput, operations, and age settings are still respected but
       will be removed in a future major release once we are satisfied that
@@ -738,7 +743,7 @@ Features
 Tools
 -----
     - stress and py_stress moved from contrib/ to tools/
-    - clustertool was removed (see 
+    - clustertool was removed (see
       https://issues.apache.org/jira/browse/CASSANDRA-2607 for examples
       of how to script nodetool across the cluster instead)
 
@@ -814,7 +819,7 @@ Upgrading
     - 0.7.1 and 0.7.2 shipped with a bug that caused incorrect row-level
       bloom filters to be generated when compacting sstables generated
       with earlier versions.  This would manifest in IOExceptions during
-      column name-based queries.  0.7.3 provides "nodetool scrub" to 
+      column name-based queries.  0.7.3 provides "nodetool scrub" to
       rebuild sstables with correct bloom filters, with no data lost.
       (If your cluster was never on 0.7.0 or earlier, you don't have to
       worry about this.)  Note that nodetool scrub will snapshot your
@@ -862,10 +867,10 @@ Features
     - Row size limit increased from 2GB to 2 billion columns.  rows
       are no longer read into memory during compaction.
     - Keyspace and ColumnFamily definitions may be added and modified live
-    - Streaming data for repair or node movement no longer requires 
+    - Streaming data for repair or node movement no longer requires
       anticompaction step first
-    - NetworkTopologyStrategy (formerly DatacenterShardStrategy) is ready for 
-      use, enabling ConsistencyLevel.DCQUORUM and DCQUORUMSYNC.  See comments 
+    - NetworkTopologyStrategy (formerly DatacenterShardStrategy) is ready for
+      use, enabling ConsistencyLevel.DCQUORUM and DCQUORUMSYNC.  See comments
       in `cassandra.yaml.`
     - Optional per-Column time-to-live field allows expiring data without
       have to issue explicit remove commands
@@ -879,9 +884,9 @@ Features
     - Optional round-robin scheduling between keyspaces for multitenant
       clusters
     - Dynamic endpoint snitch mitigates the impact of impaired nodes
-    - New `IntegerType`, faster than LongType and allows integers of 
+    - New `IntegerType`, faster than LongType and allows integers of
       both less and more bits than Long's 64
-    - A revamped authentication system that decouples authorization and 
+    - A revamped authentication system that decouples authorization and
       allows finer-grained control of resources.
 
 Upgrading
@@ -893,9 +898,9 @@ Upgrading
     The Cassandra inter-node protocol is incompatible with 0.6.x
     releases (and with 0.7 beta1), meaning you will have to bring your
     cluster down prior to upgrading: you cannot mix 0.6 and 0.7 nodes.
-    
+
     The hints schema was changed from 0.6 to 0.7. Cassandra automatically
-    snapshots and then truncates the hints column family as part of 
+    snapshots and then truncates the hints column family as part of
     starting up 0.7 for the first time.
 
     Keyspace and ColumnFamily definitions are stored in the system
@@ -904,13 +909,13 @@ Upgrading
     The process to upgrade is:
     1) run "nodetool drain" on _each_ 0.6 node.  When drain finishes (log
        message "Node is drained" appears), stop the process.
-    2) Convert your storage-conf.xml to the new cassandra.yaml using 
-       "bin/config-converter".  
+    2) Convert your storage-conf.xml to the new cassandra.yaml using
+       "bin/config-converter".
     3) Rename any of your keyspace or column family names that do not adhere
        to the '^\w+' regex convention.
     4) Start up your cluster with the 0.7 version.
-    5) Initialize your Keyspace and ColumnFamily definitions using 
-       "bin/schematool <host> <jmxport> import".  _You only need to do 
+    5) Initialize your Keyspace and ColumnFamily definitions using
+       "bin/schematool <host> <jmxport> import".  _You only need to do
        this to one node_.
 
 Thrift API
@@ -935,7 +940,7 @@ Configuraton
 ------------
     - Configuration file renamed to cassandra.yaml and log4j.properties to
       log4j-server.properties
-    - PropertyFileSnitch configuration file renamed to 
+    - PropertyFileSnitch configuration file renamed to
       cassandra-topology.properties
     - The ThriftAddress and ThriftPort directives have been renamed to
       RPCAddress and RPCPort respectively.
@@ -952,7 +957,7 @@ Configuraton
       one node_.
     - In addition to an authenticator, an authority must be configured as
       well. Users of SimpleAuthenticator should use SimpleAuthority for this
-      value (the default is AllowAllAuthority, which corresponds with 
+      value (the default is AllowAllAuthority, which corresponds with
       AllowAllAuthenticator).
     - The format of access.properties has changed, see the sample configuration
       conf/access.properties for documentation on the new format.
@@ -1011,7 +1016,7 @@ Features
 Configuraton
 ------------
     - MemtableSizeInMB has been replaced by MemtableThroughputInMB which
-      triggers a memtable flush when the specified amount of data has 
+      triggers a memtable flush when the specified amount of data has
       been written, including overwrites.
     - MemtableObjectCountInMillions has been replaced by the
       MemtableOperationsInMillions directive which causes a memtable flush
@@ -1047,7 +1052,7 @@ JMX metrics
       progress of the current compaction has been added.
     - commitlog JMX metrics are moved to org.apache.cassandra.db.Commitlog
     - progress of data streaming during bootstrap, loadbalance, or other
-      data migration, is available under 
+      data migration, is available under
       org.apache.cassandra.streaming.StreamingService.
       See http://wiki.apache.org/cassandra/Streaming for details.
 
@@ -1061,8 +1066,8 @@ Installation/Upgrade
 0.5.0
 =====
 
-0. The commitlog format has changed (but sstable format has not). 
-   When upgrading from 0.4, empty the commitlog either by running 
+0. The commitlog format has changed (but sstable format has not).
+   When upgrading from 0.4, empty the commitlog either by running
    bin/nodeprobe flush on each machine and waiting for the flush to finish,
    or simply remove the commitlog directory if you only have test data.
    (If more writes come in after the flush command, starting 0.5 will error
@@ -1083,7 +1088,7 @@ Installation/Upgrade
 
 3. Configuration:
      - Added "comment" field to ColumnFamily definition.
-     - Added MemtableFlushAfterMinutes, a global replacement for the 
+     - Added MemtableFlushAfterMinutes, a global replacement for the
        old per-CF FlushPeriodInMinutes setting
      - Key cache settings
 
@@ -1121,7 +1126,7 @@ Installation/Upgrade
    create and modify ColumnFamilies at will without worrying about
    collisions with others in the same cluster.
 
-3. Many Thrift API changes and documentation.  See 
+3. Many Thrift API changes and documentation.  See
    http://wiki.apache.org/cassandra/API
 
 4. Removed the web interface in favor of JMX and bin/nodeprobe, which
@@ -1166,4 +1171,4 @@ key in a given ColumnFamily) is limited by available memory, because
 compaction deserializes each row before merging.
 
 See https://issues.apache.org/jira/browse/CASSANDRA-16
-   
+

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3814af80/bin/sstableupgrade
----------------------------------------------------------------------
diff --git a/bin/sstableupgrade b/bin/sstableupgrade
new file mode 100755
index 0000000..b5ddd6a
--- /dev/null
+++ b/bin/sstableupgrade
@@ -0,0 +1,55 @@
+#!/bin/sh
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+if [ "x$CASSANDRA_INCLUDE" = "x" ]; then
+    for include in /usr/share/cassandra/cassandra.in.sh \
+                   /usr/local/share/cassandra/cassandra.in.sh \
+                   /opt/cassandra/cassandra.in.sh \
+                   ~/.cassandra.in.sh \
+                   `dirname $0`/cassandra.in.sh; do
+        if [ -r $include ]; then
+            . $include
+            break
+        fi
+    done
+elif [ -r $CASSANDRA_INCLUDE ]; then
+    . $CASSANDRA_INCLUDE
+fi
+
+# Use JAVA_HOME if set, otherwise look for java in PATH
+if [ -x $JAVA_HOME/bin/java ]; then
+    JAVA=$JAVA_HOME/bin/java
+else
+    JAVA=`which java`
+fi
+
+if [ -z "$CLASSPATH" ]; then
+    echo "You must set the CLASSPATH var" >&2
+    exit 1
+fi
+
+if [ "x$MAX_HEAP_SIZE" = "x" ]; then
+    MAX_HEAP_SIZE="256M"
+fi
+
+$JAVA -ea -cp $CLASSPATH -Xmx$MAX_HEAP_SIZE \
+        -Dlog4j.configuration=log4j-tools.properties \
+        org.apache.cassandra.tools.StandaloneUpgrader "$@"
+
+# vi:ai sw=4 ts=4 tw=0 et
+

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3814af80/debian/cassandra.install
----------------------------------------------------------------------
diff --git a/debian/cassandra.install b/debian/cassandra.install
index 6d7ba8f..a504b78 100644
--- a/debian/cassandra.install
+++ b/debian/cassandra.install
@@ -17,6 +17,7 @@ bin/sstablekeys usr/bin
 bin/sstableloader usr/bin
 bin/cqlsh usr/bin
 bin/sstablescrub usr/bin
+bin/sstableupgrade usr/bin
 bin/cassandra-shuffle usr/bin
 tools/bin/cassandra-stress usr/bin
 tools/bin/token-generator usr/bin

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3814af80/src/java/org/apache/cassandra/db/compaction/Upgrader.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/compaction/Upgrader.java b/src/java/org/apache/cassandra/db/compaction/Upgrader.java
new file mode 100644
index 0000000..e7211ba
--- /dev/null
+++ b/src/java/org/apache/cassandra/db/compaction/Upgrader.java
@@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.db.compaction;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.IOError;
+import java.util.*;
+
+import com.google.common.base.Throwables;
+
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.db.ColumnFamilyStore;
+import org.apache.cassandra.db.DecoratedKey;
+import org.apache.cassandra.db.RowIndexEntry;
+import org.apache.cassandra.db.compaction.AbstractCompactedRow;
+import org.apache.cassandra.db.compaction.AbstractCompactionStrategy;
+import org.apache.cassandra.db.compaction.AbstractCompactionIterable;
+import org.apache.cassandra.db.compaction.CompactionIterable;
+import org.apache.cassandra.db.compaction.CompactionController;
+import org.apache.cassandra.db.compaction.CompactionManager;
+import org.apache.cassandra.db.compaction.CompactionTask;
+import org.apache.cassandra.db.compaction.OperationType;
+import org.apache.cassandra.io.sstable.*;
+import org.apache.cassandra.io.util.RandomAccessReader;
+import org.apache.cassandra.utils.CloseableIterator;
+import org.apache.cassandra.utils.OutputHandler;
+
+public class Upgrader
+{
+    private final ColumnFamilyStore cfs;
+    private final SSTableReader sstable;
+    private final Collection<SSTableReader> toUpgrade;
+    private final File directory;
+
+    private final OperationType compactionType = OperationType.UPGRADE_SSTABLES;
+    private final CompactionController controller;
+    private final AbstractCompactionStrategy strategy;
+    private final long estimatedRows;
+
+    private final int gcBefore = CompactionManager.NO_GC;
+
+    private final OutputHandler outputHandler;
+
+    public Upgrader(ColumnFamilyStore cfs, SSTableReader sstable, OutputHandler outputHandler)
+    {
+        this.cfs = cfs;
+        this.sstable = sstable;
+        this.toUpgrade = Collections.singletonList(sstable);
+        this.outputHandler = outputHandler;
+
+        this.directory = new File(sstable.getFilename()).getParentFile();
+
+        this.controller = new UpgradeController(cfs);
+
+        this.strategy = cfs.getCompactionStrategy();
+        long estimatedTotalKeys = Math.max(DatabaseDescriptor.getIndexInterval(), SSTableReader.getApproximateKeyCount(toUpgrade));
+        long estimatedSSTables = Math.max(1, SSTable.getTotalBytes(this.toUpgrade) / strategy.getMaxSSTableSize());
+        this.estimatedRows = (long) Math.ceil((double) estimatedTotalKeys / estimatedSSTables);
+    }
+
+    private SSTableWriter createCompactionWriter()
+    {
+        SSTableMetadata.Collector sstableMetadataCollector = SSTableMetadata.createCollector();
+
+        // Record the generation of each sstable being upgraded, plus the
+        // generation of every ancestor whose data file still exists on disk
+        for (SSTableReader sstable : toUpgrade)
+        {
+            sstableMetadataCollector.addAncestor(sstable.descriptor.generation);
+            for (Integer i : sstable.getAncestors())
+            {
+                if (new File(sstable.descriptor.withGeneration(i).filenameFor(Component.DATA)).exists())
+                    sstableMetadataCollector.addAncestor(i);
+            }
+        }
+
+        return new SSTableWriter(cfs.getTempSSTablePath(directory), estimatedRows, cfs.metadata, cfs.partitioner, sstableMetadataCollector);
+    }
+
+    public void upgrade()
+    {
+        outputHandler.output("Upgrading " + sstable);
+
+
+        AbstractCompactionIterable ci = new CompactionIterable(compactionType, strategy.getScanners(this.toUpgrade), controller);
+
+        CloseableIterator<AbstractCompactedRow> iter = ci.iterator();
+
+        Collection<SSTableReader> sstables = new ArrayList<SSTableReader>();
+        Collection<SSTableWriter> writers = new ArrayList<SSTableWriter>();
+
+        try
+        {
+            SSTableWriter writer = createCompactionWriter();
+            writers.add(writer);
+            while (iter.hasNext())
+            {
+                AbstractCompactedRow row = iter.next();
+
+                RowIndexEntry indexEntry = writer.append(row);
+            }
+
+            long maxAge = CompactionTask.getMaxDataAge(this.toUpgrade);
+            for (SSTableWriter completedWriter : writers)
+                sstables.add(completedWriter.closeAndOpenReader(maxAge));
+
+            outputHandler.output("Upgrade of " + sstable + " complete.");
+
+        }
+        catch (Throwable t)
+        {
+            for (SSTableWriter writer : writers)
+                writer.abort();
+            // also remove already completed SSTables
+            for (SSTableReader sstable : sstables)
+            {
+                sstable.markCompacted();
+                sstable.releaseReference();
+            }
+            throw Throwables.propagate(t);
+        }
+        finally
+        {
+            controller.close();
+
+            try
+            {
+                iter.close();
+            }
+            catch (IOException e)
+            {
+                throw new RuntimeException(e);
+            }
+        }
+    }
+
+    private static class UpgradeController extends CompactionController
+    {
+        public UpgradeController(ColumnFamilyStore cfs)
+        {
+            super(cfs, Integer.MAX_VALUE);
+        }
+
+        @Override
+        public boolean shouldPurge(DecoratedKey key, long delTimestamp)
+        {
+            return false;
+        }
+    }
+}
+
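
Note on the Upgrader constructor above: the estimatedRows arithmetic is easy to misread, so here is a minimal stand-alone sketch of the same three steps. The class name, method name, and sample numbers are hypothetical and are not part of the patch; the inputs stand in for DatabaseDescriptor.getIndexInterval(), SSTableReader.getApproximateKeyCount(), SSTable.getTotalBytes() and strategy.getMaxSSTableSize().

```java
// Hypothetical stand-alone sketch; names and sample values are illustrative
// and do not come from the Cassandra code base.
class EstimatedRowsSketch
{
    static long estimatedRows(long indexInterval, long approximateKeyCount,
                              long totalBytes, long maxSSTableSize)
    {
        // Never estimate fewer keys than one index interval.
        long estimatedTotalKeys = Math.max(indexInterval, approximateKeyCount);
        // Integer division, floored at one output sstable.
        long estimatedSSTables = Math.max(1, totalBytes / maxSSTableSize);
        // Keys per output sstable, rounded up.
        return (long) Math.ceil((double) estimatedTotalKeys / estimatedSSTables);
    }

    public static void main(String[] args)
    {
        // 1000 keys spread over 500 / 100 = 5 sstables
        System.out.println(estimatedRows(128, 1000, 500, 100));
        // small input: key count is raised to the index interval, one sstable
        System.out.println(estimatedRows(128, 10, 50, 100));
    }
}
```

With these sample inputs the first call yields 200 and the second 128, which shows the two floors (index interval and a single sstable) taking effect.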

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3814af80/src/java/org/apache/cassandra/tools/StandaloneUpgrader.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/tools/StandaloneUpgrader.java b/src/java/org/apache/cassandra/tools/StandaloneUpgrader.java
new file mode 100644
index 0000000..357e99c
--- /dev/null
+++ b/src/java/org/apache/cassandra/tools/StandaloneUpgrader.java
@@ -0,0 +1,223 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.cassandra.tools;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.*;
+
+import com.google.common.base.Throwables;
+
+import org.apache.commons.cli.*;
+
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.config.Schema;
+import org.apache.cassandra.db.ColumnFamilyStore;
+import org.apache.cassandra.db.Directories;
+import org.apache.cassandra.db.Table;
+import org.apache.cassandra.db.compaction.Upgrader;
+import org.apache.cassandra.io.sstable.*;
+import org.apache.cassandra.service.CassandraDaemon;
+import org.apache.cassandra.utils.OutputHandler;
+
+import static org.apache.cassandra.tools.BulkLoader.CmdLineOptions;
+
+public class StandaloneUpgrader
+{
+    static
+    {
+        CassandraDaemon.initLog4j();
+    }
+
+    private static final String TOOL_NAME = "sstableupgrade";
+    private static final String DEBUG_OPTION  = "debug";
+    private static final String HELP_OPTION  = "help";
+
+    public static void main(String args[]) throws IOException
+    {
+        Options options = Options.parseArgs(args);
+        try
+        {
+            // load keyspace descriptions.
+            DatabaseDescriptor.loadSchemas();
+
+            if (Schema.instance.getCFMetaData(options.keyspace, options.cf) == null)
+                throw new IllegalArgumentException(String.format("Unknown keyspace/columnFamily %s.%s",
+                                                                 options.keyspace,
+                                                                 options.cf));
+
+            Table table = Table.openWithoutSSTables(options.keyspace);
+            ColumnFamilyStore cfs = table.getColumnFamilyStore(options.cf);
+
+            OutputHandler handler = new OutputHandler.SystemOutput(false, options.debug);
+            Directories.SSTableLister lister = cfs.directories.sstableLister();
+            if (options.snapshot != null)
+                lister.onlyBackups(true).snapshots(options.snapshot);
+            else
+                lister.includeBackups(false);
+
+            Collection<SSTableReader> readers = new ArrayList<SSTableReader>();
+
+            // Upgrade sstables
+            for (Map.Entry<Descriptor, Set<Component>> entry : lister.list().entrySet())
+            {
+                Set<Component> components = entry.getValue();
+                if (!components.contains(Component.DATA) || !components.contains(Component.PRIMARY_INDEX))
+                    continue;
+
+                try
+                {
+                    SSTableReader sstable = SSTableReader.openNoValidation(entry.getKey(), components, cfs.metadata);
+                    if (sstable.descriptor.version.equals(Descriptor.Version.CURRENT))
+                        continue;
+                    readers.add(sstable);
+                }
+                catch (Exception e)
+                {
+                    System.err.println(String.format("Error Loading %s: %s", entry.getKey(), e.getMessage()));
+                    if (options.debug)
+                        e.printStackTrace(System.err);
+
+                    continue;
+                }
+            }
+
+            int numSSTables = readers.size();
+            handler.output("Found " + numSSTables + " sstables that need upgrading.");
+
+            for (SSTableReader sstable : readers)
+            {
+                try
+                {
+                    Upgrader upgrader = new Upgrader(cfs, sstable, handler);
+                    upgrader.upgrade();
+
+                    sstable.markCompacted();
+                    sstable.releaseReference();
+                }
+                catch (Exception e)
+                {
+                    System.err.println(String.format("Error upgrading %s: %s", sstable, e.getMessage()));
+                    if (options.debug)
+                        e.printStackTrace(System.err);
+                }
+            }
+
+            SSTableDeletingTask.waitForDeletions();
+            System.exit(0);
+        }
+        catch (Exception e)
+        {
+            System.err.println(e.getMessage());
+            if (options.debug)
+                e.printStackTrace(System.err);
+            System.exit(1);
+        }
+    }
+
+    private static class Options
+    {
+        public final String keyspace;
+        public final String cf;
+        public final String snapshot;
+
+        public boolean debug;
+
+        private Options(String keyspace, String cf, String snapshot)
+        {
+            this.keyspace = keyspace;
+            this.cf = cf;
+            this.snapshot = snapshot;
+        }
+
+        public static Options parseArgs(String cmdArgs[])
+        {
+            CommandLineParser parser = new GnuParser();
+            CmdLineOptions options = getCmdLineOptions();
+            try
+            {
+                CommandLine cmd = parser.parse(options, cmdArgs, false);
+
+                if (cmd.hasOption(HELP_OPTION))
+                {
+                    printUsage(options);
+                    System.exit(0);
+                }
+
+                String[] args = cmd.getArgs();
+                if (args.length >= 4 || args.length < 2)
+                {
+                    String msg = args.length < 2 ? "Missing arguments" : "Too many arguments";
+                    errorMsg(msg, options);
+                    System.exit(1);
+                }
+
+                String keyspace = args[0];
+                String cf = args[1];
+                String snapshot = null;
+                if (args.length == 3)
+                    snapshot = args[2];
+
+                Options opts = new Options(keyspace, cf, snapshot);
+
+                opts.debug = cmd.hasOption(DEBUG_OPTION);
+
+                return opts;
+            }
+            catch (ParseException e)
+            {
+                errorMsg(e.getMessage(), options);
+                return null;
+            }
+        }
+
+        private static void errorMsg(String msg, CmdLineOptions options)
+        {
+            System.err.println(msg);
+            printUsage(options);
+            System.exit(1);
+        }
+
+        private static CmdLineOptions getCmdLineOptions()
+        {
+            CmdLineOptions options = new CmdLineOptions();
+            options.addOption(null, DEBUG_OPTION,          "display stack traces");
+            options.addOption("h",  HELP_OPTION,           "display this help message");
+            return options;
+        }
+
+        public static void printUsage(CmdLineOptions options)
+        {
+            String usage = String.format("%s [options] <keyspace> <cf> [snapshot]", TOOL_NAME);
+            StringBuilder header = new StringBuilder();
+            header.append("--\n");
+            header.append("Upgrade the sstables in the given cf (or snapshot) to the current version of Cassandra. " );
+            header.append("This operation will rewrite the sstables in the specified cf to match the " );
+            header.append("currently installed version of Cassandra.\n");
+            header.append("The snapshot option will only upgrade the specified snapshot. Upgrading " );
+            header.append("snapshots is required before attempting to restore a snapshot taken in a " );
+            header.append("major version older than the major version Cassandra is currently running. " );
+            header.append("This will replace the files in the given snapshot as well as break any " );
+            header.append("hard links to live sstables." );
+            header.append("\n--\n");
+            header.append("Options are:");
+            new HelpFormatter().printHelp(usage, header.toString(), options, "");
+        }
+    }
+}
+
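
Note on Options.parseArgs above: the positional-argument rule (two or three arguments, the third being an optional snapshot name) can be sketched in isolation as follows. This is an illustration only. The class and method names are hypothetical, and the sketch throws IllegalArgumentException where the patch prints usage and calls System.exit(1), so that the rule can be exercised without terminating the JVM.

```java
// Hypothetical stand-alone sketch of the positional-argument check;
// not part of the patch.
class UpgraderArgsSketch
{
    /** Returns {keyspace, cf, snapshot}; snapshot is null when omitted. */
    static String[] parsePositional(String[] args)
    {
        if (args.length < 2)
            throw new IllegalArgumentException("Missing arguments");
        if (args.length >= 4)
            throw new IllegalArgumentException("Too many arguments");

        // The third positional argument, if present, names a snapshot.
        String snapshot = args.length == 3 ? args[2] : null;
        return new String[]{ args[0], args[1], snapshot };
    }

    public static void main(String[] args)
    {
        String[] parsed = parsePositional(new String[]{ "Keyspace1", "Standard1" });
        System.out.println(parsed[0] + "." + parsed[1] + " snapshot=" + parsed[2]);
    }
}
```

The error messages match the two strings used in the patch; everything else about error handling here is an assumption made for the sake of a testable example.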