You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Lanny Ripple (Commented) (JIRA)" <ji...@apache.org> on 2011/11/03 22:37:32 UTC

[jira] [Commented] (CASSANDRA-3450) maybeInit in ColumnFamilyRecordReader can cause rows to be empty but not null

    [ https://issues.apache.org/jira/browse/CASSANDRA-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143547#comment-13143547 ] 

Lanny Ripple commented on CASSANDRA-3450:
-----------------------------------------

Don't have the time to work this up formally but here's a suggested fix.

{noformat}

diff --git a/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordReader.java b/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordReader.java
index 7b4b0d6..13e1fee 100644
--- a/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordReader.java
+++ b/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordReader.java
@@ -234,51 +234,57 @@ public class ColumnFamilyRecordReader extends RecordReader<ByteBuffer, SortedMap
             if (startToken == null)
             {
                 startToken = split.getStartToken();
-            } 
-            else if (startToken.equals(split.getEndToken()))
-            {
-                rows = null;
-                return;
             }
-            
-            KeyRange keyRange = new KeyRange(batchRowCount)
-                                .setStart_token(startToken)
-                                .setEnd_token(split.getEndToken());
-            try
+
+            // The removal of empty CF rows could result in an empty List<KeySlice> rows.
+            // Keep trying until we return on reaching the end of the range or rows is nonEmpty.
+            while (rows == null || rows.isEmpty)
             {
-                rows = client.get_range_slices(new ColumnParent(cfName),
-                                               predicate,
-                                               keyRange,
-                                               consistencyLevel);
-                  
-                // nothing new? reached the end
-                if (rows.isEmpty())
+                if (startToken.equals(split.getEndToken()))
                 {
                     rows = null;
                     return;
                 }
 
-                // Pre-compute the last row key, before removing empty rows
-                ByteBuffer lastRowKey = rows.get(rows.size() - 1).key;
+                KeyRange keyRange = new KeyRange(batchRowCount)
+                                    .setStart_token(startToken)
+                                    .setEnd_token(split.getEndToken());
+                try
+                {
+                    rows = client.get_range_slices(new ColumnParent(cfName),
+                                                   predicate,
+                                                   keyRange,
+                                                   consistencyLevel);
+
+                    // nothing new? reached the end
+                    if (rows.isEmpty())
+                    {
+                        rows = null;
+                        return;
+                    }
+
+                    // Pre-compute the last row key, before removing empty rows
+                    ByteBuffer lastRowKey = rows.get(rows.size() - 1).key;
+
+                    // only remove empty rows if the slice predicate is empty
+                    if (isPredicateEmpty(predicate))
+                    {
+                        Iterator<KeySlice> rowsIterator = rows.iterator();
+                        while (rowsIterator.hasNext())
+                            if (rowsIterator.next().columns.isEmpty())
+                                rowsIterator.remove();
+                    }
 
-                // only remove empty rows if the slice predicate is empty
-                if (isPredicateEmpty(predicate))
+                    // reset to iterate through the new batch
+                    i = 0;
+
+                    // prepare for the next slice to be read
+                    startToken = partitioner.getTokenFactory().toString(partitioner.getToken(lastRowKey));
+                }
+                catch (Exception e)
                 {
-                    Iterator<KeySlice> rowsIterator = rows.iterator();
-                    while (rowsIterator.hasNext())
-                        if (rowsIterator.next().columns.isEmpty())
-                            rowsIterator.remove();
+                    throw new RuntimeException(e);
                 }
-                
-                // reset to iterate through the new batch
-                i = 0;
-
-                // prepare for the next slice to be read
-                startToken = partitioner.getTokenFactory().toString(partitioner.getToken(lastRowKey));
-            }
-            catch (Exception e)
-            {
-                throw new RuntimeException(e);
             }
         }
 
{noformat}
                
> maybeInit in ColumnFamilyRecordReader can cause rows to be empty but not null
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3450
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3450
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.0.1
>            Reporter: Lanny Ripple
>            Priority: Minor
>
> In {{ColumnFamilyRecordReader}} {{computeNext()}} calls {{maybeInit()}} and then if {{rows}} is not null it is indexed into.  {{maybeInit()}} could fetch new data, determine the associated slice predicate is empty, and end up removing all the rows if all columns turned out to be empty.  There is no check for {{rows.isEmpty()}} after the possible removal of all rows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira