You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@cassandra.apache.org by GitBox <gi...@apache.org> on 2023/01/04 16:37:49 UTC

[GitHub] [cassandra] jacek-lewandowski opened a new pull request, #2065: Cassandra 18134

jacek-lewandowski opened a new pull request, #2065:
URL: https://github.com/apache/cassandra/pull/2065

   Thanks for sending a pull request! Here are some tips if you're new here:
    
    * Ensure you have added or run the [appropriate tests](https://cassandra.apache.org/_/development/testing.html) for your PR.
    * Be sure to keep the PR description updated to reflect all changes.
    * Write your PR title to summarize what this PR proposes.
    * If possible, provide a concise example to reproduce the issue for a faster review.
    * Read our [contributor guidelines](https://cassandra.apache.org/_/development/index.html)
    * If you're making a documentation change, see our [guide to documentation contribution](https://cassandra.apache.org/_/development/documentation.html)
    
   Commit messages should follow the following format:
   
   ```
   <One sentence description, usually Jira title or CHANGES.txt summary>
   
   <Optional lengthier description (context on patch)>
   
   patch by <Authors>; reviewed by <Reviewers> for CASSANDRA-#####
   
   Co-authored-by: Name1 <email1>
   Co-authored-by: Name2 <email2>
   
   ```
   
   The [Cassandra Jira](https://issues.apache.org/jira/projects/CASSANDRA/issues/)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] smiklosovic closed pull request #2065: CASSANDRA-18134: More accurate skipping of sstables in read path

Posted by "smiklosovic (via GitHub)" <gi...@apache.org>.
smiklosovic closed pull request #2065: CASSANDRA-18134: More accurate skipping of sstables in read path
URL: https://github.com/apache/cassandra/pull/2065


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] blambov commented on a diff in pull request #2065: Cassandra 18134

Posted by GitBox <gi...@apache.org>.
blambov commented on code in PR #2065:
URL: https://github.com/apache/cassandra/pull/2065#discussion_r1063458672


##########
src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java:
##########
@@ -163,12 +167,14 @@ static class BigVersion extends Version
             hasCommitLogLowerBound = version.compareTo("mb") >= 0;
             hasCommitLogIntervals = version.compareTo("mc") >= 0;
             hasAccurateMinMax = version.compareTo("md") >= 0;
-            hasOriginatingHostId = version.matches("(m[e-z])|(n[b-z])");
+            hasOriginatingHostId = version.matches("(m[e-z])|(n[b-z])|(o[a-z])");

Review Comment:
   This would be easy to forget to update. How about `version.compareTo("nb") >= 0 || version.matches("m[e-z]")`?



##########
src/java/org/apache/cassandra/db/Slice.java:
##########
@@ -229,20 +237,24 @@ public Slice forPaging(ClusteringComparator comparator, Clustering<?> lastReturn
     }
 
     /**
-     * Given the per-clustering column minimum and maximum value a sstable contains, whether or not this slice potentially
-     * intersects that sstable or not.
+     * Whether this slice and the provided slice intersects.
      *
      * @param comparator the comparator for the table this is a slice of.
-     * @param minClusteringValues the smallest values for each clustering column that a sstable contains.
-     * @param maxClusteringValues the biggest values for each clustering column that a sstable contains.
-     *
-     * @return whether the slice might intersects with the sstable having {@code minClusteringValues} and
-     * {@code maxClusteringValues}.
+     * @param other      the other slice to check intersection with.
+     * @return whether this slice intersects {@code other}.
      */
-    public boolean intersects(ClusteringComparator comparator, List<ByteBuffer> minClusteringValues, List<ByteBuffer> maxClusteringValues)
+    public boolean intersects(ClusteringComparator comparator, Slice other)
     {
-        // If this slice starts after max clustering or ends before min clustering, it can't intersect
-        return start.compareTo(comparator, maxClusteringValues) <= 0 && end.compareTo(comparator, minClusteringValues) >= 0;
+        // Empty slices never intersect anything (and we have to special case it as there is many ways to build an
+        // empty slice; for instance, without this, (0, 0) would intersect Slice.ALL or [-1, 1]).
+        if (isEmpty(comparator) || other.isEmpty(comparator))

Review Comment:
   A simpler implementation of intersection when empty slices can occur is to take the intersection and check if it is empty:
   ```
           return comparator.compare(Comparables.max(start, other.start, comparator), 
                                     Comparables.min(end, other.end, comparator)) < 0;
   ```
   This saves one comparison and is no less readable to me.



##########
src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java:
##########
@@ -259,11 +278,52 @@ public MetadataCollector sstableLevel(int sstableLevel)
         return this;
     }
 
-    public MetadataCollector updateClusteringValues(ClusteringPrefix<?> clustering)
+    public void updateClusteringValues(Clustering<?> clustering)
     {
-        minClustering = minClustering == null || comparator.compare(clustering, minClustering) < 0 ? clustering : minClustering;
-        maxClustering = maxClustering == null || comparator.compare(clustering, maxClustering) > 0 ? clustering : maxClustering;
-        return this;
+        if (clustering == Clustering.STATIC_CLUSTERING)
+            return;
+
+        if (!clusteringInitialized)

Review Comment:
   Does this special case exist only to make sure we don't call `retainable` twice? Per the comment below, this is not necessary, and `MAX_START` and `MIN_END` will be greater/smaller than any valid clustering prefix -- this code, its version below and the variable can be removed altogether.



##########
src/java/org/apache/cassandra/db/Slices.java:
##########
@@ -243,17 +249,12 @@ private List<Slice> normalize(List<Slice> slices)
             if (slices.size() <= 1)
                 return slices;
 
-            Collections.sort(slices, new Comparator<Slice>()
-            {
-                @Override
-                public int compare(Slice s1, Slice s2)
-                {
-                    int c = comparator.compare(s1.start(), s2.start());
-                    if (c != 0)
-                        return c;
+            slices.sort((s1, s2) -> {
+                int c = comparator.compare(s1.start(), s2.start());
+                if (c != 0)
+                    return c;
 
-                    return comparator.compare(s1.end(), s2.end());
-                }
+                return comparator.compare(s1.end(), s2.end());

Review Comment:
   Reversing the direction here would be a tad more efficient (it would avoid going through `Slice.make` on 283 below).



##########
src/java/org/apache/cassandra/db/rows/UnfilteredRowIteratorWithLowerBound.java:
##########
@@ -186,7 +193,7 @@ private static <V> ClusteringBound<V> createArtificialLowerBound(boolean isRever
     /**
      * @return the lower bound stored on the index entry for this partition, if available.
      */
-    private ClusteringBound<?> getPartitionIndexLowerBound()
+    private ClusteringBound<?> getKeyCacheLowerBound()
     {
         // NOTE: CASSANDRA-11206 removed the lookup against the key-cache as the IndexInfo objects are no longer

Review Comment:
   Isn't this note invalid?
   
   Also, I don't think we want to load the partition if `canUseMetadataLowerBound` is false -- we can just forego the bound in that case.



##########
src/java/org/apache/cassandra/db/ClusteringBound.java:
##########
@@ -78,7 +85,7 @@ default int compareTo(ClusteringComparator comparator, List<ByteBuffer> sstableB
             // Say the slice bound is a start. It means we're in the case where the max
             // sstable bound is say (1:5) while the slice start is (1). So the start
             // does start before the sstable end bound (and intersect it). It's the exact
-            // inverse with a end slice bound.
+            // inverse with an end slice bound.

Review Comment:
   This method is no longer needed.



##########
src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java:
##########
@@ -259,11 +278,52 @@ public MetadataCollector sstableLevel(int sstableLevel)
         return this;
     }
 
-    public MetadataCollector updateClusteringValues(ClusteringPrefix<?> clustering)
+    public void updateClusteringValues(Clustering<?> clustering)
     {
-        minClustering = minClustering == null || comparator.compare(clustering, minClustering) < 0 ? clustering : minClustering;
-        maxClustering = maxClustering == null || comparator.compare(clustering, maxClustering) > 0 ? clustering : maxClustering;
-        return this;
+        if (clustering == Clustering.STATIC_CLUSTERING)
+            return;
+
+        if (!clusteringInitialized)
+        {
+            clusteringInitialized = true;
+            minClustering = clustering.retainable();
+            maxClustering = minClustering;
+        }
+        else if (comparator.compare((ClusteringPrefix<?>) clustering, (ClusteringPrefix<?>) maxClustering) > 0)
+        {
+            maxClustering = clustering.retainable();

Review Comment:
   Why not call `retainable` in `finalizeMetadata` instead of doing it repeatedly here?



##########
src/java/org/apache/cassandra/db/Slice.java:
##########
@@ -164,8 +174,7 @@ else if (cmp > 0)
      * Returns whether a given clustering or bound is included in this slice.

Review Comment:
   Please replace the above with just
   ```
   assert start.isStart() && end.isEnd();
   return comparator.compare(start, end) >= 0;
   // Note: the comparator orders inclusive starts and exclusive ends as equal, and inclusive ends as being greater than starts.
   ```



##########
src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java:
##########
@@ -712,42 +712,57 @@ private UnfilteredRowIterator queryMemtableAndDiskInternal(ColumnFamilyStore cfs
                     break;
                 }
 
-                if (shouldInclude(sstable))
+                boolean intersects = intersects(sstable);
+                boolean hasRequiredStatics = hasRequiredStatics(sstable);
+                boolean hasPartitionLevelDeletions = hasPartitionLevelDeletions(sstable);
+
+                if (!intersects && !hasRequiredStatics && !hasPartitionLevelDeletions)
+                {
+                    nonIntersectingSSTables++;
+                    continue;
+                }
+
+                if (intersects || hasRequiredStatics)
                 {
                     if (!sstable.isRepaired())
                         controller.updateMinOldestUnrepairedTombstone(sstable.getMinLocalDeletionTime());
 
                     // 'iter' is added to iterators which is closed on exception, or through the closing of the final merged iterator
                     @SuppressWarnings("resource")
-                    UnfilteredRowIteratorWithLowerBound iter = makeIterator(cfs, sstable, metricsCollector);
+                    UnfilteredRowIterator iter = intersects ? makeRowIteratorWithLowerBound(cfs, sstable, metricsCollector)
+                                                            : makeRowIteratorWithSkippedNonStaticContent(cfs, sstable, metricsCollector);
+
                     inputCollector.addSSTableIterator(sstable, iter);
                     mostRecentPartitionTombstone = Math.max(mostRecentPartitionTombstone,
                                                             iter.partitionLevelDeletion().markedForDeleteAt());
                 }
                 else
                 {
                     nonIntersectingSSTables++;
-                    // sstable contains no tombstone if maxLocalDeletionTime == Integer.MAX_VALUE, so we can safely skip those entirely
-                    if (sstable.mayHaveTombstones())
+
+                    // if the sstable contained range or cell tombstones, it would intersect; since we are here, it means
+                    // that there are no cell or range tombstones we are interested in (due to the filter)
+                    // however, we know that there are partition level deletions in this sstable and we need to make
+                    // an iterator figure out that (see `StatsMetadata.hasPartitionLevelDeletions`)
+
+                    // 'iter' is added to iterators which is closed on exception, or through the closing of the final merged iterator
+                    @SuppressWarnings("resource")
+                    UnfilteredRowIterator iter = makeRowIteratorWithLowerBound(cfs, sstable, metricsCollector);

Review Comment:
   Would `WithSkippedNonStaticContent` not be preferable?



##########
src/java/org/apache/cassandra/io/sstable/metadata/StatsMetadata.java:
##########
@@ -453,29 +478,30 @@ public StatsMetadata deserialize(Version version, DataInputPlus in) throws IOExc
             int sstableLevel = in.readInt();
             long repairedAt = in.readLong();
 
+            List<AbstractType<?>> clusteringTypes = null;
+            Slice coveredClustering = Slice.ALL;
             if (version.hasImprovedMinMax())
             {
-                throw new UnsupportedEncodingException();
+                clusteringTypes = typeSerializer.deserializeList(in);
+                coveredClustering = Slice.serializer.deserialize(in, version.correspondingMessagingVersion(), clusteringTypes);
             }
-
-            // for legacy sstables, we skip deserializing the min and max clustering value
-            // to prevent erroneously excluding sstables from reads (see CASSANDRA-14861)
-            int colCount = in.readInt();
-            List<ByteBuffer> minClusteringValues = new ArrayList<>(colCount);
-            for (int i = 0; i < colCount; i++)
+            else
             {
-                ByteBuffer val = ByteBufferUtil.readWithShortLength(in);
-                if (version.hasAccurateMinMax())
-                    minClusteringValues.add(val);
-            }
+                // for legacy sstables, we skip deserializing the min and max clustering value

Review Comment:
   I prefer the DSE version of the comment:
   ```
                   // We always deserialize the min/max clustering values if they are there, but we ignore them for
                   // legacy sstables where !hasAccurateMinMax due to CASSANDRA-14861.
   ```



##########
src/java/org/apache/cassandra/db/Slice.java:
##########
@@ -106,6 +109,13 @@ public static Slice make(Clustering<?> start, Clustering<?> end)
         return new Slice(ClusteringBound.inclusiveStartOf(start), ClusteringBound.inclusiveEndOf(end));
     }
 
+    public static Slice make(ClusteringPrefix<?> start, ClusteringPrefix<?> end)
+    {
+        // This doesn't give us what we want with the clustering prefix
+        assert start != Clustering.STATIC_CLUSTERING && end != Clustering.STATIC_CLUSTERING;
+        return make(start.asStartBound(), end.asEndBound());

Review Comment:
   The version above is no longer needed.
   
   However, this (and the version above) does not match
   ```
    * A slice selects every row whose clustering is bigger than the slice start prefix but smaller
    * than the end prefix.
    ```
   and makes me worry that `asStartBound` and `asEndBound` without an inclusiveness parameter are dangerous. The javadoc of all three methods should state very loudly that this is an inclusive conversion for clusterings.



##########
src/java/org/apache/cassandra/db/rows/UnfilteredRowIteratorWithLowerBound.java:
##########
@@ -245,21 +252,32 @@ private ClusteringBound<?> getPartitionIndexLowerBound()
      */
     private boolean canUseMetadataLowerBound()
     {
-        // Side-note: pre-2.1 sstable stat file had clustering value arrays whose size may not match the comparator size
-        // and that would break getMetadataLowerBound. We don't support upgrade from 2.0 to 3.0 directly however so it's
-        // not a true concern. Besides, !sstable.mayHaveTombstones already ensure this is a 3.0 sstable anyway.
-        return !sstable.mayHaveTombstones() && !sstable.metadata().isCompactTable();
+        if (sstable.metadata().isCompactTable())

Review Comment:
   The comment on top of this method does not appear to be valid.



##########
src/java/org/apache/cassandra/db/rows/UnfilteredRowIteratorWithLowerBound.java:
##########
@@ -245,21 +252,32 @@ private ClusteringBound<?> getPartitionIndexLowerBound()
      */
     private boolean canUseMetadataLowerBound()
     {
-        // Side-note: pre-2.1 sstable stat file had clustering value arrays whose size may not match the comparator size
-        // and that would break getMetadataLowerBound. We don't support upgrade from 2.0 to 3.0 directly however so it's
-        // not a true concern. Besides, !sstable.mayHaveTombstones already ensure this is a 3.0 sstable anyway.
-        return !sstable.mayHaveTombstones() && !sstable.metadata().isCompactTable();
+        if (sstable.metadata().isCompactTable())
+            return false;
+
+        Slices requestedSlices = slices;
+
+        if (requestedSlices.isEmpty())
+            return true;
+
+        if (!isReverseOrder())

Review Comment:
   We need a comment here that if the requested slice falls within the covered slice, there's no need to delay producing data.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] blambov commented on a diff in pull request #2065: Cassandra 18134

Posted by GitBox <gi...@apache.org>.
blambov commented on code in PR #2065:
URL: https://github.com/apache/cassandra/pull/2065#discussion_r1067035265


##########
src/java/org/apache/cassandra/db/rows/UnfilteredRowIteratorWithLowerBound.java:
##########
@@ -185,70 +183,8 @@ public Row staticRow()
         return super.staticRow();
     }
 
-    private static <V> ClusteringBound<V> createArtificialLowerBound(boolean isReversed, ClusteringPrefix<V> from)
-    {
-        return from.accessor().factory().inclusiveOpen(isReversed, from.getRawValues()).artificialLowerBound(isReversed);
-    }
-
-    /**
-     * @return the lower bound stored on the index entry for this partition, if available.
-     */
-    private ClusteringBound<?> getKeyCacheLowerBound()
-    {
-        // NOTE: CASSANDRA-11206 removed the lookup against the key-cache as the IndexInfo objects are no longer
-        // in memory for not heap backed IndexInfo objects (so, these are on disk).
-        // CASSANDRA-11369 is there to fix this afterwards.
-
-        // Creating the iterator ensures that rowIndexEntry is loaded if available (partitions bigger than
-        // DatabaseDescriptor.column_index_size)
-        if (!canUseMetadataLowerBound())
-            maybeInit();
-
-        RowIndexEntry rowIndexEntry = sstable.getCachedPosition(partitionKey(), false);

Review Comment:
   I think I have misled you here, checking the key cache still makes sense.
   
   What I meant is that we do not want to initialize the iterator so that we can check the key cache to decide that we did not want to initialize the iterator. Also, calling `canUseMetadataLowerBound()` here is a bad idea in its own right.
   
   I would remove the lines above this in the method, but still try it first in `lowerBound()`.



##########
src/java/org/apache/cassandra/db/Slice.java:
##########
@@ -106,6 +109,13 @@ public static Slice make(Clustering<?> start, Clustering<?> end)
         return new Slice(ClusteringBound.inclusiveStartOf(start), ClusteringBound.inclusiveEndOf(end));
     }
 
+    public static Slice make(ClusteringPrefix<?> start, ClusteringPrefix<?> end)
+    {
+        // This doesn't give us what we want with the clustering prefix
+        assert start != Clustering.STATIC_CLUSTERING && end != Clustering.STATIC_CLUSTERING;
+        return make(start.asStartBound(), end.asEndBound());

Review Comment:
   Could you add an inclusiveness comment to `asStartBound` and `asEndBound` too?



##########
src/java/org/apache/cassandra/db/Slice.java:
##########
@@ -252,16 +253,8 @@ public Slice forPaging(ClusteringComparator comparator, Clustering<?> lastReturn
      */
     public boolean intersects(ClusteringComparator comparator, Slice other)
     {
-        // Empty slices never intersect anything (and we have to special case it as there is many ways to build an
-        // empty slice; for instance, without this, (0, 0) would intersect Slice.ALL or [-1, 1]).
-        if (isEmpty(comparator) || other.isEmpty(comparator))
-            return false;
-
-        // Otherwise, the slice intersects if it contains more than just their boundaries. That is, the comparison
-        // below needs to be strict, because for instance, a=[0, 3] and b=(3, 5] do not intersect, yet the end of `a`
-        // is equal to end start of `b` as far as `ClusteringPrefix.Kind#compare` goes (see the javadoc on that method
-        // for why that is).
-        return comparator.compare(start, other.end) < 0 && comparator.compare(end, other.start) > 0;
+        return comparator.compare(Comparables.max(start, other.start, comparator),

Review Comment:
   I think there should be a small comment why this works, e.g.
   ```
   // Construct the intersection of the two slices and check if it is non-empty.
   // This also works correctly when one or more of the inputs can be empty (i.e. with end <= start).
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] jacek-lewandowski commented on a diff in pull request #2065: Cassandra 18134

Posted by GitBox <gi...@apache.org>.
jacek-lewandowski commented on code in PR #2065:
URL: https://github.com/apache/cassandra/pull/2065#discussion_r1065748563


##########
src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java:
##########
@@ -712,42 +712,57 @@ private UnfilteredRowIterator queryMemtableAndDiskInternal(ColumnFamilyStore cfs
                     break;
                 }
 
-                if (shouldInclude(sstable))
+                boolean intersects = intersects(sstable);
+                boolean hasRequiredStatics = hasRequiredStatics(sstable);
+                boolean hasPartitionLevelDeletions = hasPartitionLevelDeletions(sstable);
+
+                if (!intersects && !hasRequiredStatics && !hasPartitionLevelDeletions)
+                {
+                    nonIntersectingSSTables++;
+                    continue;
+                }
+
+                if (intersects || hasRequiredStatics)
                 {
                     if (!sstable.isRepaired())
                         controller.updateMinOldestUnrepairedTombstone(sstable.getMinLocalDeletionTime());
 
                     // 'iter' is added to iterators which is closed on exception, or through the closing of the final merged iterator
                     @SuppressWarnings("resource")
-                    UnfilteredRowIteratorWithLowerBound iter = makeIterator(cfs, sstable, metricsCollector);
+                    UnfilteredRowIterator iter = intersects ? makeRowIteratorWithLowerBound(cfs, sstable, metricsCollector)
+                                                            : makeRowIteratorWithSkippedNonStaticContent(cfs, sstable, metricsCollector);
+
                     inputCollector.addSSTableIterator(sstable, iter);
                     mostRecentPartitionTombstone = Math.max(mostRecentPartitionTombstone,
                                                             iter.partitionLevelDeletion().markedForDeleteAt());
                 }
                 else
                 {
                     nonIntersectingSSTables++;
-                    // sstable contains no tombstone if maxLocalDeletionTime == Integer.MAX_VALUE, so we can safely skip those entirely
-                    if (sstable.mayHaveTombstones())
+
+                    // if the sstable contained range or cell tombstones, it would intersect; since we are here, it means
+                    // that there are no cell or range tombstones we are interested in (due to the filter)
+                    // however, we know that there are partition level deletions in this sstable and we need to make
+                    // an iterator figure out that (see `StatsMetadata.hasPartitionLevelDeletions`)
+
+                    // 'iter' is added to iterators which is closed on exception, or through the closing of the final merged iterator
+                    @SuppressWarnings("resource")
+                    UnfilteredRowIterator iter = makeRowIteratorWithLowerBound(cfs, sstable, metricsCollector);

Review Comment:
   🤦🏻‍♂️ indeed
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] jacek-lewandowski commented on a diff in pull request #2065: Cassandra 18134

Posted by GitBox <gi...@apache.org>.
jacek-lewandowski commented on code in PR #2065:
URL: https://github.com/apache/cassandra/pull/2065#discussion_r1065579876


##########
src/java/org/apache/cassandra/db/Slice.java:
##########
@@ -229,20 +237,24 @@ public Slice forPaging(ClusteringComparator comparator, Clustering<?> lastReturn
     }
 
     /**
-     * Given the per-clustering column minimum and maximum value a sstable contains, whether or not this slice potentially
-     * intersects that sstable or not.
+     * Whether this slice and the provided slice intersects.
      *
      * @param comparator the comparator for the table this is a slice of.
-     * @param minClusteringValues the smallest values for each clustering column that a sstable contains.
-     * @param maxClusteringValues the biggest values for each clustering column that a sstable contains.
-     *
-     * @return whether the slice might intersects with the sstable having {@code minClusteringValues} and
-     * {@code maxClusteringValues}.
+     * @param other      the other slice to check intersection with.
+     * @return whether this slice intersects {@code other}.
      */
-    public boolean intersects(ClusteringComparator comparator, List<ByteBuffer> minClusteringValues, List<ByteBuffer> maxClusteringValues)
+    public boolean intersects(ClusteringComparator comparator, Slice other)
     {
-        // If this slice starts after max clustering or ends before min clustering, it can't intersect
-        return start.compareTo(comparator, maxClusteringValues) <= 0 && end.compareTo(comparator, minClusteringValues) >= 0;
+        // Empty slices never intersect anything (and we have to special case it as there is many ways to build an
+        // empty slice; for instance, without this, (0, 0) would intersect Slice.ALL or [-1, 1]).
+        if (isEmpty(comparator) || other.isEmpty(comparator))

Review Comment:
   Actually... In this case we need to do 3 comparisons in  each case. In the current implementation we do 1 or 2 comparisons when one slice is empty and 4 (indeed) when we have no empty slices. Alternatively, we could associate an emptiness flag with a slice upon creation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] blambov commented on a diff in pull request #2065: Cassandra 18134

Posted by GitBox <gi...@apache.org>.
blambov commented on code in PR #2065:
URL: https://github.com/apache/cassandra/pull/2065#discussion_r1067855935


##########
src/java/org/apache/cassandra/db/rows/UnfilteredRowIteratorWithLowerBound.java:
##########
@@ -186,7 +193,7 @@ private static <V> ClusteringBound<V> createArtificialLowerBound(boolean isRever
     /**
      * @return the lower bound stored on the index entry for this partition, if available.
      */
-    private ClusteringBound<?> getPartitionIndexLowerBound()
+    private ClusteringBound<?> getKeyCacheLowerBound()
     {
         // NOTE: CASSANDRA-11206 removed the lookup against the key-cache as the IndexInfo objects are no longer

Review Comment:
   I believe this comment is not correct (row indexes can also reside on the heap, if they are small enough, and the `rowIndexEntry.indexOnHeap()` call below checks this).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] jacek-lewandowski commented on a diff in pull request #2065: Cassandra 18134

Posted by GitBox <gi...@apache.org>.
jacek-lewandowski commented on code in PR #2065:
URL: https://github.com/apache/cassandra/pull/2065#discussion_r1067793853


##########
src/java/org/apache/cassandra/db/rows/UnfilteredRowIteratorWithLowerBound.java:
##########
@@ -185,70 +183,8 @@ public Row staticRow()
         return super.staticRow();
     }
 
-    private static <V> ClusteringBound<V> createArtificialLowerBound(boolean isReversed, ClusteringPrefix<V> from)
-    {
-        return from.accessor().factory().inclusiveOpen(isReversed, from.getRawValues()).artificialLowerBound(isReversed);
-    }
-
-    /**
-     * @return the lower bound stored on the index entry for this partition, if available.
-     */
-    private ClusteringBound<?> getKeyCacheLowerBound()
-    {
-        // NOTE: CASSANDRA-11206 removed the lookup against the key-cache as the IndexInfo objects are no longer
-        // in memory for not heap backed IndexInfo objects (so, these are on disk).
-        // CASSANDRA-11369 is there to fix this afterwards.
-
-        // Creating the iterator ensures that rowIndexEntry is loaded if available (partitions bigger than
-        // DatabaseDescriptor.column_index_size)
-        if (!canUseMetadataLowerBound())
-            maybeInit();
-
-        RowIndexEntry rowIndexEntry = sstable.getCachedPosition(partitionKey(), false);

Review Comment:
   Not you, the comment:
   ```
           // Creating the iterator ensures that rowIndexEntry is loaded if available (partitions bigger than
           // DatabaseDescriptor.column_index_size)
           if (!canUseMetadataLowerBound())
               maybeInit();
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org