You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@cassandra.apache.org by GitBox <gi...@apache.org> on 2021/07/02 22:14:05 UTC

[GitHub] [cassandra] maedhroz opened a new pull request #1098: modify SecondaryIndexManager#indexPartition() to retrieve only columns for which indexes are actually being built

maedhroz opened a new pull request #1098:
URL: https://github.com/apache/cassandra/pull/1098


   patch by Caleb Rackliffe for CASSANDRA-16776


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #1098: modify SecondaryIndexManager#indexPartition() to retrieve only columns for which indexes are actually being built

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #1098:
URL: https://github.com/apache/cassandra/pull/1098#discussion_r683633618



##########
File path: src/java/org/apache/cassandra/index/internal/CollatedViewIndexBuilder.java
##########
@@ -55,29 +57,57 @@ public CollatedViewIndexBuilder(ColumnFamilyStore cfs, Set<Index> indexers, Redu
     public CompactionInfo getCompactionInfo()
     {
         return new CompactionInfo(cfs.metadata(),
-                OperationType.INDEX_BUILD,
-                iter.getBytesRead(),
-                iter.getTotalBytes(),
-                compactionId,
-                sstables);
+                                  OperationType.INDEX_BUILD,
+                                  iter.getBytesRead(),
+                                  iter.getTotalBytes(),
+                                  compactionId,
+                                  sstables);
     }
 
     public void build()
     {
         try
         {
             int pageSize = cfs.indexManager.calculateIndexingPageSize();
+            RegularAndStaticColumns targetPartitionColumns = extractIndexedColumns();
+            
             while (iter.hasNext())
             {
                 if (isStopRequested())
                     throw new CompactionInterruptedException(getCompactionInfo());
                 DecoratedKey key = iter.next();
-                cfs.indexManager.indexPartition(key, indexers, pageSize);
+                cfs.indexManager.indexPartition(key, indexers, pageSize, targetPartitionColumns);
             }
         }
         finally
         {
             iter.close();
         }
     }
+
+    private RegularAndStaticColumns extractIndexedColumns()
+    {
+        RegularAndStaticColumns.Builder builder = RegularAndStaticColumns.builder();
+        
+        for (Index index : indexers)
+        {
+            boolean isPartitionIndex = true;
+            
+            for (ColumnMetadata column : cfs.metadata().regularAndStaticColumns())
+            {
+                if (index.dependsOn(column))
+                {
+                    builder.add(column);
+                    isPartitionIndex = false;
+                }
+
+                // if any index declares no dependency on any column, it is a full partition index
+                // so we can use the base partition columns as the input source
+                if (isPartitionIndex)

Review comment:
       Fixed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] alex-ninja commented on a change in pull request #1098: modify SecondaryIndexManager#indexPartition() to retrieve only columns for which indexes are actually being built

Posted by GitBox <gi...@apache.org>.
alex-ninja commented on a change in pull request #1098:
URL: https://github.com/apache/cassandra/pull/1098#discussion_r667220087



##########
File path: test/memory/org/apache/cassandra/db/compaction/CompactionAllocationTest.java
##########
@@ -764,4 +800,135 @@ public void widePartitionsOverlappingRows3() throws Throwable
     {
         testWidePartitions("widePartitionsOverlappingRows3", 3, maybeInflate(24), true, true);
     }
+
+
+    private static void testIndexingWidePartitions(String name,
+                                                   int numSSTable,
+                                                   int sstablePartitions,
+                                                   IndexDef...indexes) throws Throwable
+    {
+        String ksname = "ks_" + name.toLowerCase();
+        SchemaLoader.createKeyspace(ksname, KeyspaceParams.simple(1),
+                CreateTableStatement.parse("CREATE TABLE tbl (k text, c text, v1 text, v2 text, v3 text, v4 text, PRIMARY KEY (k, c))", ksname).build());
+
+        ColumnFamilyStore cfs = Schema.instance.getColumnFamilyStoreInstance(Schema.instance.getTableMetadata(ksname, "tbl").id);
+        Assert.assertNotNull(cfs);
+        cfs.disableAutoCompaction();
+        int rowWidth = 100;
+        int rowsPerPartition = 1000;
+
+        measure(new Workload()
+        {
+            @SuppressWarnings("UnstableApiUsage")
+            public void setup()
+            {
+                cfs.disableAutoCompaction();
+                String insert = String.format("INSERT INTO %s.%s (k, c, v1, v2, v3, v4) VALUES (?, ?, ?, ?, ?, ?)", ksname, "tbl");
+                for (int f=0; f<numSSTable; f++)
+                {
+                    for (int p = 0; p < sstablePartitions; p++)
+                    {
+                        String key = String.format("%08d", (f * sstablePartitions) + p);
+                        for (int r=0; r<rowsPerPartition; r++)

Review comment:
       nit: missing spaces

##########
File path: test/memory/org/apache/cassandra/db/compaction/CompactionAllocationTest.java
##########
@@ -764,4 +800,135 @@ public void widePartitionsOverlappingRows3() throws Throwable
     {
         testWidePartitions("widePartitionsOverlappingRows3", 3, maybeInflate(24), true, true);
     }
+
+
+    private static void testIndexingWidePartitions(String name,
+                                                   int numSSTable,
+                                                   int sstablePartitions,
+                                                   IndexDef...indexes) throws Throwable
+    {
+        String ksname = "ks_" + name.toLowerCase();
+        SchemaLoader.createKeyspace(ksname, KeyspaceParams.simple(1),
+                CreateTableStatement.parse("CREATE TABLE tbl (k text, c text, v1 text, v2 text, v3 text, v4 text, PRIMARY KEY (k, c))", ksname).build());
+
+        ColumnFamilyStore cfs = Schema.instance.getColumnFamilyStoreInstance(Schema.instance.getTableMetadata(ksname, "tbl").id);
+        Assert.assertNotNull(cfs);
+        cfs.disableAutoCompaction();
+        int rowWidth = 100;
+        int rowsPerPartition = 1000;
+
+        measure(new Workload()
+        {
+            @SuppressWarnings("UnstableApiUsage")
+            public void setup()
+            {
+                cfs.disableAutoCompaction();
+                String insert = String.format("INSERT INTO %s.%s (k, c, v1, v2, v3, v4) VALUES (?, ?, ?, ?, ?, ?)", ksname, "tbl");
+                for (int f=0; f<numSSTable; f++)

Review comment:
       nit: missing spaces

##########
File path: test/memory/org/apache/cassandra/db/compaction/CompactionAllocationTest.java
##########
@@ -764,4 +800,135 @@ public void widePartitionsOverlappingRows3() throws Throwable
     {
         testWidePartitions("widePartitionsOverlappingRows3", 3, maybeInflate(24), true, true);
     }
+
+
+    private static void testIndexingWidePartitions(String name,
+                                                   int numSSTable,
+                                                   int sstablePartitions,
+                                                   IndexDef...indexes) throws Throwable
+    {
+        String ksname = "ks_" + name.toLowerCase();
+        SchemaLoader.createKeyspace(ksname, KeyspaceParams.simple(1),
+                CreateTableStatement.parse("CREATE TABLE tbl (k text, c text, v1 text, v2 text, v3 text, v4 text, PRIMARY KEY (k, c))", ksname).build());
+
+        ColumnFamilyStore cfs = Schema.instance.getColumnFamilyStoreInstance(Schema.instance.getTableMetadata(ksname, "tbl").id);
+        Assert.assertNotNull(cfs);
+        cfs.disableAutoCompaction();
+        int rowWidth = 100;
+        int rowsPerPartition = 1000;
+
+        measure(new Workload()
+        {
+            @SuppressWarnings("UnstableApiUsage")
+            public void setup()
+            {
+                cfs.disableAutoCompaction();

Review comment:
       is it necessary to disable automatic compaction twice? it has been already done on line 816.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] smiklosovic closed pull request #1098: modify SecondaryIndexManager#indexPartition() to retrieve only columns for which indexes are actually being built

Posted by GitBox <gi...@apache.org>.
smiklosovic closed pull request #1098:
URL: https://github.com/apache/cassandra/pull/1098


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] belliottsmith commented on a change in pull request #1098: modify SecondaryIndexManager#indexPartition() to retrieve only columns for which indexes are actually being built

Posted by GitBox <gi...@apache.org>.
belliottsmith commented on a change in pull request #1098:
URL: https://github.com/apache/cassandra/pull/1098#discussion_r680809100



##########
File path: src/java/org/apache/cassandra/index/internal/CollatedViewIndexBuilder.java
##########
@@ -55,29 +57,57 @@ public CollatedViewIndexBuilder(ColumnFamilyStore cfs, Set<Index> indexers, Redu
     public CompactionInfo getCompactionInfo()
     {
         return new CompactionInfo(cfs.metadata(),
-                OperationType.INDEX_BUILD,
-                iter.getBytesRead(),
-                iter.getTotalBytes(),
-                compactionId,
-                sstables);
+                                  OperationType.INDEX_BUILD,
+                                  iter.getBytesRead(),
+                                  iter.getTotalBytes(),
+                                  compactionId,
+                                  sstables);
     }
 
     public void build()
     {
         try
         {
             int pageSize = cfs.indexManager.calculateIndexingPageSize();
+            RegularAndStaticColumns targetPartitionColumns = extractIndexedColumns();
+            
             while (iter.hasNext())
             {
                 if (isStopRequested())
                     throw new CompactionInterruptedException(getCompactionInfo());
                 DecoratedKey key = iter.next();
-                cfs.indexManager.indexPartition(key, indexers, pageSize);
+                cfs.indexManager.indexPartition(key, indexers, pageSize, targetPartitionColumns);
             }
         }
         finally
         {
             iter.close();
         }
     }
+
+    private RegularAndStaticColumns extractIndexedColumns()
+    {
+        RegularAndStaticColumns.Builder builder = RegularAndStaticColumns.builder();
+        
+        for (Index index : indexers)
+        {
+            boolean isPartitionIndex = true;
+            
+            for (ColumnMetadata column : cfs.metadata().regularAndStaticColumns())
+            {
+                if (index.dependsOn(column))
+                {
+                    builder.add(column);
+                    isPartitionIndex = false;
+                }
+
+                // if any index declares no dependency on any column, it is a full partition index
+                // so we can use the base partition columns as the input source
+                if (isPartitionIndex)

Review comment:
       Shouldn't this be outside of the loop?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on pull request #1098: modify SecondaryIndexManager#indexPartition() to retrieve only columns for which indexes are actually being built

Posted by GitBox <gi...@apache.org>.
maedhroz commented on pull request #1098:
URL: https://github.com/apache/cassandra/pull/1098#issuecomment-893618575


   https://app.circleci.com/pipelines/github/maedhroz/cassandra?branch=CASSANDRA-16776


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #1098: modify SecondaryIndexManager#indexPartition() to retrieve only columns for which indexes are actually being built

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #1098:
URL: https://github.com/apache/cassandra/pull/1098#discussion_r683618417



##########
File path: src/java/org/apache/cassandra/index/internal/CollatedViewIndexBuilder.java
##########
@@ -55,29 +57,57 @@ public CollatedViewIndexBuilder(ColumnFamilyStore cfs, Set<Index> indexers, Redu
     public CompactionInfo getCompactionInfo()
     {
         return new CompactionInfo(cfs.metadata(),
-                OperationType.INDEX_BUILD,
-                iter.getBytesRead(),
-                iter.getTotalBytes(),
-                compactionId,
-                sstables);
+                                  OperationType.INDEX_BUILD,
+                                  iter.getBytesRead(),
+                                  iter.getTotalBytes(),
+                                  compactionId,
+                                  sstables);
     }
 
     public void build()
     {
         try
         {
             int pageSize = cfs.indexManager.calculateIndexingPageSize();
+            RegularAndStaticColumns targetPartitionColumns = extractIndexedColumns();
+            
             while (iter.hasNext())
             {
                 if (isStopRequested())
                     throw new CompactionInterruptedException(getCompactionInfo());
                 DecoratedKey key = iter.next();
-                cfs.indexManager.indexPartition(key, indexers, pageSize);
+                cfs.indexManager.indexPartition(key, indexers, pageSize, targetPartitionColumns);
             }
         }
         finally
         {
             iter.close();
         }
     }
+
+    private RegularAndStaticColumns extractIndexedColumns()
+    {
+        RegularAndStaticColumns.Builder builder = RegularAndStaticColumns.builder();
+        
+        for (Index index : indexers)
+        {
+            boolean isPartitionIndex = true;
+            
+            for (ColumnMetadata column : cfs.metadata().regularAndStaticColumns())
+            {
+                if (index.dependsOn(column))
+                {
+                    builder.add(column);
+                    isPartitionIndex = false;
+                }
+
+                // if any index declares no dependency on any column, it is a full partition index
+                // so we can use the base partition columns as the input source
+                if (isPartitionIndex)

Review comment:
       You're right. Good catch :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #1098: modify SecondaryIndexManager#indexPartition() to retrieve only columns for which indexes are actually being built

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #1098:
URL: https://github.com/apache/cassandra/pull/1098#discussion_r667253066



##########
File path: test/memory/org/apache/cassandra/db/compaction/CompactionAllocationTest.java
##########
@@ -764,4 +800,135 @@ public void widePartitionsOverlappingRows3() throws Throwable
     {
         testWidePartitions("widePartitionsOverlappingRows3", 3, maybeInflate(24), true, true);
     }
+
+
+    private static void testIndexingWidePartitions(String name,
+                                                   int numSSTable,
+                                                   int sstablePartitions,
+                                                   IndexDef...indexes) throws Throwable
+    {
+        String ksname = "ks_" + name.toLowerCase();
+        SchemaLoader.createKeyspace(ksname, KeyspaceParams.simple(1),
+                CreateTableStatement.parse("CREATE TABLE tbl (k text, c text, v1 text, v2 text, v3 text, v4 text, PRIMARY KEY (k, c))", ksname).build());
+
+        ColumnFamilyStore cfs = Schema.instance.getColumnFamilyStoreInstance(Schema.instance.getTableMetadata(ksname, "tbl").id);
+        Assert.assertNotNull(cfs);
+        cfs.disableAutoCompaction();
+        int rowWidth = 100;
+        int rowsPerPartition = 1000;
+
+        measure(new Workload()
+        {
+            @SuppressWarnings("UnstableApiUsage")
+            public void setup()
+            {
+                cfs.disableAutoCompaction();

Review comment:
       CC @bdeggleston
   
   I think this was a copy/paste, so I'm wondering if it was necessary to disable twice in the original tests like `testTinyPartitions()` et al




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org