You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@cassandra.apache.org by GitBox <gi...@apache.org> on 2021/11/17 03:39:43 UTC
[GitHub] [cassandra] yifan-c commented on a change in pull request #1324: Marcuse/16310

yifan-c commented on a change in pull request #1324:
URL: https://github.com/apache/cassandra/pull/1324#discussion_r750854592



##########
File path: src/java/org/apache/cassandra/tools/nodetool/stats/TableStatsHolder.java
##########
@@ -368,6 +374,14 @@ private String format(long bytes, boolean humanReadable)
         return humanReadable ? FileUtils.stringifyFileSize(bytes) : Long.toString(bytes);
     }
 
+    private Map<String, String> format(Map<String, Long> map, boolean humanReadable)

Review comment:
       nit: it formats the size. Maybe call it `formatSize` to provide better clarity as the method is only called on some values.

##########
File path: src/java/org/apache/cassandra/metrics/TopPartitionTracker.java
##########
@@ -0,0 +1,374 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.metrics;
+
+import java.nio.ByteBuffer;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.Iterator;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.TreeSet;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicReference;
+
+import com.google.common.annotations.VisibleForTesting;
+
+import org.apache.cassandra.concurrent.ScheduledExecutors;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.db.DecoratedKey;
+import org.apache.cassandra.db.SystemKeyspace;
+import org.apache.cassandra.db.rows.Cell;
+import org.apache.cassandra.db.rows.RangeTombstoneMarker;
+import org.apache.cassandra.db.rows.Row;
+import org.apache.cassandra.db.rows.UnfilteredRowIterator;
+import org.apache.cassandra.db.transform.Transformation;
+import org.apache.cassandra.dht.IPartitioner;
+import org.apache.cassandra.dht.Range;
+import org.apache.cassandra.dht.Token;
+import org.apache.cassandra.io.sstable.SSTable;
+import org.apache.cassandra.schema.TableMetadata;
+import org.apache.cassandra.utils.Pair;
+
+/**
+ * Tracks top partitions, currently by size and by tombstone count
+ *
+ * Collects during full and preview (-vd) repair since then we read the full partition
+ *
+ * Note that since we can run sub range repair there might be windows where the top partitions are not correct -
+ * for example, assume we track the top 2 partitions for this node:
+ *
+ * tokens with size:
+ * (a, 100); (b, 40); (c, 10); (d, 100); (e, 50); (f, 10)
+ * - top2: a, d
+ * now a is deleted and we run a repair for keys [a, c]
+ * - top2: b, d
+ * and when we repair [d, f]
+ * - top2: d, e
+ *
+ */
+public class TopPartitionTracker
+{
+    private final static String SIZES = "SIZES";
+    private final static String TOMBSTONES = "TOMBSTONES";
+
+    private final AtomicReference<TopHolder> topSizes = new AtomicReference<>();
+    private final AtomicReference<TopHolder> topTombstones = new AtomicReference<>();
+    private final IPartitioner partitioner;
+    private final String keyspace;
+    private final String table;
+
+    public TopPartitionTracker(IPartitioner partitioner, String keyspace, String table)
+    {
+        this.partitioner = partitioner;
+        this.keyspace = keyspace;
+        this.table = table;
+        startup();
+    }
+
+    private void startup()
+    {
+        topSizes.set(new TopHolder(partitioner,
+                                   SystemKeyspace.getTopPartitions(keyspace, table, SIZES),
+                                   DatabaseDescriptor.getMaxTopSizePartitionCount(),
+                                   DatabaseDescriptor.getMinTrackedPartitionSize(),
+                                   null));
+        topTombstones.set(new TopHolder(partitioner,
+                                        SystemKeyspace.getTopPartitions(keyspace, table, TOMBSTONES),
+                                        DatabaseDescriptor.getMaxTopTombstonePartitionCount(),
+                                        DatabaseDescriptor.getMinTrackedPartitionTombstoneCount(),
+                                        null));
+        ScheduledExecutors.optionalTasks.scheduleAtFixedRate(this::save, 60, 60, TimeUnit.MINUTES);
+    }
+
+    @VisibleForTesting
+    public void save()
+    {
+        if (!topSizes.get().top.isEmpty())
+            SystemKeyspace.saveTopPartitions(keyspace, table, SIZES, topSizes.get().top);
+        if (!topTombstones.get().top.isEmpty())

Review comment:
       nit: get the reference to `top` and use the same reference for empty check and saveTopPartition.
   I do not anticipate the referenced collection can ever be updated to an empty one in the current code. Just make it easier to reason about. 

##########
File path: src/java/org/apache/cassandra/metrics/TopPartitionTracker.java
##########
@@ -0,0 +1,374 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.metrics;
+
+import java.nio.ByteBuffer;
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.Iterator;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.TreeSet;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicReference;
+
+import com.google.common.annotations.VisibleForTesting;
+
+import org.apache.cassandra.concurrent.ScheduledExecutors;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.db.DecoratedKey;
+import org.apache.cassandra.db.SystemKeyspace;
+import org.apache.cassandra.db.rows.Cell;
+import org.apache.cassandra.db.rows.RangeTombstoneMarker;
+import org.apache.cassandra.db.rows.Row;
+import org.apache.cassandra.db.rows.UnfilteredRowIterator;
+import org.apache.cassandra.db.transform.Transformation;
+import org.apache.cassandra.dht.IPartitioner;
+import org.apache.cassandra.dht.Range;
+import org.apache.cassandra.dht.Token;
+import org.apache.cassandra.io.sstable.SSTable;
+import org.apache.cassandra.schema.TableMetadata;
+import org.apache.cassandra.utils.Pair;
+
+/**
+ * Tracks top partitions, currently by size and by tombstone count
+ *
+ * Collects during full and preview (-vd) repair since then we read the full partition
+ *
+ * Note that since we can run sub range repair there might be windows where the top partitions are not correct -
+ * for example, assume we track the top 2 partitions for this node:
+ *
+ * tokens with size:
+ * (a, 100); (b, 40); (c, 10); (d, 100); (e, 50); (f, 10)
+ * - top2: a, d
+ * now a is deleted and we run a repair for keys [a, c]
+ * - top2: b, d
+ * and when we repair [d, f]
+ * - top2: d, e
+ *
+ */
+public class TopPartitionTracker
+{
+    private final static String SIZES = "SIZES";
+    private final static String TOMBSTONES = "TOMBSTONES";

Review comment:
       How about defining enum for the "TopType"? 

##########
File path: test/distributed/org/apache/cassandra/distributed/test/TopPartitionsTest.java
##########
@@ -0,0 +1,274 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.distributed.test;
+
+import java.io.IOException;
+import java.util.Collection;
+import java.util.Map;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
+import org.junit.AfterClass;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.db.ColumnFamilyStore;
+import org.apache.cassandra.db.Keyspace;
+import org.apache.cassandra.distributed.Cluster;
+import org.apache.cassandra.distributed.api.ConsistencyLevel;
+import org.assertj.core.api.Assertions;
+
+import static org.apache.cassandra.distributed.api.Feature.GOSSIP;
+import static org.apache.cassandra.distributed.api.Feature.NETWORK;
+import static org.junit.Assert.assertEquals;
+import static org.psjava.util.AssertStatus.assertTrue;
+
+@RunWith(Parameterized.class)
+public class TopPartitionsTest extends TestBaseImpl
+{
+    public enum Repair
+    {
+        Incremental, Full, FullPreview
+    }
+
+    private static AtomicInteger COUNTER = new AtomicInteger(0);
+    private static Cluster CLUSTER;
+
+    private final Repair repair;
+
+    public TopPartitionsTest(Repair repair)
+    {
+        this.repair = repair;
+    }
+
+    @Parameterized.Parameters(name = "{0}")
+    public static Collection<Object[]> messages()
+    {
+        return Stream.of(Repair.values())
+                     .map(a -> new Object[]{ a })
+                     .collect(Collectors.toList());
+    }
+
+    @BeforeClass
+    public static void setup() throws IOException
+    {
+        CLUSTER = init(Cluster.build(2).withConfig(config ->
+                                                   config.set("min_tracked_partition_size_bytes", 0)
+                                                         .set("min_tracked_partition_tombstone_count", 0)
+                                                         .with(GOSSIP, NETWORK))
+                              .start());
+    }
+
+    @AfterClass
+    public static void cleanup()
+    {
+        if (CLUSTER != null)
+            CLUSTER.close();
+    }
+
+    @Before
+    public void before()
+    {
+        setCount(10, 10);
+    }
+
+    @Test
+    public void basicPartitionSizeTest()

Review comment:
       It would be great to also run `tablestats` with `--top` and check the top partitions output is there.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org