You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@cassandra.apache.org by sumanth-pasupuleti <gi...@git.apache.org> on 2018/10/04 17:00:38 UTC

[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

GitHub user sumanth-pasupuleti opened a pull request:

    https://github.com/apache/cassandra/pull/277

    12106 - blacklisting bad partitions for point reads

    Based on feedback from design document https://docs.google.com/document/d/1obg2IFL-UBU1KErvItxVBQL--oURvaHX-eDjq-Jz9Ys/edit, scoped this for point reads.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sumanth-pasupuleti/cassandra 12106_trunk

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cassandra/pull/277.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #277
    
----
commit 58a9d2ff20e2bd380b8343a652cfa9074798c2f1
Author: sumanthpasupuleti <su...@...>
Date:   2018-10-04T16:22:50Z

    Adding blacklisted partitions feature with tests

commit f4c6f985afc33b72803385c0e7605b792e1abb8f
Author: sumanthpasupuleti <su...@...>
Date:   2018-10-04T16:36:18Z

    Adding nodetool to refresh blacklisted partitions cache

commit 279312033c39fdf5ce8aa3e0f8653bef7485f5ed
Author: sumanthpasupuleti <su...@...>
Date:   2018-10-04T16:53:14Z

    Adding documentation on blacklisting partitions

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r227865087
  
    --- Diff: src/java/org/apache/cassandra/service/CassandraDaemon.java ---
    @@ -426,6 +426,9 @@ public void uncaughtException(Thread t, Throwable e)
             // Native transport
             nativeTransportService = new NativeTransportService();
     
    +        // load blacklisted partitions into cache
    +        StorageService.instance.refreshBlacklistedPartitionsCache();
    --- End diff --
    
    I think this is where we might have to deal with some nodes not having it potentially.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r227864906
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,55 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table.
    +     * Stops reading partitions upon exceeding the cache size limit by logging a warning.
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    +                                             ConsistencyLevel.ONE);
    +        }
    +        catch (Exception e)
    +        {
    +            logger.error("Error querying blacklisted partitions", e);
    +            return Collections.emptySet();
    +        }
    +
    +        Set<BlacklistedPartition> blacklistedPartitions = new HashSet<>();
    +        int cacheSize = 0;
    --- End diff --
    
    nit: might be easier to refer to this with a unit so that the divide by later makes sense e.g. `cacheSizeBytes`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230864957
  
    --- Diff: src/java/org/apache/cassandra/cache/BlacklistedPartition.java ---
    @@ -0,0 +1,119 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import java.nio.ByteBuffer;
    +import java.util.Arrays;
    +
    +import org.apache.cassandra.db.DecoratedKey;
    +import org.apache.cassandra.schema.KeyspaceMetadata;
    +import org.apache.cassandra.schema.Schema;
    +import org.apache.cassandra.schema.TableId;
    +import org.apache.cassandra.schema.TableMetadata;
    +import org.apache.cassandra.schema.TableMetadataRef;
    +import org.apache.cassandra.utils.ByteBufferUtil;
    +import org.apache.cassandra.utils.ObjectSizes;
    +
    +/**
    + * Class to represent a blacklisted partition
    + */
    +public class BlacklistedPartition implements IMeasurableMemory
    --- End diff --
    
    Hm, not sure I agree since this is in the same package, but I don't really care :-)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by vinaykumarchella <gi...@git.apache.org>.
Github user vinaykumarchella commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223518571
  
    --- Diff: test/unit/org/apache/cassandra/cache/BlacklistedPartitionsCacheTest.java ---
    @@ -0,0 +1,175 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import org.junit.Assert;
    +import org.junit.BeforeClass;
    +import org.junit.Test;
    +
    +import com.datastax.driver.core.PreparedStatement;
    +import com.datastax.driver.core.ResultSet;
    +import com.datastax.driver.core.Session;
    +import com.datastax.driver.core.exceptions.InvalidQueryException;
    +import org.apache.cassandra.cql3.CQLTester;
    +import org.apache.cassandra.repair.SystemDistributedKeyspace;
    +import org.apache.cassandra.schema.SchemaConstants;
    +import org.apache.cassandra.utils.ByteBufferUtil;
    +
    +public class BlacklistedPartitionsCacheTest extends CQLTester
    --- End diff --
    
    Can you add a few test cases which can cover different types of partition keys (text, int, etc.,)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230864731
  
    --- Diff: doc/source/operating/blacklisting_partitions.rst ---
    @@ -0,0 +1,63 @@
    +.. Licensed to the Apache Software Foundation (ASF) under one
    +.. or more contributor license agreements.  See the NOTICE file
    +.. distributed with this work for additional information
    +.. regarding copyright ownership.  The ASF licenses this file
    +.. to you under the Apache License, Version 2.0 (the
    +.. "License"); you may not use this file except in compliance
    +.. with the License.  You may obtain a copy of the License at
    +..
    +..     http://www.apache.org/licenses/LICENSE-2.0
    +..
    +.. Unless required by applicable law or agreed to in writing, software
    +.. distributed under the License is distributed on an "AS IS" BASIS,
    +.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +.. See the License for the specific language governing permissions and
    +.. limitations under the License.
    +
    +.. highlight:: none
    +
    +
    +Blacklisting Partitions (CASSANDRA-12106)
    +-----------------------------------------
    +
    +This feature allows partition keys to be blacklisted i.e. Cassandra would not process following operations on those
    +blacklisted partitions.
    +
    +- Point READs
    +
    +Response would be InvalidQueryException.
    +
    +It is important to note that, this would not have any effect on range reads or write operations.
    +
    +How to blacklist a partition key
    +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    +``system_distributed.blacklisted_partitions`` table can be used to blacklist partitions. Essentially, you will have to
    +insert a new row into the table with the following details:
    +
    +- Keyspace name
    +- Table name
    +- Partition key
    +
    +The way partition key needs to be mentioned is exactly similar to how ``nodetool getendpoints`` works.
    +Following are several examples for blacklisting partition keys in keyspace, say ks, and table, say table1 for different
    +possibilities of table1 partition key:
    +- For a single column partition key, say Id,
    +  Id is a simple type - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','1');
    +  Id is a blob        - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','12345f');
    +  Id has a colon      - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','1\:2');
    +
    +- For a composite column partition key, say (Key1, Key2),
    +  insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name, partition_key) VALUES ('ks','table1','k11:k21');
    +
    +BlacklistedPartitions Cache
    +^^^^^^^^^^^^^^^^^^^^^^^^^^^
    +Cassandra internally maintains an on-heap cache that loads partition keys from ``system_distributed.blacklisted_partitions``.
    +Any partition key mentioned against a non-existent keyspace name and table name will not be cached.
    +
    +Cache gets refreshed in following ways
    +- During Cassandra start up
    +- Scheduled cache refresh at a default interval of 10 minutes (can be overridden using ``blacklisted_partitions_cache_refresh_period_in_sec`` yaml property)
    +- Using nodetool refreshblacklistedpartitionscache
    +
    +There is no bound on how much on-heap memory this cache can occupy - so be cautious on how many keys you would want to blacklist.
    +``blacklisted_partitions_cache_size_warn_threshold_in_mb`` yaml property can be used to be notified (via warning logs) if cache size exceeds the set threshold.
    --- End diff --
    
    The docs have a `cassandra.yaml` section that you can link to to share context on configuration options. I was suggesting you could link to that page potentially? http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r225485013
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,42 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    --- End diff --
    
    After further offline discussion, changed the cache to be bounded, in order to avoid possible OOM'ing of C*


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by vinaykumarchella <gi...@git.apache.org>.
Github user vinaykumarchella commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223516968
  
    --- Diff: src/java/org/apache/cassandra/cache/BlacklistedPartition.java ---
    @@ -0,0 +1,119 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import java.nio.ByteBuffer;
    +import java.util.Arrays;
    +
    +import org.apache.cassandra.db.DecoratedKey;
    +import org.apache.cassandra.schema.KeyspaceMetadata;
    +import org.apache.cassandra.schema.Schema;
    +import org.apache.cassandra.schema.TableId;
    +import org.apache.cassandra.schema.TableMetadata;
    +import org.apache.cassandra.schema.TableMetadataRef;
    +import org.apache.cassandra.utils.ByteBufferUtil;
    +import org.apache.cassandra.utils.ObjectSizes;
    +
    +/**
    + * Class to represent a blacklisted partition
    + */
    +public class BlacklistedPartition implements IMeasurableMemory
    +{
    +    private static final long EMPTY_SIZE = ObjectSizes.measure(new BlacklistedPartition(null, new byte[0]));
    +    public final byte[] key;
    +    public final TableId tableId;
    +
    +    public BlacklistedPartition(TableId tableId, DecoratedKey key)
    +    {
    +        this.tableId = tableId;
    +        this.key = ByteBufferUtil.getArray(key.getKey());
    +    }
    +
    +    private BlacklistedPartition(TableId tableId, byte[] key)
    +    {
    +        this.tableId = tableId;
    +        this.key = key;
    +    }
    +
    +    /**
    +     * Creates an instance of BlacklistedPartition for a given keyspace, table and partition key
    +     *
    +     * @param keyspace
    +     * @param table
    +     * @param key
    +     * @throws IllegalArgumentException
    +     */
    +    public BlacklistedPartition(String keyspace, String table, String key) throws IllegalArgumentException
    +    {
    +        // Determine tableId from keyspace and table parameters. If tableId cannot be determined due to invalid
    +        // parameters, throw an exception
    +        KeyspaceMetadata ksMetaData = Schema.instance.getKeyspaceMetadata(keyspace);
    +        if (ksMetaData == null)
    +        {
    +            throw new IllegalArgumentException("Unknown keyspace '" + keyspace + "'");
    +        }
    +
    +        TableMetadata metadata = ksMetaData.getTableOrViewNullable(table);
    --- End diff --
    
    You can simplify this by just checking `hasTable`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230865548
  
    --- Diff: src/java/org/apache/cassandra/service/CassandraDaemon.java ---
    @@ -426,6 +426,9 @@ public void uncaughtException(Thread t, Throwable e)
             // Native transport
             nativeTransportService = new NativeTransportService();
     
    +        // load blacklisted partitions into cache
    +        StorageService.instance.refreshBlacklistedPartitionsCache();
    --- End diff --
    
    During a cluster upgrade, I'm not sure that we can assume this call will succeed. I'm not entirely sure how that works tbh with system_distributed tables. I guess that the first node of the new version will create the table maybe?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by vinaykumarchella <gi...@git.apache.org>.
Github user vinaykumarchella commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223517674
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,42 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    +                                             ConsistencyLevel.ONE);
    +        }
    +        catch (Exception e)
    +        {
    +            logger.error("Error querying blacklisted partitions");
    --- End diff --
    
    You might want to log the exception also.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223780623
  
    --- Diff: doc/source/operating/blacklisting_partitions.rst ---
    @@ -0,0 +1,63 @@
    +.. Licensed to the Apache Software Foundation (ASF) under one
    +.. or more contributor license agreements.  See the NOTICE file
    +.. distributed with this work for additional information
    +.. regarding copyright ownership.  The ASF licenses this file
    +.. to you under the Apache License, Version 2.0 (the
    +.. "License"); you may not use this file except in compliance
    +.. with the License.  You may obtain a copy of the License at
    +..
    +..     http://www.apache.org/licenses/LICENSE-2.0
    +..
    +.. Unless required by applicable law or agreed to in writing, software
    +.. distributed under the License is distributed on an "AS IS" BASIS,
    +.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +.. See the License for the specific language governing permissions and
    +.. limitations under the License.
    +
    +.. highlight:: none
    +
    +
    +Blacklisting Partitions (CASSANDRA-12106)
    +-----------------------------------------
    +
    +This feature allows partition keys to be blacklisted i.e. Cassandra would not process following operations on those
    --- End diff --
    
    I think this can be a little clearer and spend more time on "why" documentation as opposed to "how" (and don't link the ticket in the title). How about something like:
    
    ```markdown 
    Partition Blacklist
    -----------------------
    Sometimes there are specific partitions that are "hot" and cause instability in a Cassandra cluster. This can happen because of a bad data model that is focusing many update operations on a single partitions, which causes a partition to grow large over time and in turn it becomes very expensive to read.
    
    Cassandra supports blacklisting such problematic partitions so that when clients issue point reads (`SELECT` statements with the partition key specified), instead of impacting Cassandra the query will be immediately rejected with an `InvalidQueryException`
    ```
    
    Might want to also explain what the error will look like for SEO (the text in the error).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230548363
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,55 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table.
    +     * Stops reading partitions upon exceeding the cache size limit by logging a warning.
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    --- End diff --
    
    based on my quick glance, I don't think it does. Even if it does paginate, not sure how we can avoid possible OOM (without inspecting result size after each paginated query).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230865112
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,55 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table.
    +     * Stops reading partitions upon exceeding the cache size limit by logging a warning.
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    --- End diff --
    
    I think there is a local execution with pagination option (which should avoid oom right)?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223545187
  
    --- Diff: test/unit/org/apache/cassandra/cache/BlacklistedPartitionsCacheTest.java ---
    @@ -0,0 +1,175 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import org.junit.Assert;
    +import org.junit.BeforeClass;
    +import org.junit.Test;
    +
    +import com.datastax.driver.core.PreparedStatement;
    +import com.datastax.driver.core.ResultSet;
    +import com.datastax.driver.core.Session;
    +import com.datastax.driver.core.exceptions.InvalidQueryException;
    +import org.apache.cassandra.cql3.CQLTester;
    +import org.apache.cassandra.repair.SystemDistributedKeyspace;
    +import org.apache.cassandra.schema.SchemaConstants;
    +import org.apache.cassandra.utils.ByteBufferUtil;
    +
    +public class BlacklistedPartitionsCacheTest extends CQLTester
    --- End diff --
    
    @vinaykumarchella  I tried doing that to a certain extent in "testBlacklistingPartitionWithCompositeKey" where I test partition key of type blob, text and int. Do you suggest doing it for other types?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by vinaykumarchella <gi...@git.apache.org>.
Github user vinaykumarchella commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223518337
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -1545,6 +1546,18 @@ public static PartitionIterator read(SinglePartitionReadCommand.Group group, Con
                 throw new IsBootstrappingException();
             }
     
    +        // check if the partition in question is blacklisted; if yes, reject READ operation
    +        if (BlacklistedPartitionCache.instance.size() > 0)
    +        {
    +            for (SinglePartitionReadQuery query : group.queries)
    +            {
    +                if (BlacklistedPartitionCache.instance.contains(query.metadata().id, query.partitionKey()))
    +                {
    +                    throw new InvalidRequestException("Cannot perform READ on a blacklisted partition");
    --- End diff --
    
    Do you want to expose the parition key here in the exception message?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by vinaykumarchella <gi...@git.apache.org>.
Github user vinaykumarchella commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223517840
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,42 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    --- End diff --
    
    I am assuming `BLACKLISTED_PARTITIONS` table wont be too big that it eats up your heap. Do you have any restrictions on that?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by vinaykumarchella <gi...@git.apache.org>.
Github user vinaykumarchella commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r225632919
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,55 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table.
    +     * Stops reading partitions upon exceeding the cache size limit by logging a warning.
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    +                                             ConsistencyLevel.ONE);
    +        }
    +        catch (Exception e)
    +        {
    +            logger.error("Error querying blacklisted partitions", e);
    +            return Collections.emptySet();
    +        }
    +
    +        Set<BlacklistedPartition> blacklistedPartitions = new HashSet<>();
    +        int cacheSize = 0;
    +        for (UntypedResultSet.Row row : results)
    +        {
    +            try
    +            {
    +                BlacklistedPartition blacklistedPartition = new BlacklistedPartition(row.getString("keyspace_name"), row.getString("columnfamily_name"), row.getString("partition_key"));
    +
    +                // check if adding this blacklisted partition would increase cache size beyond the set limit.
    +                cacheSize += blacklistedPartition.unsharedHeapSize();
    +                if (cacheSize / (1024 * 1024) > DatabaseDescriptor.getBlackListedPartitionsCacheSizeLimitInMB())
    +                {
    +                    logger.warn("BlacklistedPartitions cache size limit of {} MB reached. Unable to load more blacklisted partitions. BlacklistedPartitions cache working in degraded mode.", DatabaseDescriptor.getBlackListedPartitionsCacheSizeLimitInMB());
    --- End diff --
    
    Add more details to the documentation section, explain what it means and what users should when the operator finds this message in C* logs. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r227864320
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -119,6 +123,15 @@ private SystemDistributedKeyspace()
                          + "status text,"
                          + "PRIMARY KEY ((keyspace_name, view_name), host_id))");
     
    +    private static final TableMetadata BlacklistedPartitions =
    --- End diff --
    
    So in light of the discussions around how to do upgrades when there are different versions and different system_distributed tables, have we checked that this is gracefully handled if it is or isn't there (e.g. during a 4.0 -> 5.0 upgrade)?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230548740
  
    --- Diff: test/unit/org/apache/cassandra/cache/BlacklistedPartitionsCacheTest.java ---
    @@ -0,0 +1,175 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import org.junit.Assert;
    +import org.junit.BeforeClass;
    +import org.junit.Test;
    +
    +import com.datastax.driver.core.PreparedStatement;
    +import com.datastax.driver.core.ResultSet;
    +import com.datastax.driver.core.Session;
    +import com.datastax.driver.core.exceptions.InvalidQueryException;
    +import org.apache.cassandra.cql3.CQLTester;
    +import org.apache.cassandra.repair.SystemDistributedKeyspace;
    +import org.apache.cassandra.schema.SchemaConstants;
    +import org.apache.cassandra.utils.ByteBufferUtil;
    +
    +public class BlacklistedPartitionsCacheTest extends CQLTester
    --- End diff --
    
    makes sense. Added a test case where reading blacklist fails due to absence of blacklisted_partitions table


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r225277660
  
    --- Diff: src/java/org/apache/cassandra/tools/nodetool/RefreshBlacklistedPartitionsCache.java ---
    @@ -0,0 +1,30 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.tools.nodetool;
    +
    +import io.airlift.airline.Command;
    +import org.apache.cassandra.tools.NodeProbe;
    +import org.apache.cassandra.tools.NodeTool;
    +
    +@Command(name = "refreshblacklistedpartitionscache", description = "Refresh blacklisted partitions cache from system_distributed.blacklisted_partitions table")
    --- End diff --
    
    Can we make blacklist a top level and then allow changing the size at runtime:
    
    ```
    nodetool blacklistpartition refresh
    nodetool blacklistpartition setsize 1000
    ```
    
    This also makes extending in the future easy.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by vinaykumarchella <gi...@git.apache.org>.
Github user vinaykumarchella commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223515773
  
    --- Diff: src/java/org/apache/cassandra/cache/BlacklistedPartitionCache.java ---
    @@ -0,0 +1,96 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import java.util.Set;
    +import java.util.concurrent.TimeUnit;
    +
    +import org.slf4j.Logger;
    +import org.slf4j.LoggerFactory;
    +
    +import org.apache.cassandra.concurrent.ScheduledExecutors;
    +import org.apache.cassandra.config.DatabaseDescriptor;
    +import org.apache.cassandra.db.DecoratedKey;
    +import org.apache.cassandra.repair.SystemDistributedKeyspace;
    +import org.apache.cassandra.schema.TableId;
    +
    +/**
    + * Cache that loads blacklisted partitions from system_distributed.blacklisted_partitions table
    + */
    +public class BlacklistedPartitionCache
    +{
    +    public final static BlacklistedPartitionCache instance = new BlacklistedPartitionCache();
    +    private static final Logger logger = LoggerFactory.getLogger(BlacklistedPartitionCache.class);
    +    private volatile Set<BlacklistedPartition> blacklistedPartitions;
    +
    +    private BlacklistedPartitionCache()
    +    {
    +        // setup a periodic refresh
    +        ScheduledExecutors.optionalTasks.scheduleWithFixedDelay(this::refreshCache,
    +                                                                DatabaseDescriptor.getBlacklistedPartitionsCacheRefreshInSec(),
    +                                                                DatabaseDescriptor.getBlacklistedPartitionsCacheRefreshInSec(),
    +                                                                TimeUnit.SECONDS);
    +    }
    +
    +    /**
    +     * Loads blacklisted partitions from system_distributed.blacklisted partitions table.
    +     * Also logs a warning if cache size exceeds the set threshold.
    +     */
    +    public void refreshCache()
    +    {
    +        this.blacklistedPartitions = SystemDistributedKeyspace.getBlacklistedPartitions();
    +
    +        // attempt to compute cache size only if there is a warn threshold configured
    +        if (DatabaseDescriptor.getBlackListedPartitionsCacheSizeWarnThresholdInMB() > 0)
    +        {
    +            long cacheSize = 0;
    +            for (BlacklistedPartition blacklistedPartition : blacklistedPartitions)
    +            {
    +                cacheSize += blacklistedPartition.unsharedHeapSize();
    +            }
    +
    +            if (cacheSize / (1024 * 1024) >= DatabaseDescriptor.getBlackListedPartitionsCacheSizeWarnThresholdInMB())
    +            {
    +                logger.warn(String.format("BlacklistedPartition cache size (%d) MB exceeds threshold size (%d) MB", cacheSize / (1024 * 1024), DatabaseDescriptor.getBlackListedPartitionsCacheSizeWarnThresholdInMB()));
    --- End diff --
    
    You can avoid `String.format` inside logger and use the formatting specifiers directly


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230548472
  
    --- Diff: src/java/org/apache/cassandra/tools/nodetool/BlacklistedPartitions.java ---
    @@ -0,0 +1,48 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.tools.nodetool;
    +
    +import io.airlift.airline.Command;
    +import io.airlift.airline.Option;
    +import org.apache.cassandra.tools.NodeProbe;
    +import org.apache.cassandra.tools.NodeTool;
    +
    +@Command(name = "blacklistedpartitions", description = "Refresh blacklisted partitions cache from system_distributed.blacklisted_partitions table")
    +public class BlacklistedPartitions extends NodeTool.NodeToolCmd
    +{
    +    @Option(title = "refresh_cache", name = { "--refresh-cache"}, description = "Refresh blacklisted partitions cache. Default = false.")
    --- End diff --
    
    done


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230548204
  
    --- Diff: src/java/org/apache/cassandra/cache/BlacklistedPartition.java ---
    @@ -0,0 +1,119 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import java.nio.ByteBuffer;
    +import java.util.Arrays;
    +
    +import org.apache.cassandra.db.DecoratedKey;
    +import org.apache.cassandra.schema.KeyspaceMetadata;
    +import org.apache.cassandra.schema.Schema;
    +import org.apache.cassandra.schema.TableId;
    +import org.apache.cassandra.schema.TableMetadata;
    +import org.apache.cassandra.schema.TableMetadataRef;
    +import org.apache.cassandra.utils.ByteBufferUtil;
    +import org.apache.cassandra.utils.ObjectSizes;
    +
    +/**
    + * Class to represent a blacklisted partition
    + */
    +public class BlacklistedPartition implements IMeasurableMemory
    --- End diff --
    
    I doubt if we need to be consistent with other caches though; this cache, and the other caches are vastly different in the behavior.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r227865467
  
    --- Diff: src/java/org/apache/cassandra/tools/nodetool/BlacklistedPartitions.java ---
    @@ -0,0 +1,48 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.tools.nodetool;
    +
    +import io.airlift.airline.Command;
    +import io.airlift.airline.Option;
    +import org.apache.cassandra.tools.NodeProbe;
    +import org.apache.cassandra.tools.NodeTool;
    +
    +@Command(name = "blacklistedpartitions", description = "Refresh blacklisted partitions cache from system_distributed.blacklisted_partitions table")
    +public class BlacklistedPartitions extends NodeTool.NodeToolCmd
    +{
    +    @Option(title = "refresh_cache", name = { "--refresh-cache"}, description = "Refresh blacklisted partitions cache. Default = false.")
    --- End diff --
    
    Maybe indicate where it's refreshing from (the system table)?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223786251
  
    --- Diff: src/java/org/apache/cassandra/cache/BlacklistedPartition.java ---
    @@ -0,0 +1,119 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import java.nio.ByteBuffer;
    +import java.util.Arrays;
    +
    +import org.apache.cassandra.db.DecoratedKey;
    +import org.apache.cassandra.schema.KeyspaceMetadata;
    +import org.apache.cassandra.schema.Schema;
    +import org.apache.cassandra.schema.TableId;
    +import org.apache.cassandra.schema.TableMetadata;
    +import org.apache.cassandra.schema.TableMetadataRef;
    +import org.apache.cassandra.utils.ByteBufferUtil;
    +import org.apache.cassandra.utils.ObjectSizes;
    +
    +/**
    + * Class to represent a blacklisted partition
    --- End diff --
    
    nit: to me this comment and others in the class appear somewhat redundant to the class or method name.
    
    Generally I'm a fan of "if you have to explain _why_ or _how_ use comments, _what_ should be explained by the name/signature"


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r225641831
  
    --- Diff: doc/source/operating/blacklisting_partitions.rst ---
    @@ -0,0 +1,67 @@
    +.. Licensed to the Apache Software Foundation (ASF) under one
    +.. or more contributor license agreements.  See the NOTICE file
    +.. distributed with this work for additional information
    +.. regarding copyright ownership.  The ASF licenses this file
    +.. to you under the Apache License, Version 2.0 (the
    +.. "License"); you may not use this file except in compliance
    +.. with the License.  You may obtain a copy of the License at
    +..
    +..     http://www.apache.org/licenses/LICENSE-2.0
    +..
    +.. Unless required by applicable law or agreed to in writing, software
    +.. distributed under the License is distributed on an "AS IS" BASIS,
    +.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +.. See the License for the specific language governing permissions and
    +.. limitations under the License.
    +
    +.. highlight:: none
    +
    +
    +Blacklisting Partitions (CASSANDRA-12106)
    +-----------------------------------------
    +
    +This feature allows partition keys to be blacklisted i.e. Cassandra would not process following operations on those
    +blacklisted partitions.
    +
    +- Point READs
    +
    +Response would be InvalidQueryException.
    +
    +It is important to note that, this would not have any effect on range reads or write operations.
    +
    +How to blacklist a partition key
    +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    +``system_distributed.blacklisted_partitions`` table can be used to blacklist partitions. Essentially, you will have to
    +insert a new row into the table with the following details:
    +
    +- Keyspace name
    +- Table name
    +- Partition key
    +
    +The way partition key needs to be mentioned is exactly similar to how ``nodetool getendpoints`` works.
    +Following are several examples for blacklisting partition keys in keyspace, say ks, and table, say table1 for different
    +possibilities of table1 partition key:
    +- For a single column partition key, say Id,
    +  Id is a simple type - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','1');
    +  Id is a blob        - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','12345f');
    +  Id has a colon      - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','1\:2');
    +
    +- For a composite column partition key, say (Key1, Key2),
    +  insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name, partition_key) VALUES ('ks','table1','k11:k21');
    +
    +BlacklistedPartitions Cache
    +^^^^^^^^^^^^^^^^^^^^^^^^^^^
    +Cassandra internally maintains an on-heap cache that loads partition keys from ``system_distributed.blacklisted_partitions``.
    +Any partition key mentioned against a non-existent keyspace name and table name will not be cached.
    +
    +Cache gets refreshed in following ways
    +- During Cassandra start up
    +- Scheduled cache refresh at a default interval of 10 minutes (can be overridden using ``blacklisted_partitions_cache_refresh_period_in_sec`` yaml property)
    +- Using ``nodetool blacklistedpartitions --refresh-cache``
    +
    +Cache size is bounded, set by ``blacklisted_partitions_cache_size_limit_in_mb`` yaml property (defaulted to 100 MB).
    +As cache loads blacklisted partitions from ``system_distributed.blacklisted_partitions``, it checks if cache size exceeds the size limit.
    +The moment cache size exceeeds the size limit, no more partitions are loaded into the cache, and only those partitions that have been loaded so far would be blacklisted.
    +A warning would be logged to indicate the same.
    +Size limit on the cache can be changed using the yaml property, or dynamically using below nodetool command
    +``nodetool blacklistedpartitions --cache-size-limit-in-mb=150 --refreshcache``
    --- End diff --
    
    I realize I already call this out in the feature description at the beginning of this document.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223780940
  
    --- Diff: doc/source/operating/blacklisting_partitions.rst ---
    @@ -0,0 +1,63 @@
    +.. Licensed to the Apache Software Foundation (ASF) under one
    +.. or more contributor license agreements.  See the NOTICE file
    +.. distributed with this work for additional information
    +.. regarding copyright ownership.  The ASF licenses this file
    +.. to you under the Apache License, Version 2.0 (the
    +.. "License"); you may not use this file except in compliance
    +.. with the License.  You may obtain a copy of the License at
    +..
    +..     http://www.apache.org/licenses/LICENSE-2.0
    +..
    +.. Unless required by applicable law or agreed to in writing, software
    +.. distributed under the License is distributed on an "AS IS" BASIS,
    +.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +.. See the License for the specific language governing permissions and
    +.. limitations under the License.
    +
    +.. highlight:: none
    +
    +
    +Blacklisting Partitions (CASSANDRA-12106)
    +-----------------------------------------
    +
    +This feature allows partition keys to be blacklisted i.e. Cassandra would not process following operations on those
    +blacklisted partitions.
    +
    +- Point READs
    +
    +Response would be InvalidQueryException.
    +
    +It is important to note that, this would not have any effect on range reads or write operations.
    +
    +How to blacklist a partition key
    +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    +``system_distributed.blacklisted_partitions`` table can be used to blacklist partitions. Essentially, you will have to
    +insert a new row into the table with the following details:
    +
    +- Keyspace name
    +- Table name
    +- Partition key
    +
    +The way partition key needs to be mentioned is exactly similar to how ``nodetool getendpoints`` works.
    --- End diff --
    
    Is there any existing documentation of this key format we could link to?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r225642809
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,55 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table.
    +     * Stops reading partitions upon exceeding the cache size limit by logging a warning.
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    +                                             ConsistencyLevel.ONE);
    +        }
    +        catch (Exception e)
    +        {
    +            logger.error("Error querying blacklisted partitions", e);
    +            return Collections.emptySet();
    +        }
    +
    +        Set<BlacklistedPartition> blacklistedPartitions = new HashSet<>();
    +        int cacheSize = 0;
    +        for (UntypedResultSet.Row row : results)
    +        {
    +            try
    +            {
    +                BlacklistedPartition blacklistedPartition = new BlacklistedPartition(row.getString("keyspace_name"), row.getString("columnfamily_name"), row.getString("partition_key"));
    +
    +                // check if adding this blacklisted partition would increase cache size beyond the set limit.
    +                cacheSize += blacklistedPartition.unsharedHeapSize();
    +                if (cacheSize / (1024 * 1024) > DatabaseDescriptor.getBlackListedPartitionsCacheSizeLimitInMB())
    --- End diff --
    
    makes sense!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223544527
  
    --- Diff: src/java/org/apache/cassandra/config/Config.java ---
    @@ -257,6 +257,9 @@
         public volatile int counter_cache_save_period = 7200;
         public volatile int counter_cache_keys_to_save = Integer.MAX_VALUE;
     
    +    public long blacklisted_partitions_cache_size_warn_threshold_in_mb;
    +    public int blacklisted_partitions_cache_refresh_period_in_sec = 600; //10 minutes
    --- End diff --
    
    i've already added a nodetool command to force refresh a cache incase someone adds a new blacklisted partition to be cached. I am not sure of the likelyhood of someone wanting to change the frequency of cache refresh at run time. I see your point of reducing the config changes, but I am honestly not sure if we should shy away from adding a config.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r227864610
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,55 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table.
    +     * Stops reading partitions upon exceeding the cache size limit by logging a warning.
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    --- End diff --
    
    Just double checking, does this paginate?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223786322
  
    --- Diff: src/java/org/apache/cassandra/cache/BlacklistedPartition.java ---
    @@ -0,0 +1,119 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import java.nio.ByteBuffer;
    +import java.util.Arrays;
    +
    +import org.apache.cassandra.db.DecoratedKey;
    +import org.apache.cassandra.schema.KeyspaceMetadata;
    +import org.apache.cassandra.schema.Schema;
    +import org.apache.cassandra.schema.TableId;
    +import org.apache.cassandra.schema.TableMetadata;
    +import org.apache.cassandra.schema.TableMetadataRef;
    +import org.apache.cassandra.utils.ByteBufferUtil;
    +import org.apache.cassandra.utils.ObjectSizes;
    +
    +/**
    + * Class to represent a blacklisted partition
    + */
    +public class BlacklistedPartition implements IMeasurableMemory
    --- End diff --
    
    To be consistent with the other caches maybe `BlacklistedPartitionCacheKey` ?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r227863003
  
    --- Diff: src/java/org/apache/cassandra/cache/BlacklistedPartitionCache.java ---
    @@ -0,0 +1,82 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import java.util.Set;
    +import java.util.concurrent.TimeUnit;
    +
    +import org.slf4j.Logger;
    +import org.slf4j.LoggerFactory;
    +
    +import org.apache.cassandra.concurrent.ScheduledExecutors;
    +import org.apache.cassandra.config.DatabaseDescriptor;
    +import org.apache.cassandra.db.DecoratedKey;
    +import org.apache.cassandra.repair.SystemDistributedKeyspace;
    +import org.apache.cassandra.schema.TableId;
    +
    +/**
    + * Cache that loads blacklisted partitions from system_distributed.blacklisted_partitions table
    + * This does not intentionally use AutoSavingCache since this is orthogonal to how AutoSavingCache works, i.e.
    --- End diff --
    
    nit: "This intentionally does not" or "This intentionally avoids AutoSavingCache"


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223791482
  
    --- Diff: src/java/org/apache/cassandra/cache/BlacklistedPartition.java ---
    @@ -0,0 +1,119 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import java.nio.ByteBuffer;
    +import java.util.Arrays;
    +
    +import org.apache.cassandra.db.DecoratedKey;
    +import org.apache.cassandra.schema.KeyspaceMetadata;
    +import org.apache.cassandra.schema.Schema;
    +import org.apache.cassandra.schema.TableId;
    +import org.apache.cassandra.schema.TableMetadata;
    +import org.apache.cassandra.schema.TableMetadataRef;
    +import org.apache.cassandra.utils.ByteBufferUtil;
    +import org.apache.cassandra.utils.ObjectSizes;
    +
    +/**
    + * Class to represent a blacklisted partition
    + */
    +public class BlacklistedPartition implements IMeasurableMemory
    +{
    +    private static final long EMPTY_SIZE = ObjectSizes.measure(new BlacklistedPartition(null, new byte[0]));
    +    public final byte[] key;
    +    public final TableId tableId;
    +
    +    public BlacklistedPartition(TableId tableId, DecoratedKey key)
    +    {
    +        this.tableId = tableId;
    +        this.key = ByteBufferUtil.getArray(key.getKey());
    +    }
    +
    +    private BlacklistedPartition(TableId tableId, byte[] key)
    +    {
    +        this.tableId = tableId;
    +        this.key = key;
    +    }
    +
    +    /**
    +     * Creates an instance of BlacklistedPartition for a given keyspace, table and partition key
    +     *
    +     * @param keyspace
    +     * @param table
    +     * @param key
    +     * @throws IllegalArgumentException
    +     */
    +    public BlacklistedPartition(String keyspace, String table, String key) throws IllegalArgumentException
    +    {
    +        // Determine tableId from keyspace and table parameters. If tableId cannot be determined due to invalid
    --- End diff --
    
    I think it makes sense for this key to throw exceptions since it comes from user input instead of database internals (e.g. the other cache keys don't check that anything is valid because it would be a bug for them not to be). You may want to add a few words here explaining that difference, e.g. `As blacklisted partitions are user provided, they could be invalid`. The other option is to have `validate` methods like most of the e.g. replication strategy options work.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230548454
  
    --- Diff: src/java/org/apache/cassandra/service/CassandraDaemon.java ---
    @@ -426,6 +426,9 @@ public void uncaughtException(Thread t, Throwable e)
             // Native transport
             nativeTransportService = new NativeTransportService();
     
    +        // load blacklisted partitions into cache
    +        StorageService.instance.refreshBlacklistedPartitionsCache();
    --- End diff --
    
    can you elaborate please? not having what?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by vinaykumarchella <gi...@git.apache.org>.
Github user vinaykumarchella commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r225632166
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,55 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table.
    +     * Stops reading partitions upon exceeding the cache size limit by logging a warning.
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    +                                             ConsistencyLevel.ONE);
    +        }
    +        catch (Exception e)
    +        {
    +            logger.error("Error querying blacklisted partitions", e);
    +            return Collections.emptySet();
    +        }
    +
    +        Set<BlacklistedPartition> blacklistedPartitions = new HashSet<>();
    +        int cacheSize = 0;
    +        for (UntypedResultSet.Row row : results)
    +        {
    +            try
    +            {
    +                BlacklistedPartition blacklistedPartition = new BlacklistedPartition(row.getString("keyspace_name"), row.getString("columnfamily_name"), row.getString("partition_key"));
    +
    +                // check if adding this blacklisted partition would increase cache size beyond the set limit.
    +                cacheSize += blacklistedPartition.unsharedHeapSize();
    +                if (cacheSize / (1024 * 1024) > DatabaseDescriptor.getBlackListedPartitionsCacheSizeLimitInMB())
    --- End diff --
    
    It makes more sense to put this logic in `BlacklistedPartitionCache` since it is specific to caching, this also makes your caching logic easily testable.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223543063
  
    --- Diff: src/java/org/apache/cassandra/cache/BlacklistedPartition.java ---
    @@ -0,0 +1,119 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import java.nio.ByteBuffer;
    +import java.util.Arrays;
    +
    +import org.apache.cassandra.db.DecoratedKey;
    +import org.apache.cassandra.schema.KeyspaceMetadata;
    +import org.apache.cassandra.schema.Schema;
    +import org.apache.cassandra.schema.TableId;
    +import org.apache.cassandra.schema.TableMetadata;
    +import org.apache.cassandra.schema.TableMetadataRef;
    +import org.apache.cassandra.utils.ByteBufferUtil;
    +import org.apache.cassandra.utils.ObjectSizes;
    +
    +/**
    + * Class to represent a blacklisted partition
    + */
    +public class BlacklistedPartition implements IMeasurableMemory
    +{
    +    private static final long EMPTY_SIZE = ObjectSizes.measure(new BlacklistedPartition(null, new byte[0]));
    +    public final byte[] key;
    +    public final TableId tableId;
    +
    +    public BlacklistedPartition(TableId tableId, DecoratedKey key)
    +    {
    +        this.tableId = tableId;
    +        this.key = ByteBufferUtil.getArray(key.getKey());
    +    }
    +
    +    private BlacklistedPartition(TableId tableId, byte[] key)
    +    {
    +        this.tableId = tableId;
    +        this.key = key;
    +    }
    +
    +    /**
    +     * Creates an instance of BlacklistedPartition for a given keyspace, table and partition key
    +     *
    +     * @param keyspace
    +     * @param table
    +     * @param key
    +     * @throws IllegalArgumentException
    +     */
    +    public BlacklistedPartition(String keyspace, String table, String key) throws IllegalArgumentException
    +    {
    +        // Determine tableId from keyspace and table parameters. If tableId cannot be determined due to invalid
    +        // parameters, throw an exception
    +        KeyspaceMetadata ksMetaData = Schema.instance.getKeyspaceMetadata(keyspace);
    +        if (ksMetaData == null)
    +        {
    +            throw new IllegalArgumentException("Unknown keyspace '" + keyspace + "'");
    +        }
    +
    +        TableMetadata metadata = ksMetaData.getTableOrViewNullable(table);
    --- End diff --
    
    yeah, I can replace null check on metadata with hasTable, however, I will still need table metadata to get the table Id, so will keep the table metadata null check.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230548384
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,55 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table.
    +     * Stops reading partitions upon exceeding the cache size limit by logging a warning.
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    +                                             ConsistencyLevel.ONE);
    +        }
    +        catch (Exception e)
    +        {
    +            logger.error("Error querying blacklisted partitions", e);
    +            return Collections.emptySet();
    +        }
    +
    +        Set<BlacklistedPartition> blacklistedPartitions = new HashSet<>();
    +        int cacheSize = 0;
    --- End diff --
    
    done


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by vinaykumarchella <gi...@git.apache.org>.
Github user vinaykumarchella commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223517462
  
    --- Diff: src/java/org/apache/cassandra/config/Config.java ---
    @@ -257,6 +257,9 @@
         public volatile int counter_cache_save_period = 7200;
         public volatile int counter_cache_keys_to_save = Integer.MAX_VALUE;
     
    +    public long blacklisted_partitions_cache_size_warn_threshold_in_mb;
    +    public int blacklisted_partitions_cache_refresh_period_in_sec = 600; //10 minutes
    --- End diff --
    
    **Idea:** Do we need to expose this? If we can expose a nodetool and JMX interfaces, this extra config can be avoided. If anyone is interested in refreshing the cache quickly after adding new blacklisted partitions, they can use these interfaces. I am just trying to see if we can reduce the number of configs and keep it simple.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by vinaykumarchella <gi...@git.apache.org>.
Github user vinaykumarchella commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r225629052
  
    --- Diff: doc/source/operating/blacklisting_partitions.rst ---
    @@ -0,0 +1,67 @@
    +.. Licensed to the Apache Software Foundation (ASF) under one
    +.. or more contributor license agreements.  See the NOTICE file
    +.. distributed with this work for additional information
    +.. regarding copyright ownership.  The ASF licenses this file
    +.. to you under the Apache License, Version 2.0 (the
    +.. "License"); you may not use this file except in compliance
    +.. with the License.  You may obtain a copy of the License at
    +..
    +..     http://www.apache.org/licenses/LICENSE-2.0
    +..
    +.. Unless required by applicable law or agreed to in writing, software
    +.. distributed under the License is distributed on an "AS IS" BASIS,
    +.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +.. See the License for the specific language governing permissions and
    +.. limitations under the License.
    +
    +.. highlight:: none
    +
    +
    +Blacklisting Partitions (CASSANDRA-12106)
    +-----------------------------------------
    +
    +This feature allows partition keys to be blacklisted i.e. Cassandra would not process following operations on those
    +blacklisted partitions.
    +
    +- Point READs
    +
    +Response would be InvalidQueryException.
    +
    +It is important to note that, this would not have any effect on range reads or write operations.
    +
    +How to blacklist a partition key
    +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    +``system_distributed.blacklisted_partitions`` table can be used to blacklist partitions. Essentially, you will have to
    +insert a new row into the table with the following details:
    +
    +- Keyspace name
    +- Table name
    +- Partition key
    +
    +The way partition key needs to be mentioned is exactly similar to how ``nodetool getendpoints`` works.
    +Following are several examples for blacklisting partition keys in keyspace, say ks, and table, say table1 for different
    +possibilities of table1 partition key:
    +- For a single column partition key, say Id,
    +  Id is a simple type - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','1');
    +  Id is a blob        - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','12345f');
    +  Id has a colon      - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','1\:2');
    +
    +- For a composite column partition key, say (Key1, Key2),
    +  insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name, partition_key) VALUES ('ks','table1','k11:k21');
    +
    +BlacklistedPartitions Cache
    +^^^^^^^^^^^^^^^^^^^^^^^^^^^
    +Cassandra internally maintains an on-heap cache that loads partition keys from ``system_distributed.blacklisted_partitions``.
    +Any partition key mentioned against a non-existent keyspace name and table name will not be cached.
    +
    +Cache gets refreshed in following ways
    +- During Cassandra start up
    +- Scheduled cache refresh at a default interval of 10 minutes (can be overridden using ``blacklisted_partitions_cache_refresh_period_in_sec`` yaml property)
    +- Using ``nodetool blacklistedpartitions --refresh-cache``
    +
    +Cache size is bounded, set by ``blacklisted_partitions_cache_size_limit_in_mb`` yaml property (defaulted to 100 MB).
    +As cache loads blacklisted partitions from ``system_distributed.blacklisted_partitions``, it checks if cache size exceeds the size limit.
    +The moment cache size exceeeds the size limit, no more partitions are loaded into the cache, and only those partitions that have been loaded so far would be blacklisted.
    +A warning would be logged to indicate the same.
    +Size limit on the cache can be changed using the yaml property, or dynamically using below nodetool command
    +``nodetool blacklistedpartitions --cache-size-limit-in-mb=150 --refreshcache``
    --- End diff --
    
    Can you also add `Limitations/ Gotchas` section explaining that it will not block range reads and mutations etc.,


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223544939
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -1545,6 +1546,18 @@ public static PartitionIterator read(SinglePartitionReadCommand.Group group, Con
                 throw new IsBootstrappingException();
             }
     
    +        // check if the partition in question is blacklisted; if yes, reject READ operation
    +        if (BlacklistedPartitionCache.instance.size() > 0)
    +        {
    +            for (SinglePartitionReadQuery query : group.queries)
    +            {
    +                if (BlacklistedPartitionCache.instance.contains(query.metadata().id, query.partitionKey()))
    +                {
    +                    throw new InvalidRequestException("Cannot perform READ on a blacklisted partition");
    --- End diff --
    
    good point. added.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223789839
  
    --- Diff: src/java/org/apache/cassandra/cache/BlacklistedPartition.java ---
    @@ -0,0 +1,119 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import java.nio.ByteBuffer;
    +import java.util.Arrays;
    +
    +import org.apache.cassandra.db.DecoratedKey;
    +import org.apache.cassandra.schema.KeyspaceMetadata;
    +import org.apache.cassandra.schema.Schema;
    +import org.apache.cassandra.schema.TableId;
    +import org.apache.cassandra.schema.TableMetadata;
    +import org.apache.cassandra.schema.TableMetadataRef;
    +import org.apache.cassandra.utils.ByteBufferUtil;
    +import org.apache.cassandra.utils.ObjectSizes;
    +
    +/**
    + * Class to represent a blacklisted partition
    + */
    +public class BlacklistedPartition implements IMeasurableMemory
    +{
    +    private static final long EMPTY_SIZE = ObjectSizes.measure(new BlacklistedPartition(null, new byte[0]));
    +    public final byte[] key;
    +    public final TableId tableId;
    +
    +    public BlacklistedPartition(TableId tableId, DecoratedKey key)
    +    {
    +        this.tableId = tableId;
    +        this.key = ByteBufferUtil.getArray(key.getKey());
    +    }
    +
    +    private BlacklistedPartition(TableId tableId, byte[] key)
    +    {
    +        this.tableId = tableId;
    +        this.key = key;
    +    }
    +
    +    /**
    +     * Creates an instance of BlacklistedPartition for a given keyspace, table and partition key
    +     *
    +     * @param keyspace
    +     * @param table
    +     * @param key
    +     * @throws IllegalArgumentException
    +     */
    +    public BlacklistedPartition(String keyspace, String table, String key) throws IllegalArgumentException
    +    {
    +        // Determine tableId from keyspace and table parameters. If tableId cannot be determined due to invalid
    +        // parameters, throw an exception
    +        KeyspaceMetadata ksMetaData = Schema.instance.getKeyspaceMetadata(keyspace);
    +        if (ksMetaData == null)
    +        {
    +            throw new IllegalArgumentException("Unknown keyspace '" + keyspace + "'");
    +        }
    +
    +        TableMetadata metadata = ksMetaData.getTableOrViewNullable(table);
    +        if (metadata == null)
    +        {
    +            throw new IllegalArgumentException("Unknown table '" + table + "' in keyspace '" + keyspace + "'");
    +        }
    +
    +        ByteBuffer keyAsBytes = metadata.partitionKeyType.fromString(key);
    +        this.tableId = metadata.id;
    +        this.key = keyAsBytes.array();
    +    }
    +
    +    /**
    +     * returns size in bytes of BlacklistedPartition instance
    +     *
    +     * @return
    +     */
    +    public long unsharedHeapSize()
    +    {
    +        return EMPTY_SIZE + ObjectSizes.sizeOf(tableId.toString()) + ObjectSizes.sizeOfArray(key);
    --- End diff --
    
    I am not sure you have to account for the size of the `tableId.toString()`, looking at `RowCacheKey` it just accounts for that as part of the empty size. What am I missing ...?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r234413324
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,55 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table.
    +     * Stops reading partitions upon exceeding the cache size limit by logging a warning.
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    --- End diff --
    
    execute() call can be replaced with executeInternalWithPaging(), but with paging or otherwise, we will still end up holding entire result set in memory through "results" variable here, isn't it? You think it could still help using paging?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223781363
  
    --- Diff: doc/source/operating/blacklisting_partitions.rst ---
    @@ -0,0 +1,63 @@
    +.. Licensed to the Apache Software Foundation (ASF) under one
    +.. or more contributor license agreements.  See the NOTICE file
    +.. distributed with this work for additional information
    +.. regarding copyright ownership.  The ASF licenses this file
    +.. to you under the Apache License, Version 2.0 (the
    +.. "License"); you may not use this file except in compliance
    +.. with the License.  You may obtain a copy of the License at
    +..
    +..     http://www.apache.org/licenses/LICENSE-2.0
    +..
    +.. Unless required by applicable law or agreed to in writing, software
    +.. distributed under the License is distributed on an "AS IS" BASIS,
    +.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +.. See the License for the specific language governing permissions and
    +.. limitations under the License.
    +
    +.. highlight:: none
    +
    +
    +Blacklisting Partitions (CASSANDRA-12106)
    +-----------------------------------------
    +
    +This feature allows partition keys to be blacklisted i.e. Cassandra would not process following operations on those
    +blacklisted partitions.
    +
    +- Point READs
    +
    +Response would be InvalidQueryException.
    +
    +It is important to note that, this would not have any effect on range reads or write operations.
    +
    +How to blacklist a partition key
    +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    +``system_distributed.blacklisted_partitions`` table can be used to blacklist partitions. Essentially, you will have to
    +insert a new row into the table with the following details:
    +
    +- Keyspace name
    +- Table name
    +- Partition key
    +
    +The way partition key needs to be mentioned is exactly similar to how ``nodetool getendpoints`` works.
    +Following are several examples for blacklisting partition keys in keyspace, say ks, and table, say table1 for different
    +possibilities of table1 partition key:
    +- For a single column partition key, say Id,
    +  Id is a simple type - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','1');
    +  Id is a blob        - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','12345f');
    +  Id has a colon      - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','1\:2');
    +
    +- For a composite column partition key, say (Key1, Key2),
    +  insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name, partition_key) VALUES ('ks','table1','k11:k21');
    +
    +BlacklistedPartitions Cache
    +^^^^^^^^^^^^^^^^^^^^^^^^^^^
    +Cassandra internally maintains an on-heap cache that loads partition keys from ``system_distributed.blacklisted_partitions``.
    +Any partition key mentioned against a non-existent keyspace name and table name will not be cached.
    +
    +Cache gets refreshed in following ways
    +- During Cassandra start up
    +- Scheduled cache refresh at a default interval of 10 minutes (can be overridden using ``blacklisted_partitions_cache_refresh_period_in_sec`` yaml property)
    +- Using nodetool refreshblacklistedpartitionscache
    +
    +There is no bound on how much on-heap memory this cache can occupy - so be cautious on how many keys you would want to blacklist.
    +``blacklisted_partitions_cache_size_warn_threshold_in_mb`` yaml property can be used to be notified (via warning logs) if cache size exceeds the set threshold.
    --- End diff --
    
    I think linking to the cassandra.yaml documentation makes sense here.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230548153
  
    --- Diff: doc/source/operating/blacklisting_partitions.rst ---
    @@ -0,0 +1,63 @@
    +.. Licensed to the Apache Software Foundation (ASF) under one
    +.. or more contributor license agreements.  See the NOTICE file
    +.. distributed with this work for additional information
    +.. regarding copyright ownership.  The ASF licenses this file
    +.. to you under the Apache License, Version 2.0 (the
    +.. "License"); you may not use this file except in compliance
    +.. with the License.  You may obtain a copy of the License at
    +..
    +..     http://www.apache.org/licenses/LICENSE-2.0
    +..
    +.. Unless required by applicable law or agreed to in writing, software
    +.. distributed under the License is distributed on an "AS IS" BASIS,
    +.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +.. See the License for the specific language governing permissions and
    +.. limitations under the License.
    +
    +.. highlight:: none
    +
    +
    +Blacklisting Partitions (CASSANDRA-12106)
    +-----------------------------------------
    +
    +This feature allows partition keys to be blacklisted i.e. Cassandra would not process following operations on those
    +blacklisted partitions.
    +
    +- Point READs
    +
    +Response would be InvalidQueryException.
    +
    +It is important to note that, this would not have any effect on range reads or write operations.
    +
    +How to blacklist a partition key
    +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    +``system_distributed.blacklisted_partitions`` table can be used to blacklist partitions. Essentially, you will have to
    +insert a new row into the table with the following details:
    +
    +- Keyspace name
    +- Table name
    +- Partition key
    +
    +The way partition key needs to be mentioned is exactly similar to how ``nodetool getendpoints`` works.
    +Following are several examples for blacklisting partition keys in keyspace, say ks, and table, say table1 for different
    +possibilities of table1 partition key:
    +- For a single column partition key, say Id,
    +  Id is a simple type - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','1');
    +  Id is a blob        - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','12345f');
    +  Id has a colon      - insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name ,partition_key) VALUES ('ks','table1','1\:2');
    +
    +- For a composite column partition key, say (Key1, Key2),
    +  insert into system_distributed.blacklisted_partitions (keyspace_name ,columnfamily_name, partition_key) VALUES ('ks','table1','k11:k21');
    +
    +BlacklistedPartitions Cache
    +^^^^^^^^^^^^^^^^^^^^^^^^^^^
    +Cassandra internally maintains an on-heap cache that loads partition keys from ``system_distributed.blacklisted_partitions``.
    +Any partition key mentioned against a non-existent keyspace name and table name will not be cached.
    +
    +Cache gets refreshed in following ways
    +- During Cassandra start up
    +- Scheduled cache refresh at a default interval of 10 minutes (can be overridden using ``blacklisted_partitions_cache_refresh_period_in_sec`` yaml property)
    +- Using nodetool refreshblacklistedpartitionscache
    +
    +There is no bound on how much on-heap memory this cache can occupy - so be cautious on how many keys you would want to blacklist.
    +``blacklisted_partitions_cache_size_warn_threshold_in_mb`` yaml property can be used to be notified (via warning logs) if cache size exceeds the set threshold.
    --- End diff --
    
    what cassandra.yaml documentation are you referring to? also, this section has been updated. will be pushing new changes soon


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230548390
  
    --- Diff: doc/source/operating/blacklisting_partitions.rst ---
    @@ -0,0 +1,63 @@
    +.. Licensed to the Apache Software Foundation (ASF) under one
    +.. or more contributor license agreements.  See the NOTICE file
    +.. distributed with this work for additional information
    +.. regarding copyright ownership.  The ASF licenses this file
    +.. to you under the Apache License, Version 2.0 (the
    +.. "License"); you may not use this file except in compliance
    +.. with the License.  You may obtain a copy of the License at
    +..
    +..     http://www.apache.org/licenses/LICENSE-2.0
    +..
    +.. Unless required by applicable law or agreed to in writing, software
    +.. distributed under the License is distributed on an "AS IS" BASIS,
    +.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +.. See the License for the specific language governing permissions and
    +.. limitations under the License.
    +
    +.. highlight:: none
    +
    +
    +Blacklisting Partitions (CASSANDRA-12106)
    +-----------------------------------------
    +
    +This feature allows partition keys to be blacklisted i.e. Cassandra would not process following operations on those
    --- End diff --
    
    done


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r230548088
  
    --- Diff: doc/source/operating/blacklisting_partitions.rst ---
    @@ -0,0 +1,63 @@
    +.. Licensed to the Apache Software Foundation (ASF) under one
    +.. or more contributor license agreements.  See the NOTICE file
    +.. distributed with this work for additional information
    +.. regarding copyright ownership.  The ASF licenses this file
    +.. to you under the Apache License, Version 2.0 (the
    +.. "License"); you may not use this file except in compliance
    +.. with the License.  You may obtain a copy of the License at
    +..
    +..     http://www.apache.org/licenses/LICENSE-2.0
    +..
    +.. Unless required by applicable law or agreed to in writing, software
    +.. distributed under the License is distributed on an "AS IS" BASIS,
    +.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +.. See the License for the specific language governing permissions and
    +.. limitations under the License.
    +
    +.. highlight:: none
    +
    +
    +Blacklisting Partitions (CASSANDRA-12106)
    +-----------------------------------------
    +
    +This feature allows partition keys to be blacklisted i.e. Cassandra would not process following operations on those
    +blacklisted partitions.
    +
    +- Point READs
    +
    +Response would be InvalidQueryException.
    +
    +It is important to note that, this would not have any effect on range reads or write operations.
    +
    +How to blacklist a partition key
    +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    +``system_distributed.blacklisted_partitions`` table can be used to blacklist partitions. Essentially, you will have to
    +insert a new row into the table with the following details:
    +
    +- Keyspace name
    +- Table name
    +- Partition key
    +
    +The way partition key needs to be mentioned is exactly similar to how ``nodetool getendpoints`` works.
    --- End diff --
    
    not that I could find. There is getendpoints.txt, but that doesn't go into required details around partition key


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r223545254
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,42 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    --- End diff --
    
    There is no restriction on how much heap the cache could take up; this is to make the blacklisting deterministic. Hoping the explicit mentions about this in the documentation I put in place, and the warn threshold on the cache size to help.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r234413081
  
    --- Diff: src/java/org/apache/cassandra/service/CassandraDaemon.java ---
    @@ -426,6 +426,9 @@ public void uncaughtException(Thread t, Throwable e)
             // Native transport
             nativeTransportService = new NativeTransportService();
     
    +        // load blacklisted partitions into cache
    +        StorageService.instance.refreshBlacklistedPartitionsCache();
    --- End diff --
    
    yeah, I think the first node of the new version would create the table. Either way, it wouldn't be a catastrophic failure for server start since possible failure during reading blacklisted partitions is handled.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by jolynch <gi...@git.apache.org>.
Github user jolynch commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r227865693
  
    --- Diff: test/unit/org/apache/cassandra/cache/BlacklistedPartitionsCacheTest.java ---
    @@ -0,0 +1,175 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.cache;
    +
    +import org.junit.Assert;
    +import org.junit.BeforeClass;
    +import org.junit.Test;
    +
    +import com.datastax.driver.core.PreparedStatement;
    +import com.datastax.driver.core.ResultSet;
    +import com.datastax.driver.core.Session;
    +import com.datastax.driver.core.exceptions.InvalidQueryException;
    +import org.apache.cassandra.cql3.CQLTester;
    +import org.apache.cassandra.repair.SystemDistributedKeyspace;
    +import org.apache.cassandra.schema.SchemaConstants;
    +import org.apache.cassandra.utils.ByteBufferUtil;
    +
    +public class BlacklistedPartitionsCacheTest extends CQLTester
    --- End diff --
    
    Can we add a test where reading the blacklist fails (e.g. due to mixed version clusters) just to make sure we handle that issue?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #277: 12106 - blacklisting bad partitions for point r...

Posted by sumanth-pasupuleti <gi...@git.apache.org>.
Github user sumanth-pasupuleti commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/277#discussion_r227077676
  
    --- Diff: src/java/org/apache/cassandra/repair/SystemDistributedKeyspace.java ---
    @@ -307,6 +320,55 @@ public static void setViewRemoved(String keyspaceName, String viewName)
             forceBlockingFlush(VIEW_BUILD_STATUS);
         }
     
    +    /**
    +     * Reads blacklisted partitions from system_distributed.blacklisted_partitions table.
    +     * Stops reading partitions upon exceeding the cache size limit by logging a warning.
    +     * @return
    +     */
    +    public static Set<BlacklistedPartition> getBlacklistedPartitions()
    +    {
    +        String query = "SELECT keyspace_name, columnfamily_name, partition_key FROM %s.%s";
    +        UntypedResultSet results;
    +        try
    +        {
    +            results = QueryProcessor.execute(format(query, SchemaConstants.DISTRIBUTED_KEYSPACE_NAME, BLACKLISTED_PARTITIONS),
    +                                             ConsistencyLevel.ONE);
    +        }
    +        catch (Exception e)
    +        {
    +            logger.error("Error querying blacklisted partitions", e);
    +            return Collections.emptySet();
    +        }
    +
    +        Set<BlacklistedPartition> blacklistedPartitions = new HashSet<>();
    +        int cacheSize = 0;
    +        for (UntypedResultSet.Row row : results)
    +        {
    +            try
    +            {
    +                BlacklistedPartition blacklistedPartition = new BlacklistedPartition(row.getString("keyspace_name"), row.getString("columnfamily_name"), row.getString("partition_key"));
    +
    +                // check if adding this blacklisted partition would increase cache size beyond the set limit.
    +                cacheSize += blacklistedPartition.unsharedHeapSize();
    +                if (cacheSize / (1024 * 1024) > DatabaseDescriptor.getBlackListedPartitionsCacheSizeLimitInMB())
    +                {
    +                    logger.warn("BlacklistedPartitions cache size limit of {} MB reached. Unable to load more blacklisted partitions. BlacklistedPartitions cache working in degraded mode.", DatabaseDescriptor.getBlackListedPartitionsCacheSizeLimitInMB());
    --- End diff --
    
    done


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org