Posted to pr@cassandra.apache.org by bdeggleston <gi...@git.apache.org> on 2018/05/09 19:06:57 UTC

[GitHub] cassandra pull request #224: 14405 replicas

GitHub user bdeggleston opened a pull request:

    https://github.com/apache/cassandra/pull/224

    14405 replicas

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bdeggleston/cassandra 14405-replicas

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cassandra/pull/224.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #224
    
----
commit 7cacd9d3cf1b9eceea0420bee412ce571f104d7a
Author: Blake Eggleston <bd...@...>
Date:   2018-04-03T22:06:39Z

    adding Replica and ReplicatedRange classes

commit 34b6ed98141ad3820a465d1fdcc513fa83364051
Author: Blake Eggleston <bd...@...>
Date:   2018-04-18T15:15:43Z

    fixing replica equality

commit 66768f269d19cffba83c647a91bb89c7dab0b1de
Author: Blake Eggleston <bd...@...>
Date:   2018-04-18T15:16:15Z

    adding transient replication enable to config

commit 19abd4362bddf1f12c7dcd0f90e63173f1c0d780
Author: Blake Eggleston <bd...@...>
Date:   2018-04-18T15:16:23Z

    adding RF tests

commit f269149a9d001c4d15e8e6472488817b5b35886c
Author: Blake Eggleston <bd...@...>
Date:   2018-04-18T17:31:18Z

    updating rf

commit c10309b76a428d7b306fa64920827451f7cc20a9
Author: Blake Eggleston <bd...@...>
Date:   2018-04-18T17:31:26Z

    adding initial nts support

commit 92b69489a8277f943a26f0bc0a618f47578dbd18
Author: Blake Eggleston <bd...@...>
Date:   2018-04-18T18:06:45Z

    renaming endpoint specific names

commit 77e18845553e45f322a01f06af2a7efbf5cf79ea
Author: Blake Eggleston <bd...@...>
Date:   2018-04-24T17:06:50Z

    adding simple strategy support

commit 19cefdaefd0fe96b7fb25fd9d6d9aedc7cd49517
Author: Blake Eggleston <bd...@...>
Date:   2018-04-24T17:58:23Z

    prevent mixing of transient replication and MV/2i

commit 676066689d84b382916022b005f71d48be85c2ac
Author: Blake Eggleston <bd...@...>
Date:   2018-04-25T22:00:56Z

    fixing some tests

commit 514eab87ba2939d27163dc793286c05a25a3ce2e
Author: Blake Eggleston <bd...@...>
Date:   2018-04-26T16:17:28Z

    more test fixing

commit f9fecbe714d78b00ba004dc843e577115674a3f2
Author: Blake Eggleston <bd...@...>
Date:   2018-04-26T22:11:43Z

    fixing hint dtest

commit 83bfe801e8fa75769866530387680150552385b7
Author: Blake Eggleston <bd...@...>
Date:   2018-04-26T22:25:43Z

    fixing rf comparison

commit fdc100e459927a09aef0ad818baa0c7399c3c6c5
Author: Blake Eggleston <bd...@...>
Date:   2018-04-27T17:20:28Z

    adding range to replica

commit 1f8ba51bad94e5d3b89555d4bb8a2cc70338d548
Author: Blake Eggleston <bd...@...>
Date:   2018-04-27T19:50:23Z

    fixing move test

commit d5b5fe631d42c4207e0983f688d10cdc1885ea7a
Author: Blake Eggleston <bd...@...>
Date:   2018-04-27T21:05:04Z

    merging Replica and ReplicatedRange

commit bf97410f68d721c175c7b3f1b75990a23f8d525f
Author: Blake Eggleston <bd...@...>
Date:   2018-04-30T22:27:55Z

    fixing range tests

commit 60ab840241777e2423809807ea3b7159346ef240
Author: Blake Eggleston <bd...@...>
Date:   2018-05-01T00:38:53Z

    more test fixing

commit 5d84ba5c57d8853e21f23574840760c67d60cf83
Author: Blake Eggleston <bd...@...>
Date:   2018-05-02T00:13:31Z

    fixing srp test, misc adjustments

commit f83d6ec4c8e7853372a9443444e88b62ae80c6e1
Author: Blake Eggleston <bd...@...>
Date:   2018-05-02T16:21:38Z

    fixing concurrent list modification

commit 1c98a43823ff313f09b2e54736b7ebcf5ab2a1f6
Author: Blake Eggleston <bd...@...>
Date:   2018-05-02T22:38:55Z

    fixing MV test

commit 1e8a17cd5fa525ffc0d6a4b5d441d424320947d5
Author: Blake Eggleston <bd...@...>
Date:   2018-05-02T23:45:17Z

    renaming replicas to replica helpers

commit e2742dbe03ab1994991e8cd6ff62ef2a5dfcf4a3
Author: Blake Eggleston <bd...@...>
Date:   2018-05-03T17:52:41Z

    adding replica aware collections

commit e35be2d0372a62e0748882ad31f717c585b0d892
Author: Blake Eggleston <bd...@...>
Date:   2018-05-03T21:29:18Z

    moving stuff out of ReplicaHelpers

commit 58010ee0f7ec173b7d14c126ad6792d64bedccb5
Author: Blake Eggleston <bd...@...>
Date:   2018-05-04T00:21:17Z

    finishing move away from replica helpers

commit 689043c52829a2b0fd94477781393006135a720b
Author: Blake Eggleston <bd...@...>
Date:   2018-05-04T20:47:52Z

    undoing multimap changes

commit 1cee66b4fec11d298ca20873e3f42580eeaed18b
Author: Blake Eggleston <bd...@...>
Date:   2018-05-04T22:12:50Z

    fixing tests

commit 090ca3d289112ad18cd014ea1803fa0620413180
Author: Blake Eggleston <bd...@...>
Date:   2018-05-04T22:48:10Z

    fixing decommission

commit c3c714203a7b390b8ec0f980a9e17da49b85aa91
Author: Blake Eggleston <bd...@...>
Date:   2018-05-08T22:04:58Z

    adding config and docs

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188690076
  
    --- Diff: src/java/org/apache/cassandra/service/WriteResponseHandler.java ---
    @@ -42,26 +44,26 @@
         private static final AtomicIntegerFieldUpdater<WriteResponseHandler> responsesUpdater
                 = AtomicIntegerFieldUpdater.newUpdater(WriteResponseHandler.class, "responses");
     
    -    public WriteResponseHandler(Collection<InetAddressAndPort> writeEndpoints,
    -                                Collection<InetAddressAndPort> pendingEndpoints,
    +    public WriteResponseHandler(Replicas writeReplicas,
    +                                Replicas pendingReplicas,
                                     ConsistencyLevel consistencyLevel,
                                     Keyspace keyspace,
                                     Runnable callback,
                                     WriteType writeType,
                                     long queryStartNanoTime)
         {
    -        super(keyspace, writeEndpoints, pendingEndpoints, consistencyLevel, callback, writeType, queryStartNanoTime);
    +        super(keyspace, writeReplicas, pendingReplicas, consistencyLevel, callback, writeType, queryStartNanoTime);
             responses = totalBlockFor();
         }
     
    -    public WriteResponseHandler(InetAddressAndPort endpoint, WriteType writeType, Runnable callback, long queryStartNanoTime)
    +    public WriteResponseHandler(Replica replica, WriteType writeType, Runnable callback, long queryStartNanoTime)
         {
    -        this(Arrays.asList(endpoint), Collections.<InetAddressAndPort>emptyList(), ConsistencyLevel.ONE, null, callback, writeType, queryStartNanoTime);
    +        this(new ReplicaList(Collections.singleton(replica)), new ReplicaList(), ConsistencyLevel.ONE, null, callback, writeType, queryStartNanoTime);
    --- End diff --
    
    This allocates an extra singleton collection, and the second replica list could be a shared immutable empty list instead of a new allocation.
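
A minimal sketch of the allocation point, assuming plain `java.util` collections as stand-ins, since the API of `ReplicaList` is not shown in this thread. The class `EmptyListSketch`, its nested `Replica`, and the helpers `single` and `none` are hypothetical names used only for illustration.

```java
import java.util.Collections;
import java.util.List;

final class EmptyListSketch
{
    // Hypothetical stand-in for the PR's Replica type.
    static final class Replica
    {
        final String endpoint;

        Replica(String endpoint)
        {
            this.endpoint = endpoint;
        }
    }

    // Wraps the single element directly: no intermediate singleton set
    // that is then copied into a second collection.
    static List<Replica> single(Replica replica)
    {
        return Collections.singletonList(replica);
    }

    // Collections.emptyList() is immutable and does not need to create
    // a new list object per call.
    static List<Replica> none()
    {
        return Collections.emptyList();
    }

    public static void main(String[] args)
    {
        List<Replica> writeReplicas = single(new Replica("127.0.0.1"));
        List<Replica> pendingReplicas = none();
        System.out.println(writeReplicas.size() + " write, " + pendingReplicas.size() + " pending");
    }
}
```

Whether the PR's own collection types can expose an equivalent shared empty instance depends on whether callers ever mutate the pending list; that is an assumption to check against the actual code.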




[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189092456
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaMultimap.java ---
    @@ -0,0 +1,121 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.AbstractMap;
    +import java.util.HashMap;
    +import java.util.Iterator;
    +import java.util.LinkedList;
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Set;
    +
    +import com.google.common.collect.Iterables;
    +
    +public abstract class ReplicaMultimap<K, V extends Replicas>
    --- End diff --
    
    fixed




[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188746162
  
    --- Diff: src/java/org/apache/cassandra/locator/TokenMetadata.java ---
    @@ -904,38 +905,38 @@ private static PendingRangeMaps calculatePendingRanges(AbstractReplicationStrate
             for (Pair<Token, InetAddressAndPort> moving : movingEndpoints)
             {
                 //Calculate all the ranges which will could be affected. This will include the ranges before and after the move.
    -            Set<Range<Token>> moveAffectedRanges = new HashSet<>();
    +            Set<Replica> moveAffectedReplicas = new HashSet<>();
    --- End diff --
    
    We need to think carefully about this use of Set<Replica>. A move is the case where endpoints and transientness change but ranges might not. That is where using sets and set difference over all three attributes breaks down.
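
The concern can be illustrated with a small sketch, assuming a simplified stand-in for `Replica` whose equality covers endpoint, range, and fullness, as the `equals` quoted elsewhere in this thread does. Endpoints and ranges are plain strings here, and `ReplicaDiffSketch`, `difference`, and `rangeDifference` are hypothetical names, not part of the PR.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class ReplicaDiffSketch
{
    // Hypothetical, simplified stand-in for the PR's Replica.
    static final class Replica
    {
        final String endpoint;
        final String range;
        final boolean full;

        Replica(String endpoint, String range, boolean full)
        {
            this.endpoint = endpoint;
            this.range = range;
            this.full = full;
        }

        @Override
        public boolean equals(Object o)
        {
            if (this == o) return true;
            if (!(o instanceof Replica)) return false;
            Replica that = (Replica) o;
            return full == that.full && endpoint.equals(that.endpoint) && range.equals(that.range);
        }

        @Override
        public int hashCode()
        {
            return Objects.hash(endpoint, range, full);
        }
    }

    // Set difference keyed on all three attributes.
    static Set<Replica> difference(Set<Replica> before, Set<Replica> after)
    {
        Set<Replica> result = new HashSet<>(before);
        result.removeAll(after);
        return result;
    }

    // Set difference keyed on ranges only.
    static Set<String> rangeDifference(Set<Replica> before, Set<Replica> after)
    {
        Set<String> result = new HashSet<>();
        for (Replica r : before) result.add(r.range);
        for (Replica r : after) result.remove(r.range);
        return result;
    }

    public static void main(String[] args)
    {
        Set<Replica> before = new HashSet<>();
        before.add(new Replica("10.0.0.1", "(0,100]", true));
        Set<Replica> after = new HashSet<>();
        after.add(new Replica("10.0.0.1", "(0,100]", false));

        // Only fullness changed, yet the three-attribute difference reports a change.
        System.out.println(difference(before, after).size());      // prints 1
        System.out.println(rangeDifference(before, after).size()); // prints 0
    }
}
```

In this sketch the range-only difference sees no change, while the difference over all three attributes treats the same range as affected, which is the kind of mismatch the comment warns about.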




[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197130894
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaMultimap.java ---
    @@ -0,0 +1,132 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.AbstractMap;
    +import java.util.HashMap;
    +import java.util.LinkedList;
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +public abstract class ReplicaMultimap<K, V extends Replicas>
    +{
    +    Map<K, V> map = new HashMap<>();
    --- End diff --
    
    I think we can make it final.




[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197136522
  
    --- Diff: src/java/org/apache/cassandra/service/StorageService.java ---
    @@ -4231,53 +4211,53 @@ private void calculateToFromStreams(Collection<Token> newTokens, List<String> ke
                 InetAddressAndPort localAddress = FBUtilities.getBroadcastAddressAndPort();
                 IEndpointSnitch snitch = DatabaseDescriptor.getEndpointSnitch();
                 TokenMetadata tokenMetaCloneAllSettled = tokenMetadata.cloneAfterAllSettled();
    -            // clone to avoid concurrent modification in calculateNaturalEndpoints
    +            // clone to avoid concurrent modification in calculateNaturalReplicas
                 TokenMetadata tokenMetaClone = tokenMetadata.cloneOnlyTokenMap();
     
                 for (String keyspace : keyspaceNames)
                 {
                     // replication strategy of the current keyspace
                     AbstractReplicationStrategy strategy = Keyspace.open(keyspace).getReplicationStrategy();
    -                Multimap<InetAddressAndPort, Range<Token>> endpointToRanges = strategy.getAddressRanges();
    +                ReplicaMultimap<InetAddressAndPort, ReplicaSet> endpointToRanges = strategy.getAddressReplicas();
     
                     logger.debug("Calculating ranges to stream and request for keyspace {}", keyspace);
                     for (Token newToken : newTokens)
                     {
                         // getting collection of the currently used ranges by this keyspace
    -                    Collection<Range<Token>> currentRanges = endpointToRanges.get(localAddress);
    +                    ReplicaSet currentReplicas = endpointToRanges.get(localAddress);
                         // collection of ranges which this node will serve after move to the new token
    -                    Collection<Range<Token>> updatedRanges = strategy.getPendingAddressRanges(tokenMetaClone, newToken, localAddress);
    +                    ReplicaSet updatedReplicas = strategy.getPendingAddressRanges(tokenMetaClone, newToken, localAddress);
     
                         // ring ranges and endpoints associated with them
                         // this used to determine what nodes should we ping about range data
    -                    Multimap<Range<Token>, InetAddressAndPort> rangeAddresses = strategy.getRangeAddresses(tokenMetaClone);
    +                    ReplicaMultimap<Range<Token>, ReplicaSet> rangeAddresses = strategy.getRangeAddresses(tokenMetaClone);
     
                         // calculated parts of the ranges to request/stream from/to nodes in the ring
    -                    Pair<Set<Range<Token>>, Set<Range<Token>>> rangesPerKeyspace = calculateStreamAndFetchRanges(currentRanges, updatedRanges);
    +                    Pair<Set<Range<Token>>, Set<Range<Token>>> rangesPerKeyspace = calculateStreamAndFetchRanges(currentReplicas, updatedReplicas);
     
                         /**
                          * In this loop we are going through all ranges "to fetch" and determining
                          * nodes in the ring responsible for data we are interested in
                          */
    -                    Multimap<Range<Token>, InetAddressAndPort> rangesToFetchWithPreferredEndpoints = ArrayListMultimap.create();
    +                    ReplicaMultimap<Range<Token>, ReplicaList> rangesToFetchWithPreferredEndpoints = ReplicaMultimap.list();
                         for (Range<Token> toFetch : rangesPerKeyspace.right)
                         {
                             for (Range<Token> range : rangeAddresses.keySet())
                             {
                                 if (range.contains(toFetch))
                                 {
    -                                List<InetAddressAndPort> endpoints = null;
    +                                ReplicaList endpoints = null;
     
                                     if (useStrictConsistency)
                                     {
    -                                    Set<InetAddressAndPort> oldEndpoints = Sets.newHashSet(rangeAddresses.get(range));
    -                                    Set<InetAddressAndPort> newEndpoints = Sets.newHashSet(strategy.calculateNaturalEndpoints(toFetch.right, tokenMetaCloneAllSettled));
    +                                    ReplicaSet oldEndpoints = new ReplicaSet(rangeAddresses.get(range));
    --- End diff --
    
    It might be better to filter `oldEndpoints` rather than remove from it (especially since we already copy).
    
    An additional benefit: if we ever decide to switch to an immutable interface, we'll have to do this anyway.
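
A hedged sketch of the two approaches, using `Set<String>` as a stand-in because the API of `ReplicaSet` is not shown here. `FilterSketch`, `copyThenRemove`, and `filter` are hypothetical names for illustration only.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

final class FilterSketch
{
    // Mutating approach: copy the old endpoints, then remove in place.
    static Set<String> copyThenRemove(Set<String> oldEndpoints, Set<String> newEndpoints)
    {
        Set<String> result = new LinkedHashSet<>(oldEndpoints);
        result.removeAll(newEndpoints);
        return result;
    }

    // Filtering approach: build the result in one pass and leave both inputs
    // untouched, so it also works when the inputs are immutable.
    static Set<String> filter(Set<String> oldEndpoints, Set<String> newEndpoints)
    {
        return oldEndpoints.stream()
                           .filter(endpoint -> !newEndpoints.contains(endpoint))
                           .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args)
    {
        Set<String> oldEndpoints = new LinkedHashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> newEndpoints = new LinkedHashSet<>(Arrays.asList("b", "c", "d"));

        System.out.println(copyThenRemove(oldEndpoints, newEndpoints)); // prints [a]
        System.out.println(filter(oldEndpoints, newEndpoints));         // prints [a]
    }
}
```

Both methods return the same elements; the filtering version simply never needs a mutable intermediate copy of the input.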




[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189091647
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        Replica replica = (Replica) o;
    +        return full == replica.full &&
    +               Objects.equals(endpoint, replica.endpoint) &&
    +               Objects.equals(range, replica.range);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(endpoint, range, full);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        StringBuilder sb = new StringBuilder();
    +        sb.append(full ? "Full" : "Transient");
    +        sb.append('(').append(getEndpoint()).append(',').append(range).append(')');
    +        return sb.toString();
    +    }
    +
    +    public final InetAddressAndPort getEndpoint()
    +    {
    +        return endpoint;
    +    }
    +
    +    public Range<Token> getRange()
    +    {
    +        return range;
    +    }
    +
    +    public boolean isFull()
    +    {
    +        return full;
    +    }
    +
    +    public final boolean isTransient()
    +    {
    +        return !isFull();
    +    }
    +
    +    public ReplicaSet subtract(Replica that)
    +    {
    +        assert isFull() && that.isFull();  // FIXME: this
    +        Set<Range<Token>> ranges = range.subtract(that.range);
    +        ReplicaSet replicatedRanges = new ReplicaSet(ranges.size());
    +        for (Range<Token> range: ranges)
    +        {
    +            replicatedRanges.add(new Replica(getEndpoint(), range, isFull()));
    +        }
    +        return replicatedRanges;
    +    }
    +
    +    /**
    +     * Subtract the ranges of the given replicas from the range of this replica,
    +     * returning a set of replicas with the endpoint and transient information of
    +     * this replica, and the ranges resulting from the subtraction.
    +     */
    +    public ReplicaSet subtractByRange(Replicas toSubtract)
    +    {
    +        if (isFull() && Iterables.all(toSubtract, Replica::isFull))
    +        {
    +            Set<Range<Token>> subtractedRanges = getRange().subtractAll(toSubtract.asRangeSet());
    +            ReplicaSet replicaSet = new ReplicaSet(subtractedRanges.size());
    +            for (Range<Token> range: subtractedRanges)
    +            {
    +                replicaSet.add(new Replica(getEndpoint(), range, isFull()));
    +            }
    +            return replicaSet;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        List<Range<Token>> normalized = Range.normalize(Collections.singleton(getRange()));
    +        ReplicaList replicas = new ReplicaList(normalized.size());
    +        for (Range<Token> normalizedRange: normalized)
    +        {
    +            replicas.add(new Replica(getEndpoint(), normalizedRange, isFull()));
    +        }
    +        return replicas;
    +    }
    +
    +    public boolean contains(Range<Token> that)
    +    {
    +        return getRange().contains(that);
    +    }
    +
    +    public boolean intersectsOnRange(Replica replica)
    +    {
    +        return getRange().intersects(replica.getRange());
    +    }
    +
    +    public Replica decorateSubrange(Range<Token> subrange)
    +    {
    +        Preconditions.checkArgument(range.contains(subrange));
    +        return new Replica(getEndpoint(), subrange, isFull());
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, true);
    +    }
    +
    +    /**
    +     * We need to assume an endpoint is a full replica in a with unknown ranges in a
    +     * few cases, so this returns one that throw an exception if you try to get it's range
    +     */
    +    public static Replica fullStandin(InetAddressAndPort endpoint)
    +    {
    +        return new Replica(endpoint, null, true) {
    +            @Override
    +            public Range<Token> getRange()
    +            {
    +                throw new UnsupportedOperationException("Can't get range on standin replicas");
    +            }
    +        };
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Token start, Token end)
    +    {
    +        return full(endpoint, new Range<>(start, end));
    +    }
    +
    +    public static Replica trans(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, false);
    +    }
    +
    +    public static Replica trans(InetAddressAndPort endpoint, Token start, Token end)
    +    {
    +        return trans(endpoint, new Range<>(start, end));
    +    }
    +
    +    public static void assureSufficientFullReplica(Collection<Replica> replicas, ConsistencyLevel cl) throws UnavailableException
    +    {
    +        if (!Iterables.any(replicas, Replica::isFull))
    +        {
    +            throw new UnavailableException(cl, "At least one full replica required", 1, 0);
    +        }
    +    }
    +
    +    public static Iterable<InetAddressAndPort> toEndpoints(Iterable<Replica> replicas)
    --- End diff --
    
    fixed




[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197124486
  
    --- Diff: src/java/org/apache/cassandra/cql3/statements/AlterKeyspaceStatement.java ---
    @@ -96,7 +98,35 @@ private void warnIfIncreasingRF(KeyspaceMetadata ksm, KeyspaceParams params)
                                                                                                             StorageService.instance.getTokenMetadata(),
                                                                                                             DatabaseDescriptor.getEndpointSnitch(),
                                                                                                             params.replication.options);
    -        if (newStrategy.getReplicationFactor() > oldStrategy.getReplicationFactor())
    +
    +        validateTransientReplication(oldStrategy, newStrategy);
    +        warnIfIncreasingRF(oldStrategy, newStrategy);
    +    }
    +
    +    private void validateTransientReplication(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (oldStrategy.getReplicationFactor().trans == 0 && newStrategy.getReplicationFactor().trans > 0)
    --- End diff --
    
    We can compare with `!=` (possibly only in the second case).




[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r194793501
  
    --- Diff: src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java ---
    @@ -269,24 +272,26 @@ public void executeAsync()
             {
                 // if CL + RR result in covering all replicas, getReadExecutor forces AlwaysSpeculating.  So we know
                 // that the last replica in our list is "extra."
    -            List<InetAddressAndPort> initialReplicas = targetReplicas.subList(0, targetReplicas.size() - 1);
    +            ReplicaList initialReplicas = targetReplicas.subList(0, targetReplicas.size() - 1);
    +
    +            Replicas.checkFull(initialReplicas);
     
                 if (handler.blockfor < initialReplicas.size())
                 {
                     // We're hitting additional targets for read repair.  Since our "extra" replica is the least-
                     // preferred by the snitch, we do an extra data read to start with against a replica more
                     // likely to reply; better to let RR fail than the entire query.
    -                makeDataRequests(initialReplicas.subList(0, 2));
    +                makeDataRequests(initialReplicas.subList(0, 2).asEndpoints());
                     if (initialReplicas.size() > 2)
    -                    makeDigestRequests(initialReplicas.subList(2, initialReplicas.size()));
    +                    makeDigestRequests(initialReplicas.subList(2, initialReplicas.size()).asEndpoints());
                 }
                 else
                 {
                     // not doing read repair; all replies are important, so it doesn't matter which nodes we
                     // perform data reads against vs digest.
    -                makeDataRequests(initialReplicas.subList(0, 1));
    +                makeDataRequests(initialReplicas.subList(0, 1).asEndpoints());
                     if (initialReplicas.size() > 1)
    -                    makeDigestRequests(initialReplicas.subList(1, initialReplicas.size()));
    +                    makeDigestRequests(initialReplicas.subList(1, initialReplicas.size()).asEndpoints());
    --- End diff --
    
    fixed




[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189091637
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        Replica replica = (Replica) o;
    +        return full == replica.full &&
    +               Objects.equals(endpoint, replica.endpoint) &&
    +               Objects.equals(range, replica.range);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(endpoint, range, full);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        StringBuilder sb = new StringBuilder();
    +        sb.append(full ? "Full" : "Transient");
    +        sb.append('(').append(getEndpoint()).append(',').append(range).append(')');
    +        return sb.toString();
    +    }
    +
    +    public final InetAddressAndPort getEndpoint()
    +    {
    +        return endpoint;
    +    }
    +
    +    public Range<Token> getRange()
    +    {
    +        return range;
    +    }
    +
    +    public boolean isFull()
    +    {
    +        return full;
    +    }
    +
    +    public final boolean isTransient()
    +    {
    +        return !isFull();
    +    }
    +
    +    public ReplicaSet subtract(Replica that)
    +    {
    +        assert isFull() && that.isFull();  // FIXME: this
    +        Set<Range<Token>> ranges = range.subtract(that.range);
    +        ReplicaSet replicatedRanges = new ReplicaSet(ranges.size());
    +        for (Range<Token> range: ranges)
    +        {
    +            replicatedRanges.add(new Replica(getEndpoint(), range, isFull()));
    +        }
    +        return replicatedRanges;
    +    }
    +
    +    /**
    +     * Subtract the ranges of the given replicas from the range of this replica,
    +     * returning a set of replicas with the endpoint and transient information of
    +     * this replica, and the ranges resulting from the subtraction.
    +     */
    +    public ReplicaSet subtractByRange(Replicas toSubtract)
    +    {
    +        if (isFull() && Iterables.all(toSubtract, Replica::isFull))
    +        {
    +            Set<Range<Token>> subtractedRanges = getRange().subtractAll(toSubtract.asRangeSet());
    +            ReplicaSet replicaSet = new ReplicaSet(subtractedRanges.size());
    +            for (Range<Token> range: subtractedRanges)
    +            {
    +                replicaSet.add(new Replica(getEndpoint(), range, isFull()));
    +            }
    +            return replicaSet;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        List<Range<Token>> normalized = Range.normalize(Collections.singleton(getRange()));
    +        ReplicaList replicas = new ReplicaList(normalized.size());
    +        for (Range<Token> normalizedRange: normalized)
    +        {
    +            replicas.add(new Replica(getEndpoint(), normalizedRange, isFull()));
    +        }
    +        return replicas;
    +    }
    +
    +    public boolean contains(Range<Token> that)
    +    {
    +        return getRange().contains(that);
    +    }
    +
    +    public boolean intersectsOnRange(Replica replica)
    +    {
    +        return getRange().intersects(replica.getRange());
    +    }
    +
    +    public Replica decorateSubrange(Range<Token> subrange)
    +    {
    +        Preconditions.checkArgument(range.contains(subrange));
    +        return new Replica(getEndpoint(), subrange, isFull());
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, true);
    +    }
    +
    +    /**
    +     * We need to assume an endpoint is a full replica with unknown ranges in a
    +     * few cases, so this returns one that throws an exception if you try to get its range
    +     */
    +    public static Replica fullStandin(InetAddressAndPort endpoint)
    +    {
    +        return new Replica(endpoint, null, true) {
    +            @Override
    +            public Range<Token> getRange()
    +            {
    +                throw new UnsupportedOperationException("Can't get range on standin replicas");
    +            }
    +        };
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Token start, Token end)
    +    {
    +        return full(endpoint, new Range<>(start, end));
    +    }
    +
    +    public static Replica trans(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, false);
    +    }
    +
    +    public static Replica trans(InetAddressAndPort endpoint, Token start, Token end)
    +    {
    +        return trans(endpoint, new Range<>(start, end));
    +    }
    +
    +    public static void assureSufficientFullReplica(Collection<Replica> replicas, ConsistencyLevel cl) throws UnavailableException
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r194919778
  
    --- Diff: test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java ---
    @@ -54,19 +54,19 @@ public void testConcurrency() throws InterruptedException, IOException, Configur
                 DynamicEndpointSnitch dsnitch = new DynamicEndpointSnitch(ss, String.valueOf(ss.hashCode()));
                 InetAddressAndPort self = FBUtilities.getBroadcastAddressAndPort();
     
    -            List<InetAddressAndPort> hosts = new ArrayList<>();
    +            ReplicaList replicas = new ReplicaList();
                 // We want a big list of hosts so  sorting takes time, making it much more likely to reproduce the
                 // problem we're looking for.
                 for (int i = 0; i < 100; i++)
                     for (int j = 0; j < 256; j++)
    -                    hosts.add(InetAddressAndPort.getByAddress(new byte[]{ 127, 0, (byte)i, (byte)j}));
    +                    replicas.add(Replica.fullStandin(InetAddressAndPort.getByAddress(new byte[]{ 127, 0, (byte)i, (byte)j})));
    --- End diff --
    
    Fixed; the places that need it now deal with SystemReplicas.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188361596
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -0,0 +1,313 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Set;
    +
    +import com.google.common.base.Predicate;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.utils.FBUtilities;
    +
    +/**
    + * A collection like class for Replica objects. Since the Replica class contains inetaddress, range, and
    + * transient replication status, basic contains and remove methods can be ambiguous. Replicas forces you
    + * to be explicit about what you're checking the container for, or removing from it.
    + */
    +public abstract class Replicas implements Iterable<Replica>
    +{
    +
    +    public abstract boolean add(Replica replica);
    +    public abstract void addAll(Iterable<Replica> replicas);
    +    public abstract void removeEndpoint(InetAddressAndPort endpoint);
    +    public abstract void removeReplica(Replica replica);
    +    public abstract int size();
    +
    +    public Iterable<InetAddressAndPort> asEndpoints()
    +    {
    +        return Iterables.transform(this, Replica::getEndpoint);
    +    }
    +
    +    public Set<InetAddressAndPort> asEndpointSet()
    +    {
    +        Set<InetAddressAndPort> result = Sets.newHashSetWithExpectedSize(size());
    +        for (Replica replica: this)
    +        {
    +            result.add(replica.getEndpoint());
    +        }
    +        return result;
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> result = new ArrayList<>(size());
    +        for (Replica replica: this)
    +        {
    +            result.add(replica.getEndpoint());
    +        }
    +        return result;
    +    }
    +
    +    public Iterable<Range<Token>> asRanges()
    +    {
    +        return Iterables.transform(this, Replica::getRange);
    +    }
    +
    +    public Set<Range<Token>> asRangeSet()
    +    {
    +        Set<Range<Token>> result = Sets.newHashSetWithExpectedSize(size());
    +        for (Replica replica: this)
    +        {
    +            result.add(replica.getRange());
    +        }
    +        return result;
    +    }
    +
    +    public Iterable<Range<Token>> fullRanges()
    +    {
    +        return Iterables.transform(Iterables.filter(this, Replica::isFull), Replica::getRange);
    +    }
    +
    +    public boolean containsEndpoint(InetAddressAndPort endpoint)
    +    {
    +        return Iterables.any(this, r -> r.getEndpoint().equals(endpoint));
    --- End diff --
    
    This allocates a lambda. Also ReplicaList can implement this without allocating an iterator.
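
A minimal sketch of the allocation-free alternative suggested here, assuming a list-backed container. `SimpleReplica` and `SimpleReplicaList` are hypothetical, simplified stand-ins for the PR's `Replica` and `ReplicaList`, and endpoints are plain strings rather than `InetAddressAndPort`; none of this is the PR's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Replica: only the fields needed for the lookup.
final class SimpleReplica
{
    final String endpoint;
    final boolean full;

    SimpleReplica(String endpoint, boolean full)
    {
        this.endpoint = endpoint;
        this.full = full;
    }
}

// Hypothetical stand-in for ReplicaList, backed by an ArrayList.
final class SimpleReplicaList
{
    private final List<SimpleReplica> replicas = new ArrayList<>();

    void add(SimpleReplica replica)
    {
        replicas.add(replica);
    }

    // Indexed loop: no Iterator and no capturing lambda is created per call.
    boolean containsEndpoint(String endpoint)
    {
        for (int i = 0, size = replicas.size(); i < size; i++)
        {
            if (replicas.get(i).endpoint.equals(endpoint))
                return true;
        }
        return false;
    }
}
```

The indexed loop only pays off for random-access backing stores such as `ArrayList`; a set-backed container would still need an iterator.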


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188786457
  
    --- Diff: src/java/org/apache/cassandra/cql3/statements/AlterKeyspaceStatement.java ---
    @@ -96,7 +98,35 @@ private void warnIfIncreasingRF(KeyspaceMetadata ksm, KeyspaceParams params)
                                                                                                             StorageService.instance.getTokenMetadata(),
                                                                                                             DatabaseDescriptor.getEndpointSnitch(),
                                                                                                             params.replication.options);
    -        if (newStrategy.getReplicationFactor() > oldStrategy.getReplicationFactor())
    +
    +        validateTransientReplication(oldStrategy, newStrategy);
    +        warnIfIncreasingRF(oldStrategy, newStrategy);
    +    }
    +
    +    private void validateTransientReplication(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (oldStrategy.getReplicationFactor().trans == 0 && newStrategy.getReplicationFactor().trans > 0)
    +        {
    +            Keyspace ks = Keyspace.open(keyspace());
    +            for (ColumnFamilyStore cfs: ks.getColumnFamilyStores())
    +            {
    +                if (cfs.viewManager.hasViews())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using materialized views");
    +                }
    +
    +                if (cfs.indexManager.hasIndexes())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using secondary indexes");
    +                }
    +            }
    +
    +        }
    +    }
    +
    +    private void warnIfIncreasingRF(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (newStrategy.getReplicationFactor().full > oldStrategy.getReplicationFactor().full)
    --- End diff --
    
    Since replication factor changes don't really fall under any of the existing JIRAs, maybe we should disable all RF changes (except transient increases) as part of the initial refactor, and open a follow-up JIRA to enable the other RF changes after the read and write path changes are in. Then we'll be able to reason about them in isolation, and will be better equipped to do the needed testing and tool updates.
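
A rough sketch of what such a guard could look like, under one assumed reading of the proposal: the full replica count must stay the same and the transient count may only stay the same or grow. `RfCounts` and `RfChangeGuard` are hypothetical names; only the `full` and `trans` field names are taken from the diff above.

```java
// Hypothetical stand-in for the value returned by getReplicationFactor();
// only the `full` and `trans` field names come from the quoted diff.
final class RfCounts
{
    final int full;
    final int trans;

    RfCounts(int full, int trans)
    {
        this.full = full;
        this.trans = trans;
    }
}

final class RfChangeGuard
{
    // Assumed rule: the full count is unchanged and the transient count does not shrink.
    static boolean isAllowed(RfCounts oldRf, RfCounts newRf)
    {
        return newRf.full == oldRf.full && newRf.trans >= oldRf.trans;
    }
}
```

Whether a transient increase should also permit a change in the full count is not settled in this thread, so the rule above is only one possible interpretation.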


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189392952
  
    --- Diff: src/java/org/apache/cassandra/dht/RangeFetchMapCalculator.java ---
    @@ -347,15 +349,16 @@ else if (vertex.isRangeVertex())
         private boolean addEndpoints(MutableCapacityGraph<Vertex, Integer> capacityGraph, RangeVertex rangeVertex, boolean localDCCheck)
         {
             boolean sourceFound = false;
    -        for (InetAddressAndPort endpoint : rangesWithSources.get(rangeVertex.getRange()))
    +        Replicas.checkFull(rangesWithSources.get(rangeVertex.getRange()));
    +        for (Replica replica : rangesWithSources.get(rangeVertex.getRange()))
             {
    -            if (passFilters(endpoint, localDCCheck))
    +            if (passFilters(replica, localDCCheck))
                 {
                     sourceFound = true;
                     // if we pass filters, it means that we don't filter away localhost and we can count it as a source:
    -                if (endpoint.equals(FBUtilities.getBroadcastAddressAndPort()))
    +                if (replica.getEndpoint().equals(FBUtilities.getBroadcastAddressAndPort()))
    --- End diff --
    
    good idea, changed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188770805
  
    --- Diff: doc/source/architecture/dynamo.rst ---
    @@ -74,6 +74,26 @@ nodes in each rack, the data load on the smallest rack may be much higher.  Simi
     into a new rack, it will be considered a replica for the entire ring.  For this reason, many operators choose to
     configure all nodes on a single "rack".
     
    +.. _transient-replication:
    +
    +Transient Replication
    +~~~~~~~~~~~~~~~~~~~~~
    +
    +Transient replication allows you to configure a subset of replicas to only replicate data that hasn't been incrementally
    +repaired. This allows you to trade data redundancy for storage usage, and increased read and write throughput. For instance,
    +if you have a replication factor of 3, with 1 transient replica, 2 replicas will replicate all data for a given token
    +range, while the 3rd will only keep data that hasn't been incrementally repaired. Since you're reducing the copies kept
    +of data by the number of transient replicas, transient replication is best suited to multiple dc deployments.
    --- End diff --
    
    fixed
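
The arithmetic in the quoted documentation (a replication factor of 3 with 1 transient replica leaves 2 full replicas) can be sketched as below. `ReplicationFactorSketch` is a hypothetical class; the `full` and `trans` field names mirror those used in the AlterKeyspaceStatement diff, and the validation rule (at least one full replica) is an assumption, not something stated in this thread.

```java
// Hypothetical sketch: splits a total replication factor into full and transient counts.
final class ReplicationFactorSketch
{
    final int full;
    final int trans;

    ReplicationFactorSketch(int total, int trans)
    {
        // Assumption: transient replicas can never make up the whole replica set.
        if (trans < 0 || trans >= total)
            throw new IllegalArgumentException("transient count must be in [0, total)");
        this.trans = trans;
        this.full = total - trans;
    }
}
```

With `new ReplicationFactorSketch(3, 1)`, `full` is 2 and `trans` is 1, matching the example in the documentation text.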


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188447066
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -541,12 +536,12 @@ public void run()
             return callback;
         }
     
    -    private static boolean proposePaxos(Commit proposal, List<InetAddressAndPort> endpoints, int requiredParticipants, boolean timeoutIfPartial, ConsistencyLevel consistencyLevel, long queryStartNanoTime)
    +    private static boolean proposePaxos(Commit proposal, ReplicaList replicas, int requiredParticipants, boolean timeoutIfPartial, ConsistencyLevel consistencyLevel, long queryStartNanoTime)
         throws WriteTimeoutException
         {
    -        ProposeCallback callback = new ProposeCallback(endpoints.size(), requiredParticipants, !timeoutIfPartial, consistencyLevel, queryStartNanoTime);
    +        ProposeCallback callback = new ProposeCallback(replicas.size(), requiredParticipants, !timeoutIfPartial, consistencyLevel, queryStartNanoTime);
             MessageOut<Commit> message = new MessageOut<Commit>(MessagingService.Verb.PAXOS_PROPOSE, proposal, Commit.serializer);
    -        for (InetAddressAndPort target : endpoints)
    +        for (InetAddressAndPort target : replicas.asEndpoints())
    --- End diff --
    
    Another candidate for manually unwrapping.
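
One reading of "manually unwrapping" is to loop over the replicas directly and call `getEndpoint()` on each, instead of going through the lazily transformed view returned by `asEndpoints()`. The sketch below illustrates that reading only; `PaxosTarget` and `ManualUnwrapSketch` are hypothetical, endpoints are plain strings, and recording the endpoint in a list stands in for sending the propose message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Replica, exposing only getEndpoint().
final class PaxosTarget
{
    private final String endpoint;

    PaxosTarget(String endpoint)
    {
        this.endpoint = endpoint;
    }

    String getEndpoint()
    {
        return endpoint;
    }
}

final class ManualUnwrapSketch
{
    // Iterates the replicas by index and unwraps each endpoint by hand,
    // avoiding the intermediate Iterable and its iterator.
    static List<String> proposeTo(List<PaxosTarget> replicas)
    {
        List<String> contacted = new ArrayList<>(replicas.size());
        for (int i = 0, size = replicas.size(); i < size; i++)
        {
            String target = replicas.get(i).getEndpoint();
            contacted.add(target); // stand-in for sending the message to target
        }
        return contacted;
    }
}
```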


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188451194
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -1098,7 +1095,7 @@ private static void asyncRemoveFromBatchlog(Collection<InetAddressAndPort> endpo
                     logger.trace("Sending batchlog remove request {} to {}", uuid, target);
     
                 if (canDoLocalRequest(target))
    -                performLocally(Stage.MUTATION, () -> BatchlogManager.remove(uuid));
    +                performLocally(Stage.MUTATION, Replica.fullStandin(target), () -> BatchlogManager.remove(uuid));
    --- End diff --
    
    Full stand-in! Either a replica is necessary downstream or it isn't. I checked, and to me it sure looks necessary, because we use this to decide who to hint to, and the decision of whether or not to hint is based on the transient state.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188103293
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        Replica replica = (Replica) o;
    +        return full == replica.full &&
    +               Objects.equals(endpoint, replica.endpoint) &&
    +               Objects.equals(range, replica.range);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(endpoint, range, full);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        StringBuilder sb = new StringBuilder();
    +        sb.append(full ? "Full" : "Transient");
    +        sb.append('(').append(getEndpoint()).append(',').append(range).append(')');
    +        return sb.toString();
    +    }
    +
    +    public final InetAddressAndPort getEndpoint()
    +    {
    +        return endpoint;
    +    }
    +
    +    public Range<Token> getRange()
    +    {
    +        return range;
    +    }
    +
    +    public boolean isFull()
    +    {
    +        return full;
    +    }
    +
    +    public final boolean isTransient()
    +    {
    +        return !isFull();
    +    }
    +
    +    public ReplicaSet subtract(Replica that)
    +    {
    +        assert isFull() && that.isFull();  // FIXME: this
    +        Set<Range<Token>> ranges = range.subtract(that.range);
    +        ReplicaSet replicatedRanges = new ReplicaSet(ranges.size());
    +        for (Range<Token> range: ranges)
    +        {
    +            replicatedRanges.add(new Replica(getEndpoint(), range, isFull()));
    +        }
    +        return replicatedRanges;
    +    }
    +
    +    /**
    +     * Subtract the ranges of the given replicas from the range of this replica,
    +     * returning a set of replicas with the endpoint and transient information of
    +     * this replica, and the ranges resulting from the subtraction.
    +     */
    +    public ReplicaSet subtractByRange(Replicas toSubtract)
    +    {
    +        if (isFull() && Iterables.all(toSubtract, Replica::isFull))
    +        {
    +            Set<Range<Token>> subtractedRanges = getRange().subtractAll(toSubtract.asRangeSet());
    +            ReplicaSet replicaSet = new ReplicaSet(subtractedRanges.size());
    +            for (Range<Token> range: subtractedRanges)
    +            {
    +                replicaSet.add(new Replica(getEndpoint(), range, isFull()));
    +            }
    +            return replicaSet;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        List<Range<Token>> normalized = Range.normalize(Collections.singleton(getRange()));
    +        ReplicaList replicas = new ReplicaList(normalized.size());
    +        for (Range<Token> normalizedRange: normalized)
    +        {
    +            replicas.add(new Replica(getEndpoint(), normalizedRange, isFull()));
    +        }
    +        return replicas;
    +    }
    +
    +    public boolean contains(Range<Token> that)
    +    {
    +        return getRange().contains(that);
    +    }
    +
    +    public boolean intersectsOnRange(Replica replica)
    +    {
    +        return getRange().intersects(replica.getRange());
    +    }
    +
    +    public Replica decorateSubrange(Range<Token> subrange)
    +    {
    +        Preconditions.checkArgument(range.contains(subrange));
    +        return new Replica(getEndpoint(), subrange, isFull());
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, true);
    +    }
    +
    +    /**
    +     * We need to assume an endpoint is a full replica with unknown ranges in a
    +     * few cases, so this returns one that throws an exception if you try to get its range
    +     */
    +    public static Replica fullStandin(InetAddressAndPort endpoint)
    +    {
    +        return new Replica(endpoint, null, true) {
    +            @Override
    +            public Range<Token> getRange()
    +            {
    +                throw new UnsupportedOperationException("Can't get range on standin replicas");
    +            }
    +        };
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Token start, Token end)
    --- End diff --
    
    So generally people shouldn't be constructing replicas using full or transient, and I think maybe the javadoc should say that.
    
    You almost always want a real source of the transient status, not a fake one. Propagating fake transient status is going to get us into trouble.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188783933
  
    --- Diff: src/java/org/apache/cassandra/cql3/statements/AlterKeyspaceStatement.java ---
    @@ -96,7 +98,35 @@ private void warnIfIncreasingRF(KeyspaceMetadata ksm, KeyspaceParams params)
                                                                                                             StorageService.instance.getTokenMetadata(),
                                                                                                             DatabaseDescriptor.getEndpointSnitch(),
                                                                                                             params.replication.options);
    -        if (newStrategy.getReplicationFactor() > oldStrategy.getReplicationFactor())
    +
    +        validateTransientReplication(oldStrategy, newStrategy);
    +        warnIfIncreasingRF(oldStrategy, newStrategy);
    +    }
    +
    +    private void validateTransientReplication(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (oldStrategy.getReplicationFactor().trans == 0 && newStrategy.getReplicationFactor().trans > 0)
    +        {
    +            Keyspace ks = Keyspace.open(keyspace());
    +            for (ColumnFamilyStore cfs: ks.getColumnFamilyStores())
    +            {
    +                if (cfs.viewManager.hasViews())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using materialized views");
    +                }
    +
    +                if (cfs.indexManager.hasIndexes())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using secondary indexes");
    +                }
    +            }
    +
    +        }
    +    }
    --- End diff --
    
    Sorry I was reading your example wrong. I didn't realize the number of full replicas was staying the same.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189038708
  
    --- Diff: src/java/org/apache/cassandra/db/compaction/CompactionManager.java ---
    @@ -468,7 +465,7 @@ public AllSSTableOpStatus performCleanup(final ColumnFamilyStore cfStore, int jo
                 return AllSSTableOpStatus.ABORTED;
             }
             // if local ranges is empty, it means no data should remain
    -        final Collection<Range<Token>> ranges = StorageService.instance.getLocalRanges(keyspace.getName());
    +        final Collection<Range<Token>> ranges = StorageService.instance.getLocalReplicas(keyspace.getName()).asRangeSet();
    --- End diff --
    
    fixed


---



[GitHub] cassandra issue #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on the issue:

    https://github.com/apache/cassandra/pull/224
  
    Additional note: should we forbid the use of counters with transient replication?


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197132925
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -344,47 +343,43 @@ private static void recordCasContention(int contentions)
                 casWriteMetrics.contention.update(contentions);
         }
     
    -    private static Predicate<InetAddressAndPort> sameDCPredicateFor(final String dc)
    +    private static Predicate<Replica> sameDCPredicateFor(final String dc)
         {
             final IEndpointSnitch snitch = DatabaseDescriptor.getEndpointSnitch();
    -        return new Predicate<InetAddressAndPort>()
    -        {
    -            public boolean apply(InetAddressAndPort host)
    -            {
    -                return dc.equals(snitch.getDatacenter(host));
    -            }
    -        };
    +        return replica -> dc.equals(snitch.getDatacenter(replica));
         }
     
         private static PaxosParticipants getPaxosParticipants(TableMetadata metadata, DecoratedKey key, ConsistencyLevel consistencyForPaxos) throws UnavailableException
         {
             Token tk = key.getToken();
    -        List<InetAddressAndPort> naturalEndpoints = StorageService.instance.getNaturalEndpoints(metadata.keyspace, tk);
    -        Collection<InetAddressAndPort> pendingEndpoints = StorageService.instance.getTokenMetadata().pendingEndpointsFor(tk, metadata.keyspace);
    +        ReplicaList naturalReplicas = StorageService.instance.getNaturalReplicas(metadata.keyspace, tk);
    --- End diff --
    
    Looks like we can just concat natural (as list) and pending (as set), without additional copying in-between.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187434990
  
    --- Diff: src/java/org/apache/cassandra/db/ColumnFamilyStore.java ---
    @@ -1868,7 +1866,7 @@ public void compactionDiskSpaceCheck(boolean enable)
     
         public void cleanupCache()
         {
    -        Collection<Range<Token>> ranges = StorageService.instance.getLocalRanges(keyspace.getName());
    +        Collection<Range<Token>> ranges = StorageService.instance.getLocalReplicas(keyspace.getName()).asRangeSet();
    --- End diff --
    
    Same.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r196133429
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -50,6 +50,30 @@
         public abstract int size();
         protected abstract Collection<Replica> getUnmodifiableCollection();
     
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (!(o instanceof Replicas))
    +            return false;
    +
    +        Replicas that = (Replicas) o;
    +        if (this.size() != that.size())
    +            return false;
    +        return Iterables.elementsEqual(this, that);
    +    }
    +
    +
    +    public int hashCode()
    +    {
    +        int result = 17;
    --- End diff --
    
    That's how it's done in Effective Java, bottom of page 48.
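
For context, a minimal sketch of the seed-and-multiply hashCode recipe commonly cited from Effective Java: start from an arbitrary nonzero constant (17) and fold each element in with a multiplier of 31. The class `HashRecipeDemo` and its `String` elements are hypothetical stand-ins, not the `Replicas` class from the diff, whose full `hashCode` body is not shown here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an ordered collection type; not a Cassandra class.
final class HashRecipeDemo
{
    private final List<String> elements;

    HashRecipeDemo(String... elements)
    {
        this.elements = Arrays.asList(elements);
    }

    @Override
    public boolean equals(Object o)
    {
        if (this == o) return true;
        if (!(o instanceof HashRecipeDemo)) return false;
        return elements.equals(((HashRecipeDemo) o).elements);
    }

    @Override
    public int hashCode()
    {
        int result = 17;                         // arbitrary nonzero seed
        for (String e : elements)
            result = 31 * result + e.hashCode(); // fold in each element, in iteration order
        return result;
    }
}
```

Because the seed is nonzero, an empty collection hashes to 17 rather than 0, and equal element sequences always produce equal hash codes, which keeps `hashCode` consistent with `equals`.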


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197149100
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -1275,36 +1272,38 @@ private static WriteResponseHandlerWrapper wrapViewBatchResponseHandler(Mutation
          *
          * @throws OverloadedException if the hints cannot be written/enqueued
          */
    -    public static void sendToHintedEndpoints(final Mutation mutation,
    -                                             Iterable<InetAddressAndPort> targets,
    -                                             AbstractWriteResponseHandler<IMutation> responseHandler,
    -                                             String localDataCenter,
    -                                             Stage stage)
    +    public static void sendToHintedReplicas(final Mutation mutation,
    +                                            Iterable<Replica> targets,
    --- End diff --
    
    We can use `Replicas` here


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197163900
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -1325,10 +1324,10 @@ public static void sendToHintedEndpoints(final Mutation mutation,
                         }
                         else
                         {
    -                        Collection<InetAddressAndPort> messages = (dcGroups != null) ? dcGroups.get(dc) : null;
    +                        Replicas messages = (dcGroups != null) ? dcGroups.get(dc) : null;
    --- End diff --
    
    We can use `computeIfAbsent` here
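
As an illustration of the suggestion, here is a small sketch of grouping targets per datacenter with `Map.computeIfAbsent`, which replaces the usual get, null-check, create, put sequence. The class `DcGroupingDemo`, the `groupByDc` helper, and the `"dc:host"` string format are assumptions made up for this example; the real code groups `Replica` objects via the snitch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; targets are plain "dc:host" strings for illustration only.
final class DcGroupingDemo
{
    static Map<String, List<String>> groupByDc(List<String> targets)
    {
        Map<String, List<String>> dcGroups = new HashMap<>();
        for (String target : targets)
        {
            String dc = target.substring(0, target.indexOf(':'));
            // Creates the per-DC list only on first use, then appends to it.
            dcGroups.computeIfAbsent(dc, k -> new ArrayList<>(3)).add(target);
        }
        return dcGroups;
    }
}
```

The mapping function runs only when the key is missing, so no list is allocated for a datacenter that already has a group.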


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r194793477
  
    --- Diff: src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java ---
    @@ -269,24 +272,26 @@ public void executeAsync()
             {
                 // if CL + RR result in covering all replicas, getReadExecutor forces AlwaysSpeculating.  So we know
                 // that the last replica in our list is "extra."
    -            List<InetAddressAndPort> initialReplicas = targetReplicas.subList(0, targetReplicas.size() - 1);
    +            ReplicaList initialReplicas = targetReplicas.subList(0, targetReplicas.size() - 1);
    +
    +            Replicas.checkFull(initialReplicas);
     
                 if (handler.blockfor < initialReplicas.size())
                 {
                     // We're hitting additional targets for read repair.  Since our "extra" replica is the least-
                     // preferred by the snitch, we do an extra data read to start with against a replica more
                     // likely to reply; better to let RR fail than the entire query.
    -                makeDataRequests(initialReplicas.subList(0, 2));
    +                makeDataRequests(initialReplicas.subList(0, 2).asEndpoints());
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188446854
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -503,12 +498,12 @@ private static void sendCommit(Commit commit, Iterable<InetAddressAndPort> repli
                 MessagingService.instance().sendOneWay(message, target);
         }
     
    -    private static PrepareCallback preparePaxos(Commit toPrepare, List<InetAddressAndPort> endpoints, int requiredParticipants, ConsistencyLevel consistencyForPaxos, long queryStartNanoTime)
    +    private static PrepareCallback preparePaxos(Commit toPrepare, ReplicaList replicas, int requiredParticipants, ConsistencyLevel consistencyForPaxos, long queryStartNanoTime)
         throws WriteTimeoutException
         {
             PrepareCallback callback = new PrepareCallback(toPrepare.update.partitionKey(), toPrepare.update.metadata(), requiredParticipants, consistencyForPaxos, queryStartNanoTime);
             MessageOut<Commit> message = new MessageOut<Commit>(MessagingService.Verb.PAXOS_PREPARE, toPrepare, Commit.serializer);
    -        for (InetAddressAndPort target : endpoints)
    +        for (InetAddressAndPort target : replicas.asEndpoints())
    --- End diff --
    
    This is a candidate for manually unwrapping in the loop.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r195480580
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -50,6 +50,30 @@
         public abstract int size();
         protected abstract Collection<Replica> getUnmodifiableCollection();
     
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (!(o instanceof Replicas))
    +            return false;
    +
    +        Replicas that = (Replicas) o;
    +        if (this.size() != that.size())
    +            return false;
    +        return Iterables.elementsEqual(this, that);
    +    }
    +
    +
    +    public int hashCode()
    +    {
    +        int result = 17;
    --- End diff --
    
    Result starts as 17? I don't recall seeing that being done normally.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188452737
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -1526,38 +1529,37 @@ protected Verb verb()
          * is unclear we want to mix those latencies with read latencies, so this
          * may be a bit involved.
          */
    -    private static InetAddressAndPort findSuitableEndpoint(String keyspaceName, DecoratedKey key, String localDataCenter, ConsistencyLevel cl) throws UnavailableException
    +    private static Replica findSuitableReplica(String keyspaceName, DecoratedKey key, String localDataCenter, ConsistencyLevel cl) throws UnavailableException
         {
             Keyspace keyspace = Keyspace.open(keyspaceName);
             IEndpointSnitch snitch = DatabaseDescriptor.getEndpointSnitch();
    -        List<InetAddressAndPort> endpoints = new ArrayList<>();
    -        StorageService.instance.getLiveNaturalEndpoints(keyspace, key, endpoints);
    +        ReplicaList replicas = StorageService.instance.getLiveNaturalReplicas(keyspace, key);
     
             // CASSANDRA-13043: filter out those endpoints not accepting clients yet, maybe because still bootstrapping
    -        endpoints.removeIf(endpoint -> !StorageService.instance.isRpcReady(endpoint));
    +        replicas = replicas.filter(replica -> StorageService.instance.isRpcReady(replica.getEndpoint()));
    --- End diff --
    
    If isRpcReady were static, you could avoid allocating the lambda.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188695095
  
    --- Diff: src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java ---
    @@ -269,24 +272,26 @@ public void executeAsync()
             {
                 // if CL + RR result in covering all replicas, getReadExecutor forces AlwaysSpeculating.  So we know
                 // that the last replica in our list is "extra."
    -            List<InetAddressAndPort> initialReplicas = targetReplicas.subList(0, targetReplicas.size() - 1);
    +            ReplicaList initialReplicas = targetReplicas.subList(0, targetReplicas.size() - 1);
    +
    +            Replicas.checkFull(initialReplicas);
     
                 if (handler.blockfor < initialReplicas.size())
                 {
                     // We're hitting additional targets for read repair.  Since our "extra" replica is the least-
                     // preferred by the snitch, we do an extra data read to start with against a replica more
                     // likely to reply; better to let RR fail than the entire query.
    -                makeDataRequests(initialReplicas.subList(0, 2));
    +                makeDataRequests(initialReplicas.subList(0, 2).asEndpoints());
                     if (initialReplicas.size() > 2)
    -                    makeDigestRequests(initialReplicas.subList(2, initialReplicas.size()));
    +                    makeDigestRequests(initialReplicas.subList(2, initialReplicas.size()).asEndpoints());
                 }
                 else
                 {
                     // not doing read repair; all replies are important, so it doesn't matter which nodes we
                     // perform data reads against vs digest.
    -                makeDataRequests(initialReplicas.subList(0, 1));
    +                makeDataRequests(initialReplicas.subList(0, 1).asEndpoints());
                     if (initialReplicas.size() > 1)
    -                    makeDigestRequests(initialReplicas.subList(1, initialReplicas.size()));
    +                    makeDigestRequests(initialReplicas.subList(1, initialReplicas.size()).asEndpoints());
    --- End diff --
    
    Avoid intermediate sublist


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197124923
  
    --- Diff: src/java/org/apache/cassandra/db/ConsistencyLevel.java ---
    @@ -148,40 +150,45 @@ public boolean isLocal(InetAddressAndPort endpoint)
             return DatabaseDescriptor.getLocalDataCenter().equals(DatabaseDescriptor.getEndpointSnitch().getDatacenter(endpoint));
         }
     
    -    public int countLocalEndpoints(Iterable<InetAddressAndPort> liveEndpoints)
    +    public boolean isLocal(Replica replica)
    +    {
    +        return isLocal(replica.getEndpoint());
    +    }
    +
    +    public int countLocalEndpoints(Iterable<Replica> liveReplicas)
         {
             int count = 0;
    -        for (InetAddressAndPort endpoint : liveEndpoints)
    -            if (isLocal(endpoint))
    +        for (Replica replica : liveReplicas)
    +            if (isLocal(replica))
                     count++;
             return count;
         }
     
    -    private Map<String, Integer> countPerDCEndpoints(Keyspace keyspace, Iterable<InetAddressAndPort> liveEndpoints)
    +    private Map<String, Integer> countPerDCEndpoints(Keyspace keyspace, Iterable<Replica> liveReplicas)
    --- End diff --
    
    This can be just `Replicas`


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r194793488
  
    --- Diff: src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java ---
    @@ -344,17 +351,18 @@ public void maybeTryAdditionalReplicas()
                 // no-op
             }
     
    -        public List<InetAddressAndPort> getContactedReplicas()
    +        public ReplicaList getContactedReplicas()
             {
                 return targetReplicas;
             }
     
             @Override
             public void executeAsync()
             {
    -            makeDataRequests(targetReplicas.subList(0, targetReplicas.size() > 1 ? 2 : 1));
    +            Replicas.checkFull(targetReplicas);
    +            makeDataRequests(targetReplicas.subList(0, targetReplicas.size() > 1 ? 2 : 1).asEndpoints());
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188786670
  
    --- Diff: src/java/org/apache/cassandra/cql3/statements/AlterKeyspaceStatement.java ---
    @@ -96,7 +98,35 @@ private void warnIfIncreasingRF(KeyspaceMetadata ksm, KeyspaceParams params)
                                                                                                             StorageService.instance.getTokenMetadata(),
                                                                                                             DatabaseDescriptor.getEndpointSnitch(),
                                                                                                             params.replication.options);
    -        if (newStrategy.getReplicationFactor() > oldStrategy.getReplicationFactor())
    +
    +        validateTransientReplication(oldStrategy, newStrategy);
    +        warnIfIncreasingRF(oldStrategy, newStrategy);
    +    }
    +
    +    private void validateTransientReplication(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (oldStrategy.getReplicationFactor().trans == 0 && newStrategy.getReplicationFactor().trans > 0)
    +        {
    +            Keyspace ks = Keyspace.open(keyspace());
    +            for (ColumnFamilyStore cfs: ks.getColumnFamilyStores())
    +            {
    +                if (cfs.viewManager.hasViews())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using materialized views");
    +                }
    +
    +                if (cfs.indexManager.hasIndexes())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using secondary indexes");
    +                }
    +            }
    +
    +        }
    +    }
    +
    +    private void warnIfIncreasingRF(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (newStrategy.getReplicationFactor().full > oldStrategy.getReplicationFactor().full)
    --- End diff --
    
    Sounds good.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188352903
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -0,0 +1,313 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Set;
    +
    +import com.google.common.base.Predicate;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.utils.FBUtilities;
    +
    +/**
    + * A collection like class for Replica objects. Since the Replica class contains inetaddress, range, and
    + * transient replication status, basic contains and remove methods can be ambiguous. Replicas forces you
    + * to be explicit about what you're checking the container for, or removing from it.
    + */
    +public abstract class Replicas implements Iterable<Replica>
    +{
    +
    +    public abstract boolean add(Replica replica);
    +    public abstract void addAll(Iterable<Replica> replicas);
    +    public abstract void removeEndpoint(InetAddressAndPort endpoint);
    +    public abstract void removeReplica(Replica replica);
    +    public abstract int size();
    +
    +    public Iterable<InetAddressAndPort> asEndpoints()
    +    {
    +        return Iterables.transform(this, Replica::getEndpoint);
    +    }
    +
    +    public Set<InetAddressAndPort> asEndpointSet()
    --- End diff --
    
    I am thinking we should have two versions, asEndpointSet/asRangeSet and toEndpointSet/toRangeSet. The "to" version allocates and converts. The "as" version is a Collections2 wrapper.
    
    I think in a lot of places we will want neither because I saw several opportunities to manually unwrap.
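
The "as" versus "to" naming distinction can be sketched as follows: an "as" method returns a live, read-only view over the backing collection, while a "to" method allocates and converts eagerly. This sketch uses only `java.util.AbstractCollection` rather than the Guava `Collections2` wrapper mentioned above; the class `AsVersusToDemo` and the methods `asView` and `toSet` are hypothetical names, not part of the patch.

```java
import java.util.AbstractCollection;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration of view ("as") versus copy ("to") conversions.
final class AsVersusToDemo
{
    // "as": wraps the backing list; nothing is copied and later changes show through.
    static <A, B> Collection<B> asView(List<A> backing, Function<A, B> fn)
    {
        return new AbstractCollection<B>()
        {
            @Override
            public Iterator<B> iterator()
            {
                Iterator<A> it = backing.iterator();
                return new Iterator<B>()
                {
                    public boolean hasNext() { return it.hasNext(); }
                    public B next() { return fn.apply(it.next()); }
                };
            }

            @Override
            public int size() { return backing.size(); }
        };
    }

    // "to": allocates a new set and converts every element up front.
    static <A, B> Set<B> toSet(List<A> backing, Function<A, B> fn)
    {
        Set<B> result = new LinkedHashSet<>();
        for (A a : backing)
            result.add(fn.apply(a));
        return result;
    }
}
```

The view costs one small wrapper object but re-applies the function on every iteration, and `contains` on it is a linear scan; the copy pays for allocation once and is a snapshot that does not track later changes to the source.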


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187715019
  
    --- Diff: src/java/org/apache/cassandra/dht/RangeStreamer.java ---
    @@ -176,25 +179,28 @@ public void addSourceFilter(ISourceFilter filter)
          * Add ranges to be streamed for given keyspace.
          *
          * @param keyspaceName keyspace name
    -     * @param ranges ranges to be streamed
    +     * @param replicas ranges to be streamed
          */
    -    public void addRanges(String keyspaceName, Collection<Range<Token>> ranges)
    +    public void addRanges(String keyspaceName, Replicas replicas)
         {
             if(Keyspace.open(keyspaceName).getReplicationStrategy() instanceof LocalStrategy)
             {
                 logger.info("Not adding ranges for Local Strategy keyspace={}", keyspaceName);
                 return;
             }
     
    +        Replicas.checkFull(replicas);
    +
             boolean useStrictSource = useStrictSourcesForRanges(keyspaceName);
    -        Multimap<Range<Token>, InetAddressAndPort> rangesForKeyspace = useStrictSource
    -                ? getAllRangesWithStrictSourcesFor(keyspaceName, ranges) : getAllRangesWithSourcesFor(keyspaceName, ranges);
    +        ReplicaMultimap<Range<Token>, ReplicaList> rangesForKeyspace = useStrictSource
    +                                                                       ? getAllRangesWithStrictSourcesFor(keyspaceName, replicas.fullRanges())
    --- End diff --
    
    fullRanges(), not a fan.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189109427
  
    --- Diff: src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java ---
    @@ -225,7 +223,7 @@ protected boolean waitingFor(InetAddressAndPort from)
     
         public void assureSufficientLiveNodes() throws UnavailableException
         {
    -        consistencyLevel.assureSufficientLiveNodes(keyspace, Iterables.filter(Iterables.concat(naturalEndpoints, pendingEndpoints), isAlive));
    +        consistencyLevel.assureSufficientLiveNodes(keyspace, Replicas.filter(Replicas.concatNaturalAndPending(naturalReplicas, pendingReplicas), isReplicaAlive));
    --- End diff --
    
    I don't think we're actually allocating any more objects than we were before. `filter` and `concat`/`concatNaturalAndPending` each allocate a single iterable/replicas wrapper, and don't copy anything.
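
To illustrate the point about wrappers, here is a minimal sketch of lazy `concat` and `filter` views built on plain `Iterable`: each call allocates one wrapper and defers all work to iteration, copying no elements. The class `LazyViewsDemo` and its methods are hypothetical and only mimic the shape of `Iterables.concat`/`Iterables.filter`; they are not the `Replicas` methods from the patch.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical lazy views; each method returns a wrapper and copies nothing.
final class LazyViewsDemo
{
    static <T> Iterable<T> concat(Iterable<T> first, Iterable<T> second)
    {
        return () -> new Iterator<T>()
        {
            private final Iterator<T> a = first.iterator();
            private final Iterator<T> b = second.iterator();
            public boolean hasNext() { return a.hasNext() || b.hasNext(); }
            public T next() { return a.hasNext() ? a.next() : b.next(); }
        };
    }

    static <T> Iterable<T> filter(Iterable<T> source, Predicate<T> keep)
    {
        return () -> new Iterator<T>()
        {
            private final Iterator<T> it = source.iterator();
            private T pending;
            private boolean hasPending;

            public boolean hasNext()
            {
                // Advance until an element passes the predicate or the source is exhausted.
                while (!hasPending && it.hasNext())
                {
                    T candidate = it.next();
                    if (keep.test(candidate))
                    {
                        pending = candidate;
                        hasPending = true;
                    }
                }
                return hasPending;
            }

            public T next()
            {
                if (!hasNext()) throw new NoSuchElementException();
                hasPending = false;
                return pending;
            }
        };
    }
}
```

Nesting the two, as in `filter(concat(natural, pending), isAlive)`, therefore costs two wrapper objects plus their iterators, independent of how many elements the underlying collections hold.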


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r194812259
  
    --- Diff: src/java/org/apache/cassandra/locator/PendingRangeMaps.java ---
    @@ -23,196 +23,176 @@
     import com.google.common.collect.Iterators;
     import org.apache.cassandra.dht.Range;
     import org.apache.cassandra.dht.Token;
    -import org.slf4j.Logger;
    -import org.slf4j.LoggerFactory;
     
     import java.util.*;
     
    -public class PendingRangeMaps implements Iterable<Map.Entry<Range<Token>, List<InetAddressAndPort>>>
    +public class PendingRangeMaps implements Iterable<Map.Entry<Range<Token>, List<Replica>>>
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197130296
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,270 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        this(new ArrayList<>());
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        this(new ArrayList<>(capacity));
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        this(new ArrayList<>(from.replicaList));
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        this(new ArrayList<>(from.size()));
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        this(new ArrayList<>(from));
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        Preconditions.checkNotNull(replica);
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    protected Collection<Replica> getUnmodifiableCollection()
    +    {
    +        return Collections.unmodifiableCollection(replicaList);
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        for (int i=replicaList.size()-1; i>=0; i--)
    +        {
    +            if (replicaList.get(i).getEndpoint().equals(endpoint))
    +            {
    +                replicaList.remove(i);
    +            }
    +        }
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    @Override
    +    public boolean containsEndpoint(InetAddressAndPort endpoint)
    +    {
    +        for (int i=0; i<size(); i++)
    +        {
    +            if (replicaList.get(i).getEndpoint().equals(endpoint))
    +                return true;
    +        }
    +        return false;
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = size() < 10 ? new ArrayList<>(size()) : new ArrayList<>();
    +        for (int i=0; i<size(); i++)
    +        {
    +            Replica replica = replicaList.get(i);
    +            if (predicate.test(replica))
    +            {
    +                newReplicaList.add(replica);
    +            }
    +        }
    +        return new ReplicaList(newReplicaList);
    +    }
    +
    +    public void sort(Comparator<Replica> comparator)
    +    {
    +        replicaList.sort(comparator);
    +    }
    +
    +    public static ReplicaList intersectEndpoints(ReplicaList l1, ReplicaList l2)
    +    {
    +        Replicas.checkFull(l1);
    +        Replicas.checkFull(l2);
    +        // Note: we don't use Guava Sets.intersection() for 3 reasons:
    +        //   1) retainAll would be inefficient if l1 and l2 are large but in practice both are the replicas for a range and
    +        //   so will be very small (< RF). In that case, retainAll is in fact more efficient.
    +        //   2) we do ultimately need a list so converting everything to sets don't make sense
    +        //   3) l1 and l2 are sorted by proximity. The use of retainAll  maintain that sorting in the result, while using sets wouldn't.
    +        Collection<InetAddressAndPort> endpoints = l2.asEndpointList();
    +        return l1.filter(r -> endpoints.contains(r.getEndpoint()));
    +    }
    +
    +    public static ReplicaList of()
    +    {
    +        return new ReplicaList(0);
    +    }
    +
    +    public static ReplicaList of(Replica replica)
    +    {
    +        ReplicaList replicaList = new ReplicaList(1);
    +        replicaList.add(replica);
    +        return replicaList;
    +    }
    +
    +    public static ReplicaList of(Replica... replicas)
    +    {
    +        ReplicaList replicaList = new ReplicaList(replicas.length);
    +        for (Replica replica: replicas)
    +        {
    +            replicaList.add(replica);
    +        }
    +        return replicaList;
    +    }
    +
    +    public ReplicaList subList(int fromIndex, int toIndex)
    +    {
    +        return new ReplicaList(replicaList.subList(fromIndex, toIndex));
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        if (Iterables.all(this, Replica::isFull))
    +        {
    +            ReplicaList normalized = new ReplicaList(size());
    +            for (Replica replica: this)
    +            {
    +                replica.addNormalizeByRange(normalized);
    +            }
    +
    +            return normalized;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public static ReplicaList immutableCopyOf(Replicas replicas)
    +    {
    +        return new ReplicaList(ImmutableList.<Replica>builder().addAll(replicas).build());
    +    }
    +
    +    public static ReplicaList immutableCopyOf(ReplicaList replicas)
    +    {
    +        return new ReplicaList(ImmutableList.copyOf(replicas.replicaList));
    +    }
    +
    +    public static ReplicaList immutableCopyOf(Replica... replicas)
    +    {
    +        return new ReplicaList(ImmutableList.copyOf(replicas));
    +    }
    +
    +    public static ReplicaList empty()
    +    {
    +        return new ReplicaList();
    +    }
    +
    +    public static ReplicaList fullStandIns(Collection<InetAddressAndPort> endpoints)
    --- End diff --
    
    `Collection` already extends `Iterable`, so we only need the looser (`Iterable`) overload, since the two bodies seem to be identical.
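
    A minimal sketch of what a single overload could look like, assuming we still want the presizing that the `Collection` version gets from `endpoints.size()`. `StandInSketch` and the `String` endpoints are hypothetical stand-ins for `ReplicaList` and `InetAddressAndPort`, not code from the PR:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for the two ReplicaList.fullStandIns overloads in the diff.
// String endpoints replace InetAddressAndPort so the sketch compiles on its own.
final class StandInSketch
{
    // Single Iterable overload: callers passing a Collection bind here as well.
    static List<String> fullStandIns(Iterable<String> endpoints)
    {
        // Presize only when the argument can report its size cheaply.
        List<String> result = endpoints instanceof Collection
                            ? new ArrayList<>(((Collection<?>) endpoints).size())
                            : new ArrayList<>();
        for (String endpoint : endpoints)
            result.add("Full(" + endpoint + ")"); // placeholder for Replica.fullStandin(endpoint)
        return result;
    }
}
```

    Whether the `instanceof` check is worth keeping depends on how often large collections are passed in; dropping it gives exactly the body of the existing `Iterable` overload.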


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188100355
  
    --- Diff: src/java/org/apache/cassandra/locator/PendingRangeMaps.java ---
    @@ -23,196 +23,176 @@
     import com.google.common.collect.Iterators;
     import org.apache.cassandra.dht.Range;
     import org.apache.cassandra.dht.Token;
    -import org.slf4j.Logger;
    -import org.slf4j.LoggerFactory;
     
     import java.util.*;
     
    -public class PendingRangeMaps implements Iterable<Map.Entry<Range<Token>, List<InetAddressAndPort>>>
    +public class PendingRangeMaps implements Iterable<Map.Entry<Range<Token>, List<Replica>>>
     {
    -    private static final Logger logger = LoggerFactory.getLogger(PendingRangeMaps.class);
    -
         /**
          * We have for NavigableMap to be able to search for ranges containing a token efficiently.
          *
          * First two are for non-wrap-around ranges, and the last two are for wrap-around ranges.
          */
         // ascendingMap will sort the ranges by the ascending order of right token
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> ascendingMap;
    +    private final NavigableMap<Range<Token>, List<Replica>> ascendingMap;
    +
         /**
          * sorting end ascending, if ends are same, sorting begin descending, so that token (end, end) will
          * come before (begin, end] with the same end, and (begin, end) will be selected in the tailMap.
          */
    -    static final Comparator<Range<Token>> ascendingComparator = new Comparator<Range<Token>>()
    -        {
    -            @Override
    -            public int compare(Range<Token> o1, Range<Token> o2)
    -            {
    -                int res = o1.right.compareTo(o2.right);
    -                if (res != 0)
    -                    return res;
    +    private static final Comparator<Range<Token>> ascendingComparator = (o1, o2) -> {
    +        int res = o1.right.compareTo(o2.right);
    +        if (res != 0)
    +            return res;
     
    -                return o2.left.compareTo(o1.left);
    -            }
    -        };
    +        return o2.left.compareTo(o1.left);
    +    };
     
         // ascendingMap will sort the ranges by the descending order of left token
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> descendingMap;
    +    private final NavigableMap<Range<Token>, List<Replica>> descendingMap;
    +
         /**
          * sorting begin descending, if begins are same, sorting end descending, so that token (begin, begin) will
          * come after (begin, end] with the same begin, and (begin, end) won't be selected in the tailMap.
          */
    -    static final Comparator<Range<Token>> descendingComparator = new Comparator<Range<Token>>()
    -        {
    -            @Override
    -            public int compare(Range<Token> o1, Range<Token> o2)
    -            {
    -                int res = o2.left.compareTo(o1.left);
    -                if (res != 0)
    -                    return res;
    +    private static final Comparator<Range<Token>> descendingComparator = (o1, o2) -> {
    +        int res = o2.left.compareTo(o1.left);
    +        if (res != 0)
    +            return res;
     
    -                // if left tokens are same, sort by the descending of the right tokens.
    -                return o2.right.compareTo(o1.right);
    -            }
    -        };
    +        // if left tokens are same, sort by the descending of the right tokens.
    +        return o2.right.compareTo(o1.right);
    +    };
     
         // these two maps are for warp around ranges.
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> ascendingMapForWrapAround;
    +    private final NavigableMap<Range<Token>, List<Replica>> ascendingMapForWrapAround;
    +
         /**
          * for wrap around range (begin, end], which begin > end.
          * Sorting end ascending, if ends are same, sorting begin ascending,
          * so that token (end, end) will come before (begin, end] with the same end, and (begin, end] will be selected in
          * the tailMap.
          */
    -    static final Comparator<Range<Token>> ascendingComparatorForWrapAround = new Comparator<Range<Token>>()
    -    {
    -        @Override
    -        public int compare(Range<Token> o1, Range<Token> o2)
    -        {
    -            int res = o1.right.compareTo(o2.right);
    -            if (res != 0)
    -                return res;
    +    private static final Comparator<Range<Token>> ascendingComparatorForWrapAround = (o1, o2) -> {
    +        int res = o1.right.compareTo(o2.right);
    +        if (res != 0)
    +            return res;
     
    -            return o1.left.compareTo(o2.left);
    -        }
    +        return o1.left.compareTo(o2.left);
         };
     
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> descendingMapForWrapAround;
    +    private final NavigableMap<Range<Token>, List<Replica>> descendingMapForWrapAround;
    +
         /**
          * for wrap around ranges, which begin > end.
          * Sorting end ascending, so that token (begin, begin) will come after (begin, end] with the same begin,
          * and (begin, end) won't be selected in the tailMap.
          */
    -    static final Comparator<Range<Token>> descendingComparatorForWrapAround = new Comparator<Range<Token>>()
    -    {
    -        @Override
    -        public int compare(Range<Token> o1, Range<Token> o2)
    -        {
    -            int res = o2.left.compareTo(o1.left);
    -            if (res != 0)
    -                return res;
    -            return o1.right.compareTo(o2.right);
    -        }
    +    private static final Comparator<Range<Token>> descendingComparatorForWrapAround = (o1, o2) -> {
    --- End diff --
    
    These are kind of orthogonal changes, but sure. I don't see a downside to doing them here. I'll bet these comparators are low traffic, so it won't be much merge pain.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r191572253
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,283 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        this(new ArrayList<>());
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        this(new ArrayList<>(capacity));
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        this(new ArrayList<>(from.replicaList));
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        this(new ArrayList<>(from.size()));
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        this(new ArrayList<>(from));
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +        return replicaList.hashCode();
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        Preconditions.checkNotNull(replica);
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    protected Collection<Replica> getUnmodifiableCollection()
    +    {
    +        return Collections.unmodifiableCollection(replicaList);
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        for (int i=replicaList.size()-1; i>=0; i--)
    +        {
    +            if (replicaList.get(i).getEndpoint().equals(endpoint))
    +            {
    +                replicaList.remove(i);
    +            }
    +        }
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    @Override
    +    public boolean containsEndpoint(InetAddressAndPort endpoint)
    +    {
    +        for (int i=0; i<size(); i++)
    +        {
    +            if (replicaList.get(i).getEndpoint().equals(endpoint))
    +                return true;
    +        }
    +        return false;
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = size() < 10 ? new ArrayList<>(size()) : new ArrayList<>();
    +        for (int i=0; i<size(); i++)
    +        {
    +            Replica replica = replicaList.get(i);
    +            if (predicate.test(replica))
    +            {
    +                newReplicaList.add(replica);
    +            }
    +        }
    +        return new ReplicaList(newReplicaList);
    +    }
    +
    +    public void sort(Comparator<Replica> comparator)
    +    {
    +        replicaList.sort(comparator);
    +    }
    +
    +    public static ReplicaList intersectEndpoints(ReplicaList l1, ReplicaList l2)
    +    {
    +        Replicas.checkFull(l1);
    +        Replicas.checkFull(l2);
    +        // Note: we don't use Guava Sets.intersection() for 3 reasons:
    +        //   1) retainAll would be inefficient if l1 and l2 are large but in practice both are the replicas for a range and
    +        //   so will be very small (< RF). In that case, retainAll is in fact more efficient.
    +        //   2) we do ultimately need a list so converting everything to sets don't make sense
    +        //   3) l1 and l2 are sorted by proximity. The use of retainAll  maintain that sorting in the result, while using sets wouldn't.
    +        Collection<InetAddressAndPort> endpoints = l2.asEndpointList();
    +        return l1.filter(r -> endpoints.contains(r.getEndpoint()));
    +    }
    +
    +    public static ReplicaList of()
    +    {
    +        return new ReplicaList(0);
    +    }
    +
    +    public static ReplicaList of(Replica replica)
    +    {
    +        ReplicaList replicaList = new ReplicaList(1);
    +        replicaList.add(replica);
    +        return replicaList;
    +    }
    +
    +    public static ReplicaList of(Replica... replicas)
    +    {
    +        ReplicaList replicaList = new ReplicaList(replicas.length);
    +        for (Replica replica: replicas)
    +        {
    +            replicaList.add(replica);
    +        }
    +        return replicaList;
    +    }
    +
    +    public ReplicaList subList(int fromIndex, int toIndex)
    +    {
    +        return new ReplicaList(replicaList.subList(fromIndex, toIndex));
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        if (Iterables.all(this, Replica::isFull))
    +        {
    +            ReplicaList normalized = new ReplicaList(size());
    +            for (Replica replica: this)
    +            {
    +                replica.addNormalizeByRange(normalized);
    +            }
    +
    +            return normalized;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public static ReplicaList immutableCopyOf(Replicas replicas)
    +    {
    +        return new ReplicaList(ImmutableList.<Replica>builder().addAll(replicas).build());
    +    }
    +
    +    public static ReplicaList immutableCopyOf(ReplicaList replicas)
    +    {
    +        return new ReplicaList(ImmutableList.copyOf(replicas.replicaList));
    +    }
    +
    +    public static ReplicaList immutableCopyOf(Replica... replicas)
    +    {
    +        return new ReplicaList(ImmutableList.copyOf(replicas));
    +    }
    +
    +    public static ReplicaList empty()
    +    {
    +        return new ReplicaList();
    +    }
    +
    +    public static ReplicaList fullStandIns(Collection<InetAddressAndPort> endpoints)
    +    {
    +        ReplicaList replicaList = new ReplicaList(endpoints.size());
    +        for (InetAddressAndPort endpoint: endpoints)
    +        {
    +            replicaList.add(Replica.fullStandin(endpoint));
    +        }
    +        return replicaList;
    +    }
    +
    +    public static ReplicaList fullStandIns(Iterable<InetAddressAndPort> endpoints)
    +    {
    +        ReplicaList replicaList = new ReplicaList();
    +        for (InetAddressAndPort endpoint: endpoints)
    +        {
    +            replicaList.add(Replica.fullStandin(endpoint));
    +        }
    +        return replicaList;
    +    }
    +
    +    /**
    +     * For allocating ReplicaLists where the final size is unknown, but
    +     * should be less than the given size. Prevents overallocations in cases
    +     * where there are less than the default ArrayList size, and defers to the
    +     * ArrayList algorithem where there might be more
    --- End diff --
    
    Typo: "algorithem" should be "algorithm".


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189117028
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -1364,68 +1363,72 @@ public static void sendToHintedEndpoints(final Mutation mutation,
                 submitHint(mutation, endpointsToHint, responseHandler);
     
             if (insertLocal)
    -            performLocally(stage, Optional.of(mutation), mutation::apply, responseHandler);
    +        {
    +            Preconditions.checkNotNull(localReplica);
    +            performLocally(stage, localReplica, Optional.of(mutation), mutation::apply, responseHandler);
    +        }
     
             if (localDc != null)
             {
    -            for (InetAddressAndPort destination : localDc)
    -                MessagingService.instance().sendRR(message, destination, responseHandler, true);
    +            for (Replica destination : localDc)
    +                MessagingService.instance().sendWriteRR(message, destination, responseHandler, true);
             }
             if (dcGroups != null)
             {
                 // for each datacenter, send the message to one node to relay the write to other replicas
    -            for (Collection<InetAddressAndPort> dcTargets : dcGroups.values())
    +            for (Replicas dcTargets : dcGroups.values())
                     sendMessagesToNonlocalDC(message, dcTargets, responseHandler);
             }
         }
     
    -    private static void checkHintOverload(InetAddressAndPort destination)
    +    private static void checkHintOverload(Replica destination)
         {
             // avoid OOMing due to excess hints.  we need to do this check even for "live" nodes, since we can
             // still generate hints for those if it's overloaded or simply dead but not yet known-to-be-dead.
             // The idea is that if we have over maxHintsInProgress hints in flight, this is probably due to
             // a small number of nodes causing problems, so we should avoid shutting down writes completely to
             // healthy nodes.  Any node with no hintsInProgress is considered healthy.
             if (StorageMetrics.totalHintsInProgress.getCount() > maxHintsInProgress
    -                && (getHintsInProgressFor(destination).get() > 0 && shouldHint(destination)))
    +                && (getHintsInProgressFor(destination.getEndpoint()).get() > 0 && shouldHint(destination)))
             {
                 throw new OverloadedException("Too many in flight hints: " + StorageMetrics.totalHintsInProgress.getCount() +
                                               " destination: " + destination +
    -                                          " destination hints: " + getHintsInProgressFor(destination).get());
    +                                          " destination hints: " + getHintsInProgressFor(destination.getEndpoint()).get());
             }
         }
     
         private static void sendMessagesToNonlocalDC(MessageOut<? extends IMutation> message,
    -                                                 Collection<InetAddressAndPort> targets,
    +                                                 Replicas targets,
                                                      AbstractWriteResponseHandler<IMutation> handler)
         {
    -        Iterator<InetAddressAndPort> iter = targets.iterator();
    +        Iterator<Replica> iter = targets.iterator();
             int[] messageIds = new int[targets.size()];
    -        InetAddressAndPort target = iter.next();
    +        Replica target = iter.next();
     
             int idIdx = 0;
             // Add the other destinations of the same message as a FORWARD_HEADER entry
             while (iter.hasNext())
             {
    -            InetAddressAndPort destination = iter.next();
    -            int id = MessagingService.instance().addCallback(handler,
    -                                                             message,
    -                                                             destination,
    -                                                             message.getTimeout(),
    -                                                             handler.consistencyLevel,
    -                                                             true);
    +            Replica destination = iter.next();
    +            int id = MessagingService.instance().addWriteCallback(handler,
    +                                                                  message,
    +                                                                  destination,
    +                                                                  message.getTimeout(),
    +                                                                  handler.consistencyLevel,
    +                                                                  true);
                 messageIds[idIdx++] = id;
                 logger.trace("Adding FWD message to {}@{}", id, destination);
             }
    -        message = message.withParameter(ParameterType.FORWARD_TO.FORWARD_TO, new ForwardToContainer(targets, messageIds));
    +        Replicas.checkFull(targets);
    +        message = message.withParameter(ParameterType.FORWARD_TO.FORWARD_TO, new ForwardToContainer(targets.asEndpointList(), messageIds));
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189119609
  
    --- Diff: src/java/org/apache/cassandra/service/reads/DataResolver.java ---
    @@ -64,12 +72,19 @@ public PartitionIterator resolve()
             // at the beginning of this method), so grab the response count once and use that through the method.
             int count = responses.size();
             List<UnfilteredPartitionIterator> iters = new ArrayList<>(count);
    -        InetAddressAndPort[] sources = new InetAddressAndPort[count];
    +        Replica[] sources = new Replica[count];
             for (int i = 0; i < count; i++)
             {
                 MessageIn<ReadResponse> msg = responses.get(i);
                 iters.add(msg.payload.makeIterator(command));
    -            sources[i] = msg.from;
    +
    +            Replica replica = replicaMap.get(msg.from);
    +            if (replica == null)
    --- End diff --
    
    My thinking here was that the speculative read repair patch (still in review) could speculate to a node right before a ring change, and then wouldn't be able to find it through the read command. Since the speculation and read responses will be on different threads, adding replicas after DataResolver instantiation would require synchronization of the map, which isn't really worth it for the vanishingly small chance of this race occurring. The consequence of using a fake full replica is just the potential sending of read repair mutations to a transient replica.
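
    Roughly, the lookup-with-fallback described above could be sketched like this. `ReplicaLookupSketch`, its nested `Replica`, and the `String` endpoints are hypothetical simplifications for illustration, not the classes in the patch:

```java
import java.util.HashMap;
import java.util.Map;

final class ReplicaLookupSketch
{
    // Hypothetical minimal Replica: an endpoint plus a full/transient flag.
    static final class Replica
    {
        final String endpoint;
        final boolean full;

        Replica(String endpoint, boolean full)
        {
            this.endpoint = endpoint;
            this.full = full;
        }
    }

    // Copied once at construction and never mutated afterwards, so the
    // response-handling threads can read it without synchronization.
    private final Map<String, Replica> replicaMap;

    ReplicaLookupSketch(Map<String, Replica> known)
    {
        this.replicaMap = new HashMap<>(known);
    }

    Replica sourceFor(String from)
    {
        Replica replica = replicaMap.get(from);
        // Unknown sender (e.g. speculated to just before a ring change):
        // fall back to a stand-in that is assumed to be a full replica.
        return replica != null ? replica : new Replica(from, true);
    }
}
```

    The trade-off is the one stated above: a transient replica that falls through to the stand-in path may be sent read repair mutations it would otherwise not receive.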


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188785532
  
    --- Diff: src/java/org/apache/cassandra/cql3/statements/AlterKeyspaceStatement.java ---
    @@ -96,7 +98,35 @@ private void warnIfIncreasingRF(KeyspaceMetadata ksm, KeyspaceParams params)
                                                                                                             StorageService.instance.getTokenMetadata(),
                                                                                                             DatabaseDescriptor.getEndpointSnitch(),
                                                                                                             params.replication.options);
    -        if (newStrategy.getReplicationFactor() > oldStrategy.getReplicationFactor())
    +
    +        validateTransientReplication(oldStrategy, newStrategy);
    +        warnIfIncreasingRF(oldStrategy, newStrategy);
    +    }
    +
    +    private void validateTransientReplication(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (oldStrategy.getReplicationFactor().trans == 0 && newStrategy.getReplicationFactor().trans > 0)
    +        {
    +            Keyspace ks = Keyspace.open(keyspace());
    +            for (ColumnFamilyStore cfs: ks.getColumnFamilyStores())
    +            {
    +                if (cfs.viewManager.hasViews())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using materialized views");
    +                }
    +
    +                if (cfs.indexManager.hasIndexes())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using secondary indexes");
    +                }
    +            }
    +
    +        }
    +    }
    --- End diff --
    
    If you are decreasing the transient count, the set of full replicas doesn't change, right? So you don't need to move data around? You just need to fetch what you are about to lose from the transient replicas before you lose another replica?


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187672097
  
    --- Diff: src/java/org/apache/cassandra/dht/RangeFetchMapCalculator.java ---
    @@ -347,15 +349,16 @@ else if (vertex.isRangeVertex())
         private boolean addEndpoints(MutableCapacityGraph<Vertex, Integer> capacityGraph, RangeVertex rangeVertex, boolean localDCCheck)
         {
             boolean sourceFound = false;
    -        for (InetAddressAndPort endpoint : rangesWithSources.get(rangeVertex.getRange()))
    +        Replicas.checkFull(rangesWithSources.get(rangeVertex.getRange()));
    +        for (Replica replica : rangesWithSources.get(rangeVertex.getRange()))
             {
    -            if (passFilters(endpoint, localDCCheck))
    +            if (passFilters(replica, localDCCheck))
                 {
                     sourceFound = true;
                     // if we pass filters, it means that we don't filter away localhost and we can count it as a source:
    -                if (endpoint.equals(FBUtilities.getBroadcastAddressAndPort()))
    +                if (replica.getEndpoint().equals(FBUtilities.getBroadcastAddressAndPort()))
    --- End diff --
    
    Crazy thought, what if Replica had an isLocal or something?
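
    A minimal sketch of what such an isLocal helper might look like. Everything here is hypothetical: the class name LocalAwareReplica is invented, endpoints are plain strings, and the local address is passed in explicitly instead of being read from FBUtilities.getBroadcastAddressAndPort() so that the example stays self-contained.

```java
import java.util.Objects;

// Hypothetical stand-in for Replica; not the class from the PR.
final class LocalAwareReplica
{
    private final String endpoint;
    private final String localEndpoint;

    LocalAwareReplica(String endpoint, String localEndpoint)
    {
        this.endpoint = Objects.requireNonNull(endpoint, "endpoint");
        this.localEndpoint = Objects.requireNonNull(localEndpoint, "localEndpoint");
    }

    String getEndpoint()
    {
        return endpoint;
    }

    // True when this replica lives on the node doing the asking.
    boolean isLocal()
    {
        return endpoint.equals(localEndpoint);
    }
}
```

    With a helper like this, the call site in addEndpoints could read replica.isLocal() instead of comparing endpoints by hand.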


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187429139
  
    --- Diff: src/java/org/apache/cassandra/cql3/statements/AlterKeyspaceStatement.java ---
    @@ -96,7 +98,35 @@ private void warnIfIncreasingRF(KeyspaceMetadata ksm, KeyspaceParams params)
                                                                                                             StorageService.instance.getTokenMetadata(),
                                                                                                             DatabaseDescriptor.getEndpointSnitch(),
                                                                                                             params.replication.options);
    -        if (newStrategy.getReplicationFactor() > oldStrategy.getReplicationFactor())
    +
    +        validateTransientReplication(oldStrategy, newStrategy);
    +        warnIfIncreasingRF(oldStrategy, newStrategy);
    +    }
    +
    +    private void validateTransientReplication(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (oldStrategy.getReplicationFactor().trans == 0 && newStrategy.getReplicationFactor().trans > 0)
    +        {
    +            Keyspace ks = Keyspace.open(keyspace());
    +            for (ColumnFamilyStore cfs: ks.getColumnFamilyStores())
    +            {
    +                if (cfs.viewManager.hasViews())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using materialized views");
    +                }
    +
    +                if (cfs.indexManager.hasIndexes())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using secondary indexes");
    +                }
    +            }
    +
    +        }
    +    }
    --- End diff --
    
    I think we can allow people to increase the number of transient replicas (they make it real with nodetool cleanup or repair), but how does decreasing the number of transient replicas work safely? I think we should disallow it until we can articulate how and why it is safe. There has to be a temporary pending state where a node is transitioning from transient to full, in which it receives writes but not reads, because it can't correctly service reads as a full replica.
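
    One way the proposed restriction could be expressed is sketched below. This is an assumption-laden illustration, not code from the PR: TransientRfCheck and its int parameters are invented, and it throws IllegalArgumentException where the real statement would presumably throw ConfigurationException.

```java
// Hypothetical validation helper; names and exception type are placeholders.
final class TransientRfCheck
{
    private TransientRfCheck() {}

    // Rejects any change that lowers the number of transient replicas.
    static void validate(int oldTransient, int newTransient)
    {
        if (oldTransient < 0 || newTransient < 0)
            throw new IllegalArgumentException("transient replica counts must be non-negative");

        if (newTransient < oldTransient)
            throw new IllegalArgumentException(
                "Cannot decrease transient replicas from " + oldTransient + " to " + newTransient);
    }
}
```

    Increases and no-op changes pass through; only a decrease is rejected, which matches the "disallow it until it is shown to be safe" position.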


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188118159
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    --- End diff --
    
    This allocates a lambda. Better to iterate and remove by index. removeIf also allocates an iterator. Also, you need to handle a null endpoint correctly. Do ReplicaList and Replicas allow null entries?
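
    A rough sketch of the index-based removal being suggested. ReplicaStub and ReplicaListSketch are hypothetical stand-ins (endpoints are plain strings), and rejecting nulls outright is only one possible answer to the null question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the PR's Replica.
final class ReplicaStub
{
    final String endpoint;
    final boolean full;

    ReplicaStub(String endpoint, boolean full)
    {
        this.endpoint = endpoint;
        this.full = full;
    }
}

// Hypothetical stand-in for ReplicaList, reduced to the parts under discussion.
final class ReplicaListSketch
{
    private final List<ReplicaStub> replicaList = new ArrayList<>();

    void add(ReplicaStub replica)
    {
        if (replica == null)
            throw new NullPointerException("null replicas are not allowed");
        replicaList.add(replica);
    }

    int size()
    {
        return replicaList.size();
    }

    // Walks backwards so a removal never shifts an index that is still to be visited.
    // No lambda or iterator object is created on this path.
    void removeEndpoint(String endpoint)
    {
        if (endpoint == null)
            throw new NullPointerException("endpoint must not be null");

        for (int i = replicaList.size() - 1; i >= 0; i--)
        {
            if (endpoint.equals(replicaList.get(i).endpoint))
                replicaList.remove(i);
        }
    }
}
```

    Each remove(int) on an ArrayList shifts the tail of the list, so this trades allocation for copying; for the short lists involved here that is probably acceptable.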


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189091639
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        Replica replica = (Replica) o;
    +        return full == replica.full &&
    +               Objects.equals(endpoint, replica.endpoint) &&
    +               Objects.equals(range, replica.range);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(endpoint, range, full);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        StringBuilder sb = new StringBuilder();
    +        sb.append(full ? "Full" : "Transient");
    +        sb.append('(').append(getEndpoint()).append(',').append(range).append(')');
    +        return sb.toString();
    +    }
    +
    +    public final InetAddressAndPort getEndpoint()
    +    {
    +        return endpoint;
    +    }
    +
    +    public Range<Token> getRange()
    +    {
    +        return range;
    +    }
    +
    +    public boolean isFull()
    +    {
    +        return full;
    +    }
    +
    +    public final boolean isTransient()
    +    {
    +        return !isFull();
    +    }
    +
    +    public ReplicaSet subtract(Replica that)
    +    {
    +        assert isFull() && that.isFull();  // FIXME: this
    +        Set<Range<Token>> ranges = range.subtract(that.range);
    +        ReplicaSet replicatedRanges = new ReplicaSet(ranges.size());
    +        for (Range<Token> range: ranges)
    +        {
    +            replicatedRanges.add(new Replica(getEndpoint(), range, isFull()));
    +        }
    +        return replicatedRanges;
    +    }
    +
    +    /**
    +     * Subtract the ranges of the given replicas from the range of this replica,
    +     * returning a set of replicas with the endpoint and transient information of
    +     * this replica, and the ranges resulting from the subtraction.
    +     */
    +    public ReplicaSet subtractByRange(Replicas toSubtract)
    +    {
    +        if (isFull() && Iterables.all(toSubtract, Replica::isFull))
    +        {
    +            Set<Range<Token>> subtractedRanges = getRange().subtractAll(toSubtract.asRangeSet());
    +            ReplicaSet replicaSet = new ReplicaSet(subtractedRanges.size());
    +            for (Range<Token> range: subtractedRanges)
    +            {
    +                replicaSet.add(new Replica(getEndpoint(), range, isFull()));
    +            }
    +            return replicaSet;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        List<Range<Token>> normalized = Range.normalize(Collections.singleton(getRange()));
    +        ReplicaList replicas = new ReplicaList(normalized.size());
    +        for (Range<Token> normalizedRange: normalized)
    +        {
    +            replicas.add(new Replica(getEndpoint(), normalizedRange, isFull()));
    +        }
    +        return replicas;
    +    }
    +
    +    public boolean contains(Range<Token> that)
    +    {
    +        return getRange().contains(that);
    +    }
    +
    +    public boolean intersectsOnRange(Replica replica)
    +    {
    +        return getRange().intersects(replica.getRange());
    +    }
    +
    +    public Replica decorateSubrange(Range<Token> subrange)
    +    {
    +        Preconditions.checkArgument(range.contains(subrange));
    +        return new Replica(getEndpoint(), subrange, isFull());
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, true);
    +    }
    +
    +    /**
    +     * We need to assume an endpoint is a full replica in a with unknown ranges in a
    +     * few cases, so this returns one that throw an exception if you try to get it's range
    +     */
    +    public static Replica fullStandin(InetAddressAndPort endpoint)
    +    {
    +        return new Replica(endpoint, null, true) {
    +            @Override
    +            public Range<Token> getRange()
    +            {
    +                throw new UnsupportedOperationException("Can't get range on standin replicas");
    +            }
    +        };
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Token start, Token end)
    +    {
    +        return full(endpoint, new Range<>(start, end));
    +    }
    +
    +    public static Replica trans(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, false);
    +    }
    +
    +    public static Replica trans(InetAddressAndPort endpoint, Token start, Token end)
    +    {
    +        return trans(endpoint, new Range<>(start, end));
    +    }
    +
    +    public static void assureSufficientFullReplica(Collection<Replica> replicas, ConsistencyLevel cl) throws UnavailableException
    +    {
    +        if (!Iterables.any(replicas, Replica::isFull))
    +        {
    +            throw new UnavailableException(cl, "At least one full replica required", 1, 0);
    +        }
    +    }
    +
    +    public static Iterable<InetAddressAndPort> toEndpoints(Iterable<Replica> replicas)
    +    {
    +        return Iterables.transform(replicas, Replica::getEndpoint);
    +    }
    +
    +    public static List<InetAddressAndPort> toEndpointList(List<Replica> replicas)
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187476449
  
    --- Diff: src/java/org/apache/cassandra/db/view/ViewBuilder.java ---
    @@ -135,14 +137,15 @@ private synchronized void build()
             }
     
             // Get the local ranges for which the view hasn't already been built nor it's building
    -        Set<Range<Token>> newRanges = StorageService.instance.getLocalRanges(ksName)
    -                                                             .stream()
    -                                                             .map(r -> r.subtractAll(builtRanges))
    -                                                             .flatMap(Set::stream)
    -                                                             .map(r -> r.subtractAll(pendingRanges.keySet()))
    -                                                             .flatMap(Set::stream)
    -                                                             .collect(Collectors.toSet());
    -
    +        ReplicaSet replicatedRanges = StorageService.instance.getLocalReplicas(ksName);
    +        Replicas.checkFull(StorageService.instance.getLocalReplicas(ksName));
    +        Set<Range<Token>> newRanges = replicatedRanges.asRangeSet()
    --- End diff --
    
    Also just iterating here. You don't even need Collections2; you can unwrap it in the first map.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197149200
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -1275,36 +1272,38 @@ private static WriteResponseHandlerWrapper wrapViewBatchResponseHandler(Mutation
          *
          * @throws OverloadedException if the hints cannot be written/enqueued
          */
    -    public static void sendToHintedEndpoints(final Mutation mutation,
    -                                             Iterable<InetAddressAndPort> targets,
    -                                             AbstractWriteResponseHandler<IMutation> responseHandler,
    -                                             String localDataCenter,
    -                                             Stage stage)
    +    public static void sendToHintedReplicas(final Mutation mutation,
    +                                            Iterable<Replica> targets,
    +                                            AbstractWriteResponseHandler<IMutation> responseHandler,
    +                                            String localDataCenter,
    +                                            Stage stage)
         throws OverloadedException
         {
             int targetsSize = Iterables.size(targets);
     
             // this dc replicas:
    -        Collection<InetAddressAndPort> localDc = null;
    +        Collection<Replica> localDc = null;
    --- End diff --
    
    We can use `Replicas` here 


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188440896
  
    --- Diff: src/java/org/apache/cassandra/net/MessagingService.java ---
    @@ -591,8 +592,9 @@ public void run()
     
                     if (expiredCallbackInfo.shouldHint())
                     {
    -                    Mutation mutation = ((WriteCallbackInfo) expiredCallbackInfo).mutation();
    -                    return StorageProxy.submitHint(mutation, expiredCallbackInfo.target, null);
    +                    WriteCallbackInfo writeCallbackInfo = ((WriteCallbackInfo) expiredCallbackInfo);
    +                    Mutation mutation = writeCallbackInfo.mutation();
    --- End diff --
    
    Kind of a rando change, were you looking at mutation in a debugger?


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188107268
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        Replica replica = (Replica) o;
    +        return full == replica.full &&
    +               Objects.equals(endpoint, replica.endpoint) &&
    +               Objects.equals(range, replica.range);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(endpoint, range, full);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        StringBuilder sb = new StringBuilder();
    +        sb.append(full ? "Full" : "Transient");
    +        sb.append('(').append(getEndpoint()).append(',').append(range).append(')');
    +        return sb.toString();
    +    }
    +
    +    public final InetAddressAndPort getEndpoint()
    +    {
    +        return endpoint;
    +    }
    +
    +    public Range<Token> getRange()
    +    {
    +        return range;
    +    }
    +
    +    public boolean isFull()
    +    {
    +        return full;
    +    }
    +
    +    public final boolean isTransient()
    +    {
    +        return !isFull();
    +    }
    +
    +    public ReplicaSet subtract(Replica that)
    +    {
    +        assert isFull() && that.isFull();  // FIXME: this
    +        Set<Range<Token>> ranges = range.subtract(that.range);
    +        ReplicaSet replicatedRanges = new ReplicaSet(ranges.size());
    +        for (Range<Token> range: ranges)
    +        {
    +            replicatedRanges.add(new Replica(getEndpoint(), range, isFull()));
    +        }
    +        return replicatedRanges;
    +    }
    +
    +    /**
    +     * Subtract the ranges of the given replicas from the range of this replica,
    +     * returning a set of replicas with the endpoint and transient information of
    +     * this replica, and the ranges resulting from the subtraction.
    +     */
    +    public ReplicaSet subtractByRange(Replicas toSubtract)
    +    {
    +        if (isFull() && Iterables.all(toSubtract, Replica::isFull))
    +        {
    +            Set<Range<Token>> subtractedRanges = getRange().subtractAll(toSubtract.asRangeSet());
    +            ReplicaSet replicaSet = new ReplicaSet(subtractedRanges.size());
    +            for (Range<Token> range: subtractedRanges)
    +            {
    +                replicaSet.add(new Replica(getEndpoint(), range, isFull()));
    +            }
    +            return replicaSet;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        List<Range<Token>> normalized = Range.normalize(Collections.singleton(getRange()));
    +        ReplicaList replicas = new ReplicaList(normalized.size());
    +        for (Range<Token> normalizedRange: normalized)
    +        {
    +            replicas.add(new Replica(getEndpoint(), normalizedRange, isFull()));
    +        }
    +        return replicas;
    +    }
    +
    +    public boolean contains(Range<Token> that)
    +    {
    +        return getRange().contains(that);
    +    }
    +
    +    public boolean intersectsOnRange(Replica replica)
    +    {
    +        return getRange().intersects(replica.getRange());
    +    }
    +
    +    public Replica decorateSubrange(Range<Token> subrange)
    +    {
    +        Preconditions.checkArgument(range.contains(subrange));
    +        return new Replica(getEndpoint(), subrange, isFull());
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, true);
    +    }
    +
    +    /**
    +     * We need to assume an endpoint is a full replica in a with unknown ranges in a
    +     * few cases, so this returns one that throw an exception if you try to get it's range
    +     */
    +    public static Replica fullStandin(InetAddressAndPort endpoint)
    +    {
    +        return new Replica(endpoint, null, true) {
    +            @Override
    +            public Range<Token> getRange()
    +            {
    +                throw new UnsupportedOperationException("Can't get range on standin replicas");
    +            }
    +        };
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Token start, Token end)
    +    {
    +        return full(endpoint, new Range<>(start, end));
    +    }
    +
    +    public static Replica trans(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, false);
    +    }
    +
    +    public static Replica trans(InetAddressAndPort endpoint, Token start, Token end)
    +    {
    +        return trans(endpoint, new Range<>(start, end));
    +    }
    +
    +    public static void assureSufficientFullReplica(Collection<Replica> replicas, ConsistencyLevel cl) throws UnavailableException
    +    {
    +        if (!Iterables.any(replicas, Replica::isFull))
    +        {
    +            throw new UnavailableException(cl, "At least one full replica required", 1, 0);
    +        }
    +    }
    +
    +    public static Iterable<InetAddressAndPort> toEndpoints(Iterable<Replica> replicas)
    +    {
    +        return Iterables.transform(replicas, Replica::getEndpoint);
    +    }
    +
    +    public static List<InetAddressAndPort> toEndpointList(List<Replica> replicas)
    --- End diff --
    
    Unused


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187475375
  
    --- Diff: src/java/org/apache/cassandra/db/compaction/CompactionManager.java ---
    @@ -533,7 +530,7 @@ public AllSSTableOpStatus relocateSSTables(final ColumnFamilyStore cfs, int jobs
                 logger.info("Partitioner does not support splitting");
                 return AllSSTableOpStatus.ABORTED;
             }
    -        final Collection<Range<Token>> r = StorageService.instance.getLocalRanges(cfs.keyspace.getName());
    +        final Collection<Range<Token>> r = StorageService.instance.getLocalReplicas(cfs.keyspace.getName()).asRangeSet();
    --- End diff --
    
    This is hardly even used. It just checks for isEmpty(). No need to unwrap.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187433683
  
    --- Diff: src/java/org/apache/cassandra/db/ColumnFamilyStore.java ---
    @@ -1591,7 +1589,7 @@ public long getExpectedCompactedFileSize(Iterable<SSTableReader> sstables, Opera
     
             // cleanup size estimation only counts bytes for keys local to this node
             long expectedFileSize = 0;
    -        Collection<Range<Token>> ranges = StorageService.instance.getLocalRanges(keyspace.getName());
    +        Collection<Range<Token>> ranges = StorageService.instance.getLocalReplicas(keyspace.getName()).asRangeSet();
    --- End diff --
    
    Since this is a collection, could you use Collections2.transform instead of materializing a set?
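
    Guava's Collections2.transform returns a live view of the source collection and does not copy it. The sketch below illustrates the same idea using only the JDK, so it can run without Guava; LazyViews is a hypothetical helper, not something from the PR or from Guava.

```java
import java.util.AbstractCollection;
import java.util.Collection;
import java.util.Iterator;
import java.util.function.Function;

// Hypothetical helper showing a read-only transforming view over a collection.
final class LazyViews
{
    private LazyViews() {}

    static <F, T> Collection<T> transform(Collection<F> source, Function<? super F, ? extends T> fn)
    {
        return new AbstractCollection<T>()
        {
            @Override
            public Iterator<T> iterator()
            {
                Iterator<F> it = source.iterator();
                return new Iterator<T>()
                {
                    @Override
                    public boolean hasNext()
                    {
                        return it.hasNext();
                    }

                    @Override
                    public T next()
                    {
                        // The function is applied on demand, one element at a time.
                        return fn.apply(it.next());
                    }
                };
            }

            @Override
            public int size()
            {
                // Delegates to the source, so nothing is materialized.
                return source.size();
            }
        };
    }
}
```

    Because the view reads through to the source, callers that only iterate or check isEmpty() pay nothing up front. Note that a view like this does not deduplicate the way a Set does.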


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188785202
  
    --- Diff: src/java/org/apache/cassandra/cql3/statements/AlterKeyspaceStatement.java ---
    @@ -96,7 +98,35 @@ private void warnIfIncreasingRF(KeyspaceMetadata ksm, KeyspaceParams params)
                                                                                                             StorageService.instance.getTokenMetadata(),
                                                                                                             DatabaseDescriptor.getEndpointSnitch(),
                                                                                                             params.replication.options);
    -        if (newStrategy.getReplicationFactor() > oldStrategy.getReplicationFactor())
    +
    +        validateTransientReplication(oldStrategy, newStrategy);
    +        warnIfIncreasingRF(oldStrategy, newStrategy);
    +    }
    +
    +    private void validateTransientReplication(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (oldStrategy.getReplicationFactor().trans == 0 && newStrategy.getReplicationFactor().trans > 0)
    +        {
    +            Keyspace ks = Keyspace.open(keyspace());
    +            for (ColumnFamilyStore cfs: ks.getColumnFamilyStores())
    +            {
    +                if (cfs.viewManager.hasViews())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using materialized views");
    +                }
    +
    +                if (cfs.indexManager.hasIndexes())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using secondary indexes");
    +                }
    +            }
    +
    +        }
    +    }
    --- End diff --
    
    > I don't think a full repair is necessary?
    
    If you don't, it won't get the repaired data it needs.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189394058
  
    --- Diff: src/java/org/apache/cassandra/locator/PendingRangeMaps.java ---
    @@ -23,196 +23,176 @@
     import com.google.common.collect.Iterators;
     import org.apache.cassandra.dht.Range;
     import org.apache.cassandra.dht.Token;
    -import org.slf4j.Logger;
    -import org.slf4j.LoggerFactory;
     
     import java.util.*;
     
    -public class PendingRangeMaps implements Iterable<Map.Entry<Range<Token>, List<InetAddressAndPort>>>
    +public class PendingRangeMaps implements Iterable<Map.Entry<Range<Token>, List<Replica>>>
     {
    -    private static final Logger logger = LoggerFactory.getLogger(PendingRangeMaps.class);
    -
         /**
          * We have for NavigableMap to be able to search for ranges containing a token efficiently.
          *
          * First two are for non-wrap-around ranges, and the last two are for wrap-around ranges.
          */
         // ascendingMap will sort the ranges by the ascending order of right token
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> ascendingMap;
    +    private final NavigableMap<Range<Token>, List<Replica>> ascendingMap;
    +
         /**
          * sorting end ascending, if ends are same, sorting begin descending, so that token (end, end) will
          * come before (begin, end] with the same end, and (begin, end) will be selected in the tailMap.
          */
    -    static final Comparator<Range<Token>> ascendingComparator = new Comparator<Range<Token>>()
    -        {
    -            @Override
    -            public int compare(Range<Token> o1, Range<Token> o2)
    -            {
    -                int res = o1.right.compareTo(o2.right);
    -                if (res != 0)
    -                    return res;
    +    private static final Comparator<Range<Token>> ascendingComparator = (o1, o2) -> {
    +        int res = o1.right.compareTo(o2.right);
    +        if (res != 0)
    +            return res;
     
    -                return o2.left.compareTo(o1.left);
    -            }
    -        };
    +        return o2.left.compareTo(o1.left);
    +    };
     
         // ascendingMap will sort the ranges by the descending order of left token
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> descendingMap;
    +    private final NavigableMap<Range<Token>, List<Replica>> descendingMap;
    +
         /**
          * sorting begin descending, if begins are same, sorting end descending, so that token (begin, begin) will
          * come after (begin, end] with the same begin, and (begin, end) won't be selected in the tailMap.
          */
    -    static final Comparator<Range<Token>> descendingComparator = new Comparator<Range<Token>>()
    -        {
    -            @Override
    -            public int compare(Range<Token> o1, Range<Token> o2)
    -            {
    -                int res = o2.left.compareTo(o1.left);
    -                if (res != 0)
    -                    return res;
    +    private static final Comparator<Range<Token>> descendingComparator = (o1, o2) -> {
    +        int res = o2.left.compareTo(o1.left);
    +        if (res != 0)
    +            return res;
     
    -                // if left tokens are same, sort by the descending of the right tokens.
    -                return o2.right.compareTo(o1.right);
    -            }
    -        };
    +        // if left tokens are same, sort by the descending of the right tokens.
    +        return o2.right.compareTo(o1.right);
    +    };
     
         // these two maps are for warp around ranges.
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> ascendingMapForWrapAround;
    +    private final NavigableMap<Range<Token>, List<Replica>> ascendingMapForWrapAround;
    +
         /**
          * for wrap around range (begin, end], which begin > end.
          * Sorting end ascending, if ends are same, sorting begin ascending,
          * so that token (end, end) will come before (begin, end] with the same end, and (begin, end] will be selected in
          * the tailMap.
          */
    -    static final Comparator<Range<Token>> ascendingComparatorForWrapAround = new Comparator<Range<Token>>()
    -    {
    -        @Override
    -        public int compare(Range<Token> o1, Range<Token> o2)
    -        {
    -            int res = o1.right.compareTo(o2.right);
    -            if (res != 0)
    -                return res;
    +    private static final Comparator<Range<Token>> ascendingComparatorForWrapAround = (o1, o2) -> {
    +        int res = o1.right.compareTo(o2.right);
    +        if (res != 0)
    +            return res;
     
    -            return o1.left.compareTo(o2.left);
    -        }
    +        return o1.left.compareTo(o2.left);
         };
     
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> descendingMapForWrapAround;
    +    private final NavigableMap<Range<Token>, List<Replica>> descendingMapForWrapAround;
    +
         /**
          * for wrap around ranges, which begin > end.
          * Sorting end ascending, so that token (begin, begin) will come after (begin, end] with the same begin,
          * and (begin, end) won't be selected in the tailMap.
          */
    -    static final Comparator<Range<Token>> descendingComparatorForWrapAround = new Comparator<Range<Token>>()
    -    {
    -        @Override
    -        public int compare(Range<Token> o1, Range<Token> o2)
    -        {
    -            int res = o2.left.compareTo(o1.left);
    -            if (res != 0)
    -                return res;
    -            return o1.right.compareTo(o2.right);
    -        }
    +    private static final Comparator<Range<Token>> descendingComparatorForWrapAround = (o1, o2) -> {
    +        int res = o2.left.compareTo(o1.left);
    +        if (res != 0)
    +            return res;
    +        return o1.right.compareTo(o2.right);
         };
     
         public PendingRangeMaps()
         {
    -        this.ascendingMap = new TreeMap<Range<Token>, List<InetAddressAndPort>>(ascendingComparator);
    -        this.descendingMap = new TreeMap<Range<Token>, List<InetAddressAndPort>>(descendingComparator);
    -        this.ascendingMapForWrapAround = new TreeMap<Range<Token>, List<InetAddressAndPort>>(ascendingComparatorForWrapAround);
    -        this.descendingMapForWrapAround = new TreeMap<Range<Token>, List<InetAddressAndPort>>(descendingComparatorForWrapAround);
    +        this.ascendingMap = new TreeMap<>(ascendingComparator);
    +        this.descendingMap = new TreeMap<>(descendingComparator);
    +        this.ascendingMapForWrapAround = new TreeMap<>(ascendingComparatorForWrapAround);
    +        this.descendingMapForWrapAround = new TreeMap<>(descendingComparatorForWrapAround);
         }
     
         static final void addToMap(Range<Token> range,
    -                               InetAddressAndPort address,
    -                               NavigableMap<Range<Token>, List<InetAddressAndPort>> ascendingMap,
    -                               NavigableMap<Range<Token>, List<InetAddressAndPort>> descendingMap)
    +                               Replica replica,
    +                               NavigableMap<Range<Token>, List<Replica>> ascendingMap,
    +                               NavigableMap<Range<Token>, List<Replica>> descendingMap)
         {
    -        List<InetAddressAndPort> addresses = ascendingMap.get(range);
    -        if (addresses == null)
    +        List<Replica> replicas = ascendingMap.get(range);
    +        if (replicas == null)
             {
    -            addresses = new ArrayList<>(1);
    -            ascendingMap.put(range, addresses);
    -            descendingMap.put(range, addresses);
    +            replicas = new ArrayList<>(1);
    +            ascendingMap.put(range, replicas);
    +            descendingMap.put(range, replicas);
             }
    -        addresses.add(address);
    +        replicas.add(replica);
         }
     
    -    public void addPendingRange(Range<Token> range, InetAddressAndPort address)
    +    public void addPendingRange(Range<Token> range, Replica replica)
         {
             if (Range.isWrapAround(range.left, range.right))
             {
    -            addToMap(range, address, ascendingMapForWrapAround, descendingMapForWrapAround);
    +            addToMap(range, replica, ascendingMapForWrapAround, descendingMapForWrapAround);
             }
             else
             {
    -            addToMap(range, address, ascendingMap, descendingMap);
    +            addToMap(range, replica, ascendingMap, descendingMap);
             }
         }
     
    -    static final void addIntersections(Set<InetAddressAndPort> endpointsToAdd,
    -                                       NavigableMap<Range<Token>, List<InetAddressAndPort>> smallerMap,
    -                                       NavigableMap<Range<Token>, List<InetAddressAndPort>> biggerMap)
    +    static final void addIntersections(ReplicaSet replicasToAdd,
    +                                       NavigableMap<Range<Token>, List<Replica>> smallerMap,
    +                                       NavigableMap<Range<Token>, List<Replica>> biggerMap)
         {
             // find the intersection of two sets
             for (Range<Token> range : smallerMap.keySet())
             {
    -            List<InetAddressAndPort> addresses = biggerMap.get(range);
    -            if (addresses != null)
    +            List<Replica> replicas = biggerMap.get(range);
    +            if (replicas != null)
                 {
    -                endpointsToAdd.addAll(addresses);
    +                replicasToAdd.addAll(replicas);
                 }
             }
         }
     
    -    public Collection<InetAddressAndPort> pendingEndpointsFor(Token token)
    +    public ReplicaSet pendingEndpointsFor(Token token)
         {
    -        Set<InetAddressAndPort> endpoints = new HashSet<>();
    +        ReplicaSet replicas = new ReplicaSet();
     
    -        Range searchRange = new Range(token, token);
    +        Range<Token> searchRange = new Range<>(token, token);
     
             // search for non-wrap-around maps
    -        NavigableMap<Range<Token>, List<InetAddressAndPort>> ascendingTailMap = ascendingMap.tailMap(searchRange, true);
    -        NavigableMap<Range<Token>, List<InetAddressAndPort>> descendingTailMap = descendingMap.tailMap(searchRange, false);
    +        NavigableMap<Range<Token>, List<Replica>> ascendingTailMap = ascendingMap.tailMap(searchRange, true);
    +        NavigableMap<Range<Token>, List<Replica>> descendingTailMap = descendingMap.tailMap(searchRange, false);
     
             // add intersections of two maps
             if (ascendingTailMap.size() < descendingTailMap.size())
             {
    -            addIntersections(endpoints, ascendingTailMap, descendingTailMap);
    +            addIntersections(replicas, ascendingTailMap, descendingTailMap);
             }
             else
             {
    -            addIntersections(endpoints, descendingTailMap, ascendingTailMap);
    +            addIntersections(replicas, descendingTailMap, ascendingTailMap);
             }
     
             // search for wrap-around sets
             ascendingTailMap = ascendingMapForWrapAround.tailMap(searchRange, true);
             descendingTailMap = descendingMapForWrapAround.tailMap(searchRange, false);
     
             // add them since they are all necessary.
    -        for (Map.Entry<Range<Token>, List<InetAddressAndPort>> entry : ascendingTailMap.entrySet())
    +        for (Map.Entry<Range<Token>, List<Replica>> entry : ascendingTailMap.entrySet())
             {
    -            endpoints.addAll(entry.getValue());
    +            replicas.addAll(entry.getValue());
             }
    -        for (Map.Entry<Range<Token>, List<InetAddressAndPort>> entry : descendingTailMap.entrySet())
    +        for (Map.Entry<Range<Token>, List<Replica>> entry : descendingTailMap.entrySet())
             {
    -            endpoints.addAll(entry.getValue());
    +            replicas.addAll(entry.getValue());
             }
     
    -        return endpoints;
    +        return replicas;
         }
     
         public String printPendingRanges()
         {
             StringBuilder sb = new StringBuilder();
     
    -        for (Map.Entry<Range<Token>, List<InetAddressAndPort>> entry : this)
    +        for (Map.Entry<Range<Token>, List<Replica>> entry : this)
             {
                 Range<Token> range = entry.getKey();
     
    -            for (InetAddressAndPort address : entry.getValue())
    +            for (Replica replica : entry.getValue())
                 {
    -                sb.append(address).append(':').append(range);
    +                sb.append(replica).append(':').append(range);
    --- End diff --
    
    Some, I'm sure. I'm not sure what we can really do about it, though: since we're adding the port and transient data, it kind of needs to be included here. At least it's in a major release?
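
The lookup in `pendingEndpointsFor` in the diff above relies on two differently sorted maps over the same ranges. A simplified sketch of that technique follows, covering only non-wrap-around ranges. The class name, the `long` tokens, and the `String` replicas are assumptions made for illustration; unlike the diff, it always iterates the first tail map rather than the smaller of the two.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

final class PendingRangeLookupSketch
{
    // Simplified (left, right] range over long tokens; non-wrap-around only (left < right).
    static final class R
    {
        final long left;
        final long right;

        R(long left, long right)
        {
            this.left = left;
            this.right = right;
        }
    }

    // Right ascending, then left descending (mirrors ascendingComparator in the diff).
    static final Comparator<R> BY_RIGHT_ASC = (a, b) -> {
        int res = Long.compare(a.right, b.right);
        return res != 0 ? res : Long.compare(b.left, a.left);
    };

    // Left descending, then right descending (mirrors descendingComparator in the diff).
    static final Comparator<R> BY_LEFT_DESC = (a, b) -> {
        int res = Long.compare(b.left, a.left);
        return res != 0 ? res : Long.compare(b.right, a.right);
    };

    private final NavigableMap<R, List<String>> ascending = new TreeMap<>(BY_RIGHT_ASC);
    private final NavigableMap<R, List<String>> descending = new TreeMap<>(BY_LEFT_DESC);

    void add(long left, long right, String replica)
    {
        R range = new R(left, right);
        List<String> replicas = ascending.get(range);
        if (replicas == null)
        {
            // Both maps share the same list, as addToMap does in the diff.
            replicas = new ArrayList<>(1);
            ascending.put(range, replicas);
            descending.put(range, replicas);
        }
        replicas.add(replica);
    }

    List<String> pendingFor(long token)
    {
        R search = new R(token, token);
        // Ranges whose right bound is >= token.
        NavigableMap<R, List<String>> rightAtOrAfter = ascending.tailMap(search, true);
        // Ranges whose left bound is < token.
        NavigableMap<R, List<String>> leftBefore = descending.tailMap(search, false);

        // A range contains the token exactly when it appears in both tail maps.
        List<String> result = new ArrayList<>();
        for (Map.Entry<R, List<String>> entry : rightAtOrAfter.entrySet())
        {
            if (leftBefore.containsKey(entry.getKey()))
                result.addAll(entry.getValue());
        }
        return result;
    }

    public static void main(String[] args)
    {
        PendingRangeLookupSketch maps = new PendingRangeLookupSketch();
        maps.add(0, 100, "a");
        maps.add(50, 150, "b");
        maps.add(100, 200, "c");
        System.out.println(maps.pendingFor(100));
    }
}
```

With the three ranges in `main`, token 100 falls inside `(0, 100]` and `(50, 150]` but not `(100, 200]`, because the left bound is exclusive.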


---



[GitHub] cassandra issue #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on the issue:

    https://github.com/apache/cassandra/pull/224
  
    Some test failures https://circleci.com/gh/aweisberg/cassandra/1223#tests/containers/60
    https://circleci.com/gh/aweisberg/cassandra/1225#tests/containers/46
    I suspect the dtest failure is caused by it failing to notice the node move and push the notification. I haven't run it multiple times yet to see whether it is flaky.


---



[GitHub] cassandra issue #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on the issue:

    https://github.com/apache/cassandra/pull/224
  
    Reminder: we wanted to disallow the creation of transiently replicated keyspaces in mixed-version clusters.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197132130
  
    --- Diff: src/java/org/apache/cassandra/locator/TokenMetadata.java ---
    @@ -1204,21 +1205,21 @@ private String printPendingRanges()
             return sb.toString();
         }
     
    -    public Collection<InetAddressAndPort> pendingEndpointsFor(Token token, String keyspaceName)
    +    public Replicas pendingEndpointsFor(Token token, String keyspaceName)
    --- End diff --
    
    If we return a more specific type but accept a less specific one, it might be simpler to establish a contract for List and Set.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197132625
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -78,7 +78,6 @@
     import org.apache.cassandra.service.paxos.ProposeVerbHandler;
     import org.apache.cassandra.net.MessagingService.Verb;
     import org.apache.cassandra.tracing.Tracing;
    -import org.apache.cassandra.transport.Server;
    --- End diff --
    
    The `CacheLoader` for `hintsInProgress` can map from `Replica` to `AtomicInteger`; this way we can get rid of many mentions of `InetAddressAndPort`.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r198213823
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -50,6 +50,30 @@
         public abstract int size();
         protected abstract Collection<Replica> getUnmodifiableCollection();
     
    +
    +    public boolean equals(Object o)
    --- End diff --
    
    Good point on set/list equality. Reverted.
    
    > We should never return a Replicas that isn't an instance of either ReplicaSet or ReplicaList
    
    Why not? The methods that return the immutable containers are basically just returning transformed views of their arguments. The fact that they return generic `ReplicaCollection` instances is a pretty clear indicator that you can't expect them to behave like lists or sets.
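
For background on the set/list equality point, the JDK collection contracts can be sketched as follows: `List.equals` only accepts other lists and `Set.equals` only accepts other sets, so a list and a set holding the same elements are never equal to each other. The class name and the `sameElements` helper are hypothetical, and the strings stand in for `Replica` objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CollectionEqualitySketch
{
    // Content-only comparison that ignores the container type (hypothetical helper).
    static boolean sameElements(Collection<?> a, Collection<?> b)
    {
        return a.size() == b.size() && a.containsAll(b) && b.containsAll(a);
    }

    public static void main(String[] args)
    {
        List<String> list = new ArrayList<>(Arrays.asList("r1", "r2"));
        Set<String> set = new HashSet<>(list);

        System.out.println(list.equals(set));        // false: List.equals requires a List
        System.out.println(set.equals(list));        // false: Set.equals requires a Set
        System.out.println(sameElements(list, set)); // true: same contents, different containers
    }
}
```

A content-based `equals` on a shared base class would therefore disagree with how the wrapped `List` and `Set` compare.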


---



[GitHub] cassandra issue #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on the issue:

    https://github.com/apache/cassandra/pull/224
  
    Sankalp suggested that we forbid altering transient replication during range movements and other operations that might conflict. Currently we allow RF changes without really checking whether it is safe to do so, and there may be no reason to continue doing that.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189117019
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = new ArrayList<>(size());
    +        for (Replica replica: replicaList)
    +        {
    +            if (predicate.test(replica))
    +            {
    +                newReplicaList.add(replica);
    +            }
    +        }
    +        return new ReplicaList(newReplicaList);
    +    }
    +
    +    public void sort(Comparator<Replica> comparator)
    +    {
    +        replicaList.sort(comparator);
    +    }
    +
    +    public static ReplicaList intersectEndpoints(ReplicaList l1, ReplicaList l2)
    +    {
    +        Replicas.checkFull(l1);
    +        Replicas.checkFull(l2);
    +        // Note: we don't use Guava Sets.intersection() for 3 reasons:
    +        //   1) retainAll would be inefficient if l1 and l2 are large but in practice both are the replicas for a range and
    +        //   so will be very small (< RF). In that case, retainAll is in fact more efficient.
    +        //   2) we do ultimately need a list so converting everything to sets don't make sense
    +        //   3) l1 and l2 are sorted by proximity. The use of retainAll  maintain that sorting in the result, while using sets wouldn't.
    +        Collection<InetAddressAndPort> endpoints = l2.asEndpointList();
    +        return l1.filter(r -> endpoints.contains(r.getEndpoint()));
    +    }
    +
    +    public static ReplicaList of(Replica... replicas)
    --- End diff --
    
    fixed
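
The note inside `intersectEndpoints` in the diff above explains why the first list is filtered rather than both lists being converted to sets: the result keeps the first list's proximity order. A minimal sketch of that behaviour follows; the class name and the `String` endpoints are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OrderedIntersectionSketch
{
    // Keeps the elements of l1 that also occur in l2, in l1's order.
    // The linear contains() scan is acceptable because both lists are expected to be tiny.
    static List<String> intersect(List<String> l1, List<String> l2)
    {
        List<String> result = new ArrayList<>(l1.size());
        for (String endpoint : l1)
        {
            if (l2.contains(endpoint))
                result.add(endpoint);
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<String> byProximity = Arrays.asList("c", "a", "b");
        List<String> other = Arrays.asList("b", "c");
        System.out.println(intersect(byProximity, other)); // [c, b]
    }
}
```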


---



[GitHub] cassandra issue #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on the issue:

    https://github.com/apache/cassandra/pull/224
  
    I'm not totally crazy. I added a ReplicaSet.of method and when I call it I get Replicas.of instead which is a little nuts.
    
    What I don't get is that I am explicitly invoking ReplicaSet, and it's a static method! This is some of the craziest language behavior I have ever seen.
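
One way Java can produce this behaviour is sketched below. The thread does not show the actual signatures, so treat this as an assumption rather than a diagnosis. Static methods are inherited, so a call qualified with the subclass name is resolved against the overloads of both classes, and the most specific applicable overload wins even when it is declared in the parent. `Base` and `Derived` are hypothetical stand-ins for `Replicas` and `ReplicaSet`.

```java
final class StaticOverloadSketch
{
    static class Base
    {
        static String of(String s)
        {
            return "Base.of(String)";
        }
    }

    static class Derived extends Base
    {
        // Different parameter type, so this overloads Base.of rather than hiding it.
        static String of(Object o)
        {
            return "Derived.of(Object)";
        }
    }

    static String callWithString()
    {
        // Both overloads are applicable; of(String) is more specific, so Base's method runs.
        return Derived.of("x");
    }

    static String callWithObject()
    {
        // Only of(Object) is applicable once the argument's static type is Object.
        return Derived.of((Object) "x");
    }

    public static void main(String[] args)
    {
        System.out.println(callWithString()); // prints Base.of(String)
        System.out.println(callWithObject()); // prints Derived.of(Object)
    }
}
```

Giving the subclass method the same parameter types as the parent's makes it hide the inherited one, which removes the ambiguity.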


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189117008
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -541,12 +536,12 @@ public void run()
             return callback;
         }
     
    -    private static boolean proposePaxos(Commit proposal, List<InetAddressAndPort> endpoints, int requiredParticipants, boolean timeoutIfPartial, ConsistencyLevel consistencyLevel, long queryStartNanoTime)
    +    private static boolean proposePaxos(Commit proposal, ReplicaList replicas, int requiredParticipants, boolean timeoutIfPartial, ConsistencyLevel consistencyLevel, long queryStartNanoTime)
         throws WriteTimeoutException
         {
    -        ProposeCallback callback = new ProposeCallback(endpoints.size(), requiredParticipants, !timeoutIfPartial, consistencyLevel, queryStartNanoTime);
    +        ProposeCallback callback = new ProposeCallback(replicas.size(), requiredParticipants, !timeoutIfPartial, consistencyLevel, queryStartNanoTime);
             MessageOut<Commit> message = new MessageOut<Commit>(MessagingService.Verb.PAXOS_PROPOSE, proposal, Commit.serializer);
    -        for (InetAddressAndPort target : endpoints)
    +        for (InetAddressAndPort target : replicas.asEndpoints())
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197128495
  
    --- Diff: src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java ---
    @@ -329,6 +335,10 @@ public static void validateReplicationStrategy(String keyspaceName,
             AbstractReplicationStrategy strategy = createInternal(keyspaceName, strategyClass, tokenMetadata, snitch, strategyOptions);
             strategy.validateExpectedOptions();
             strategy.validateOptions();
    +        if (strategy.getReplicationFactor().trans > 0 && !DatabaseDescriptor.isTransientReplicationEnabled())
    --- End diff --
    
    Check for version. Previously we had allNodesAtLeast22/30 for some internal handling, so we might have to implement a similar mechanism for that.
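
A rough sketch of what such a version gate could look like. Everything here is hypothetical: the class, the method names, the plain `major.minor[.patch]` version strings, and the 4.0 threshold are assumptions for illustration, not Cassandra's actual API.

```java
import java.util.Arrays;
import java.util.Collection;

final class ClusterVersionGateSketch
{
    // Parses "major.minor[.patch]" into {major, minor}; numeric components only.
    static int[] parse(String version)
    {
        String[] parts = version.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        return new int[]{ major, minor };
    }

    // True only if every known node reports at least the given version.
    static boolean allNodesAtLeast(Collection<String> nodeVersions, int major, int minor)
    {
        for (String version : nodeVersions)
        {
            int[] v = parse(version);
            if (v[0] < major || (v[0] == major && v[1] < minor))
                return false;
        }
        return true;
    }

    // Rejects transient replication while any node is below the assumed 4.0 threshold.
    static void validate(boolean usesTransientReplication, Collection<String> nodeVersions)
    {
        if (usesTransientReplication && !allNodesAtLeast(nodeVersions, 4, 0))
            throw new IllegalStateException("Transient replication requires all nodes to be upgraded");
    }

    public static void main(String[] args)
    {
        System.out.println(allNodesAtLeast(Arrays.asList("4.0.0", "3.11.2"), 4, 0));
        System.out.println(allNodesAtLeast(Arrays.asList("4.0.0", "4.1.0"), 4, 0));
    }
}
```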


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188349548
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaSet.java ---
    @@ -0,0 +1,142 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.HashSet;
    +import java.util.Iterator;
    +import java.util.LinkedHashSet;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.collect.ImmutableSet;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +public class ReplicaSet extends Replicas
    +{
    +    static final ReplicaSet EMPTY = new ReplicaSet(ImmutableSet.of());
    +
    +    private final Set<Replica> replicaSet;
    +
    +    public ReplicaSet()
    +    {
    +        replicaSet = new HashSet<>();
    +    }
    +
    +    public ReplicaSet(int expectedSize)
    +    {
    +        replicaSet = Sets.newHashSetWithExpectedSize(expectedSize);
    +    }
    +
    +    public ReplicaSet(Replicas replicas)
    +    {
    +        replicaSet = Sets.newHashSetWithExpectedSize(replicas.size());
    +        Iterables.addAll(replicaSet, replicas);
    +    }
    +
    +    private ReplicaSet(Set<Replica> replicaSet)
    +    {
    +        this.replicaSet = replicaSet;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaSet that = (ReplicaSet) o;
    +        return Objects.equals(replicaSet, that.replicaSet);
    +    }
    +
    +    public int hashCode()
    +    {
    +        return Objects.hash(replicaSet);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaSet.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaSet.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaSet, replicas);
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaSet.removeIf(r -> r.getEndpoint().equals(endpoint));
    --- End diff --
    
    Allocates a lambda.
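
    To illustrate the concern: the lambda captures `endpoint`, so a new predicate object is typically created on each call, unless the JIT manages to eliminate it. A minimal sketch of an allocation-free alternative follows, with plain strings standing in for replicas and endpoints (an assumption made to keep the example self-contained).

```java
import java.util.Iterator;
import java.util.Set;

final class RemoveEndpointSketch
{
    /** Capturing lambda: a new Predicate instance is typically created per call. */
    static void removeWithLambda(Set<String> endpoints, String endpoint)
    {
        endpoints.removeIf(e -> e.equals(endpoint));
    }

    /** Explicit iterator loop: no Predicate object is needed, and the result is the same. */
    static void removeWithIterator(Set<String> endpoints, String endpoint)
    {
        Iterator<String> it = endpoints.iterator();
        while (it.hasNext())
            if (it.next().equals(endpoint))
                it.remove();
    }
}
```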


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189117046
  
    --- Diff: src/java/org/apache/cassandra/service/WriteResponseHandler.java ---
    @@ -42,26 +44,26 @@
         private static final AtomicIntegerFieldUpdater<WriteResponseHandler> responsesUpdater
                 = AtomicIntegerFieldUpdater.newUpdater(WriteResponseHandler.class, "responses");
     
    -    public WriteResponseHandler(Collection<InetAddressAndPort> writeEndpoints,
    -                                Collection<InetAddressAndPort> pendingEndpoints,
    +    public WriteResponseHandler(Replicas writeReplicas,
    +                                Replicas pendingReplicas,
                                     ConsistencyLevel consistencyLevel,
                                     Keyspace keyspace,
                                     Runnable callback,
                                     WriteType writeType,
                                     long queryStartNanoTime)
         {
    -        super(keyspace, writeEndpoints, pendingEndpoints, consistencyLevel, callback, writeType, queryStartNanoTime);
    +        super(keyspace, writeReplicas, pendingReplicas, consistencyLevel, callback, writeType, queryStartNanoTime);
             responses = totalBlockFor();
         }
     
    -    public WriteResponseHandler(InetAddressAndPort endpoint, WriteType writeType, Runnable callback, long queryStartNanoTime)
    +    public WriteResponseHandler(Replica replica, WriteType writeType, Runnable callback, long queryStartNanoTime)
         {
    -        this(Arrays.asList(endpoint), Collections.<InetAddressAndPort>emptyList(), ConsistencyLevel.ONE, null, callback, writeType, queryStartNanoTime);
    +        this(new ReplicaList(Collections.singleton(replica)), new ReplicaList(), ConsistencyLevel.ONE, null, callback, writeType, queryStartNanoTime);
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188699158
  
    --- Diff: test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java ---
    @@ -54,19 +54,19 @@ public void testConcurrency() throws InterruptedException, IOException, Configur
                 DynamicEndpointSnitch dsnitch = new DynamicEndpointSnitch(ss, String.valueOf(ss.hashCode()));
                 InetAddressAndPort self = FBUtilities.getBroadcastAddressAndPort();
     
    -            List<InetAddressAndPort> hosts = new ArrayList<>();
    +            ReplicaList replicas = new ReplicaList();
                 // We want a big list of hosts so  sorting takes time, making it much more likely to reproduce the
                 // problem we're looking for.
                 for (int i = 0; i < 100; i++)
                     for (int j = 0; j < 256; j++)
    -                    hosts.add(InetAddressAndPort.getByAddress(new byte[]{ 127, 0, (byte)i, (byte)j}));
    +                    replicas.add(Replica.fullStandin(InetAddressAndPort.getByAddress(new byte[]{ 127, 0, (byte)i, (byte)j})));
    --- End diff --
    
    Full standin, can we avoid it? It seems the snitch is generally not concerned with either transientness or ranges, just endpoints, so we shouldn't need to have it using replicas?


---



[GitHub] cassandra issue #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on the issue:

    https://github.com/apache/cassandra/pull/224
  
    Can we get more thorough tests for Replicas, ReplicaList, and ReplicaSet? I'm not looking for 100% coverage, but things like equality and hash code, order sensitivity of hash code output and equals results.
    
    For ReplicaList: containsEndpoint, filter, intersect, subList, normalizeByRange (I am not sure this is equivalent to Range.normalize; they look very different).
    
    Then tests for methods in Replicas like containsEndpoint.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r198251878
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -50,6 +50,30 @@
         public abstract int size();
         protected abstract Collection<Replica> getUnmodifiableCollection();
     
    +
    +    public boolean equals(Object o)
    --- End diff --
    
    Because it breaks equality comparisons and hash code as they don't really have a good definition. If you look at how java.util.Collections and Guava do it you can't actually construct something that isn't either a set or a list. I think it's a better approach than having some things returned that aren't comparable to other collections (either list or set) with the same contents.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188102629
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    --- End diff --
    
    Was this an auto-generated implementation? Just want to make sure it's "correct" since it's always tricky to do equals with best practices.
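
    For reference, the conventional shape of equals and hashCode for a value class with these three pieces of state might look like the sketch below. It is not the implementation from the patch (the diff is cut off before the method body); `String` stands in for InetAddressAndPort and a pair of longs stands in for Range<Token>, both purely as assumptions for a self-contained example.

```java
import java.util.Objects;

final class ReplicaSketch
{
    private final String endpoint;  // stand-in for InetAddressAndPort
    private final long rangeStart;  // stand-in for the start of Range<Token>
    private final long rangeEnd;    // stand-in for the end of Range<Token>
    private final boolean full;

    ReplicaSketch(String endpoint, long rangeStart, long rangeEnd, boolean full)
    {
        this.endpoint = Objects.requireNonNull(endpoint);
        this.rangeStart = rangeStart;
        this.rangeEnd = rangeEnd;
        this.full = full;
    }

    @Override
    public boolean equals(Object o)
    {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        ReplicaSketch that = (ReplicaSketch) o;
        // Compare every field that contributes to identity.
        return full == that.full
               && rangeStart == that.rangeStart
               && rangeEnd == that.rangeEnd
               && endpoint.equals(that.endpoint);
    }

    @Override
    public int hashCode()
    {
        // Must use exactly the fields compared in equals.
        return Objects.hash(endpoint, rangeStart, rangeEnd, full);
    }
}
```

    The points usually checked in review are that equals and hashCode use the same fields, that equals handles null and foreign types, and that objects differing only in `full` are not equal.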


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197135241
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -503,14 +498,14 @@ private static void sendCommit(Commit commit, Iterable<InetAddressAndPort> repli
                 MessagingService.instance().sendOneWay(message, target);
         }
     
    -    private static PrepareCallback preparePaxos(Commit toPrepare, List<InetAddressAndPort> endpoints, int requiredParticipants, ConsistencyLevel consistencyForPaxos, long queryStartNanoTime)
    +    private static PrepareCallback preparePaxos(Commit toPrepare, ReplicaList replicas, int requiredParticipants, ConsistencyLevel consistencyForPaxos, long queryStartNanoTime)
         throws WriteTimeoutException
         {
             PrepareCallback callback = new PrepareCallback(toPrepare.update.partitionKey(), toPrepare.update.metadata(), requiredParticipants, consistencyForPaxos, queryStartNanoTime);
             MessageOut<Commit> message = new MessageOut<Commit>(MessagingService.Verb.PAXOS_PREPARE, toPrepare, Commit.serializer);
    -        for (InetAddressAndPort target : endpoints)
    +        for (Replica replica: replicas)
             {
    -            if (canDoLocalRequest(target))
    +            if (canDoLocalRequest(replica.getEndpoint()))
    --- End diff --
    
    `canDoLocalRequest` can receive Replica (or we can move it to replica). This will spare many `getEndpoint` calls where this method is called.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r195811425
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -50,6 +50,30 @@
         public abstract int size();
         protected abstract Collection<Replica> getUnmodifiableCollection();
     
    +
    +    public boolean equals(Object o)
    --- End diff --
    
    So I want to try and be clearer again :-P 
    
    ReplicaList and ReplicaSet both work fine the way they are today. What doesn't work is Replicas functions that return an ImmutableContainer because those don't implement equals or hashCode for their contents. We should never return a Replicas that isn't an instance of either ReplicaSet or ReplicaList with the implied equality and hashCode behavior.
    
    Java has the Collections class with singleton (which is a singleton set), singletonList, singletonMap, etc. Guava has ImmutableXYZ and the .of methods.
    
    I think Guava got it right having those methods be in the type specific class rather than just a generic Collections class.
    
    Otherwise it's all good. ReplicaSet and ReplicaList are wrappers for actual set and list implementations so it's not clear we would ever have more than just that. I think optimized singleton implementations should be a later optimization unless you feel driven.
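
    The JDK has a close analogue of this problem, which the following sketch demonstrates. `Collections.unmodifiableCollection` returns a generic wrapper that falls back to identity equality, while the type-specific `singletonList` and `singleton` follow the List and Set contracts. The class and method names are made up for the example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

final class ContainerEqualitySketch
{
    /** Two generic wrappers over the same contents are NOT equal: the wrapper uses Object.equals. */
    static boolean genericWrappersEqual()
    {
        List<String> backing = Arrays.asList("a");
        Collection<String> x = Collections.unmodifiableCollection(backing);
        Collection<String> y = Collections.unmodifiableCollection(backing);
        return x.equals(y);
    }

    /** Type-specific singletons compare equal to any list or set with the same contents. */
    static boolean typedSingletonsEqual()
    {
        return Collections.singletonList("a").equals(Arrays.asList("a"))
               && Collections.singleton("a").equals(new HashSet<>(Arrays.asList("a")));
    }
}
```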


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188110522
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    --- End diff --
    
    replicaList is not null so I would have just used replicaList.hashCode(). This might allocate a wrapper array for varargs.
    
    Objects.hashCode() is the single arg version that won't allocate.
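
    A small sketch of the difference, with a plain `List<String>` standing in for the replica list (an assumption for the example). Note that the varargs form also yields a different value, because it hashes a one-element array rather than the list itself.

```java
import java.util.List;
import java.util.Objects;

final class HashCodeSketch
{
    /** Varargs form: wraps the argument in an Object[] and hashes that array. */
    static int viaVarargs(List<String> list)
    {
        return Objects.hash(list);
    }

    /** Single-argument form: null-safe, no wrapper array. */
    static int viaSingleArg(List<String> list)
    {
        return Objects.hashCode(list);
    }

    /** Direct call: fine when the field is known to be non-null. */
    static int direct(List<String> list)
    {
        return list.hashCode();
    }
}
```

    For a non-null list, `viaSingleArg` and `direct` agree, while `viaVarargs` returns `31 + list.hashCode()`.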


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188695455
  
    --- Diff: src/java/org/apache/cassandra/service/reads/AbstractReadExecutor.java ---
    @@ -269,24 +272,26 @@ public void executeAsync()
             {
                 // if CL + RR result in covering all replicas, getReadExecutor forces AlwaysSpeculating.  So we know
                 // that the last replica in our list is "extra."
    -            List<InetAddressAndPort> initialReplicas = targetReplicas.subList(0, targetReplicas.size() - 1);
    +            ReplicaList initialReplicas = targetReplicas.subList(0, targetReplicas.size() - 1);
    +
    +            Replicas.checkFull(initialReplicas);
     
                 if (handler.blockfor < initialReplicas.size())
                 {
                     // We're hitting additional targets for read repair.  Since our "extra" replica is the least-
                     // preferred by the snitch, we do an extra data read to start with against a replica more
                     // likely to reply; better to let RR fail than the entire query.
    -                makeDataRequests(initialReplicas.subList(0, 2));
    +                makeDataRequests(initialReplicas.subList(0, 2).asEndpoints());
    --- End diff --
    
    Can you do a subList that is just the endpoint subList without converting it in a second step? Or should initialReplicas just be unwrapped immediately?


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188784654
  
    --- Diff: src/java/org/apache/cassandra/cql3/statements/AlterKeyspaceStatement.java ---
    @@ -96,7 +98,35 @@ private void warnIfIncreasingRF(KeyspaceMetadata ksm, KeyspaceParams params)
                                                                                                             StorageService.instance.getTokenMetadata(),
                                                                                                             DatabaseDescriptor.getEndpointSnitch(),
                                                                                                             params.replication.options);
    -        if (newStrategy.getReplicationFactor() > oldStrategy.getReplicationFactor())
    +
    +        validateTransientReplication(oldStrategy, newStrategy);
    +        warnIfIncreasingRF(oldStrategy, newStrategy);
    +    }
    +
    +    private void validateTransientReplication(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (oldStrategy.getReplicationFactor().trans == 0 && newStrategy.getReplicationFactor().trans > 0)
    +        {
    +            Keyspace ks = Keyspace.open(keyspace());
    +            for (ColumnFamilyStore cfs: ks.getColumnFamilyStores())
    +            {
    +                if (cfs.viewManager.hasViews())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using materialized views");
    +                }
    +
    +                if (cfs.indexManager.hasIndexes())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using secondary indexes");
    +                }
    +            }
    +
    +        }
    +    }
    +
    +    private void warnIfIncreasingRF(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (newStrategy.getReplicationFactor().full > oldStrategy.getReplicationFactor().full)
    --- End diff --
    
    Huh, so because your quorum size increases you don't need to run even an incremental repair? Is that always true? Have to think about it some more.
    
    But... while it might be safe to do one increase, if you increased transient replication twice in sequence and it didn't change the quorum size could you get into trouble?
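
    For reference, the arithmetic behind the question, as a tiny sketch. It assumes quorum is computed from the total replication factor as floor(rf / 2) + 1. It only shows when the quorum size changes and does not settle whether a repair is needed.

```java
final class QuorumSketch
{
    /** Quorum for a given total replication factor: floor(rf / 2) + 1. */
    static int quorum(int totalRf)
    {
        return totalRf / 2 + 1;
    }

    public static void main(String[] args)
    {
        // Going from 3 to 4 replicas raises the quorum; going from 4 to 5 does not.
        System.out.println(quorum(3)); // 2
        System.out.println(quorum(4)); // 3
        System.out.println(quorum(5)); // 3
    }
}
```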


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197156136
  
    --- Diff: src/java/org/apache/cassandra/service/StorageService.java ---
    @@ -4231,53 +4211,53 @@ private void calculateToFromStreams(Collection<Token> newTokens, List<String> ke
                 InetAddressAndPort localAddress = FBUtilities.getBroadcastAddressAndPort();
                 IEndpointSnitch snitch = DatabaseDescriptor.getEndpointSnitch();
                 TokenMetadata tokenMetaCloneAllSettled = tokenMetadata.cloneAfterAllSettled();
    -            // clone to avoid concurrent modification in calculateNaturalEndpoints
    +            // clone to avoid concurrent modification in calculateNaturalReplicas
                 TokenMetadata tokenMetaClone = tokenMetadata.cloneOnlyTokenMap();
     
                 for (String keyspace : keyspaceNames)
                 {
                     // replication strategy of the current keyspace
                     AbstractReplicationStrategy strategy = Keyspace.open(keyspace).getReplicationStrategy();
    -                Multimap<InetAddressAndPort, Range<Token>> endpointToRanges = strategy.getAddressRanges();
    +                ReplicaMultimap<InetAddressAndPort, ReplicaSet> endpointToRanges = strategy.getAddressReplicas();
     
                     logger.debug("Calculating ranges to stream and request for keyspace {}", keyspace);
                     for (Token newToken : newTokens)
                     {
                         // getting collection of the currently used ranges by this keyspace
    -                    Collection<Range<Token>> currentRanges = endpointToRanges.get(localAddress);
    +                    ReplicaSet currentReplicas = endpointToRanges.get(localAddress);
    --- End diff --
    
    `currentReplicas` is used inside the loop only. 


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189092440
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = new ArrayList<>(size());
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r195479970
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -50,6 +50,30 @@
         public abstract int size();
         protected abstract Collection<Replica> getUnmodifiableCollection();
     
    +
    +    public boolean equals(Object o)
    --- End diff --
    
    Sorry I should have been clearer. We would need AbstractReplicaList and AbstractReplicaSet to each implement their own equals and hash code. If you look at Java's AbstractList and AbstractSet you will see they implement their own equals and hashCode where the two aren't ever equal.
    
    And I think it's not just cargo culting to also do it that way. I think it makes sense they should never be equal because sets have unpredictable iteration order and it leads to bugs to have it work only some of the time. Sets also use a hash code specific to sets, which is the sum of the hash codes of the elements, which makes it order insensitive.
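
    The JDK behavior described here can be checked directly. The sketch below uses integers rather than replicas (an assumption for brevity); the helper name is made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ListSetEqualitySketch
{
    /** True if either object considers the other equal. */
    static boolean equalEitherWay(Object a, Object b)
    {
        return a.equals(b) || b.equals(a);
    }

    public static void main(String[] args)
    {
        List<Integer> forward = Arrays.asList(1, 2, 3);
        List<Integer> reversed = Arrays.asList(3, 2, 1);
        Set<Integer> forwardSet = new HashSet<>(forward);
        Set<Integer> reversedSet = new HashSet<>(reversed);

        // A list and a set are never equal, even with identical contents.
        System.out.println(equalEitherWay(forward, forwardSet));    // false

        // Lists are order sensitive in both equals and hashCode.
        System.out.println(equalEitherWay(forward, reversed));      // false
        System.out.println(forward.hashCode() == reversed.hashCode()); // false

        // Sets are order insensitive; the hash is the sum of the element hashes.
        System.out.println(equalEitherWay(forwardSet, reversedSet)); // true
        System.out.println(forwardSet.hashCode());                   // 6
    }
}
```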


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188782694
  
    --- Diff: src/java/org/apache/cassandra/cql3/statements/AlterKeyspaceStatement.java ---
    @@ -96,7 +98,35 @@ private void warnIfIncreasingRF(KeyspaceMetadata ksm, KeyspaceParams params)
                                                                                                             StorageService.instance.getTokenMetadata(),
                                                                                                             DatabaseDescriptor.getEndpointSnitch(),
                                                                                                             params.replication.options);
    -        if (newStrategy.getReplicationFactor() > oldStrategy.getReplicationFactor())
    +
    +        validateTransientReplication(oldStrategy, newStrategy);
    +        warnIfIncreasingRF(oldStrategy, newStrategy);
    +    }
    +
    +    private void validateTransientReplication(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (oldStrategy.getReplicationFactor().trans == 0 && newStrategy.getReplicationFactor().trans > 0)
    +        {
    +            Keyspace ks = Keyspace.open(keyspace());
    +            for (ColumnFamilyStore cfs: ks.getColumnFamilyStores())
    +            {
    +                if (cfs.viewManager.hasViews())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using materialized views");
    +                }
    +
    +                if (cfs.indexManager.hasIndexes())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using secondary indexes");
    +                }
    +            }
    +
    +        }
    +    }
    +
    +    private void warnIfIncreasingRF(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (newStrategy.getReplicationFactor().full > oldStrategy.getReplicationFactor().full)
    --- End diff --
    
    I think `full` is what we want here. Increasing the number of full replicas would require a repair. If the total RF is increased but the number of full replicas doesn't change, then you're just adding transient replicas, which shouldn't require a repair or any other operator action.
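
A minimal sketch of that rule, assuming a replication factor that exposes `full` and `trans` counts as in the diff above. The `Rf` class and `needsRepairWarning` are hypothetical stand-ins, not the patch's `ReplicationFactor` or `warnIfIncreasingRF`.

```java
final class RepairWarningSketch
{
    // Hypothetical stand-in: only the two counts used in the diff are modelled.
    static final class Rf
    {
        final int full;
        final int trans;

        Rf(int full, int trans)
        {
            this.full = full;
            this.trans = trans;
        }

        int total()
        {
            return full + trans;
        }
    }

    // Warn only when the number of full replicas grows; extra transient replicas alone do not count.
    static boolean needsRepairWarning(Rf oldRf, Rf newRf)
    {
        return newRf.full > oldRf.full;
    }

    public static void main(String[] args)
    {
        Rf three = new Rf(3, 0);
        Rf threePlusTwoTransient = new Rf(3, 2); // total goes from 3 to 5, full stays at 3
        Rf fiveFull = new Rf(5, 0);

        System.out.println(needsRepairWarning(three, threePlusTwoTransient)); // false
        System.out.println(needsRepairWarning(three, fiveFull));              // true
    }
}
```

Comparing `total()` instead would also warn in the first case, which is the behaviour the comment argues against.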


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187383489
  
    --- Diff: doc/source/architecture/dynamo.rst ---
    @@ -74,6 +74,26 @@ nodes in each rack, the data load on the smallest rack may be much higher.  Simi
     into a new rack, it will be considered a replica for the entire ring.  For this reason, many operators choose to
     configure all nodes on a single "rack".
     
    +.. _transient-replication:
    +
    +Transient Replication
    +~~~~~~~~~~~~~~~~~~~~~
    +
    +Transient replication allows you to configure a subset of replicas to only replicate data that hasn't been incrementally
    +repaired. This allows you to trade data redundancy for storage usage, and increased read and write throughput. For instance,
    +if you have a replication factor of 3, with 1 transient replica, 2 replicas will replicate all data for a given token
    --- End diff --
    
    Use the case of 3 replicas upgraded to 5 as the example. This is not a tradeoff of redundancy at all. We can't message it that way, because it's not true.


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187442385
  
    --- Diff: src/java/org/apache/cassandra/service/reads/DataResolver.java ---
    @@ -64,12 +72,19 @@ public PartitionIterator resolve()
             // at the beginning of this method), so grab the response count once and use that through the method.
             int count = responses.size();
             List<UnfilteredPartitionIterator> iters = new ArrayList<>(count);
    -        InetAddressAndPort[] sources = new InetAddressAndPort[count];
    +        Replica[] sources = new Replica[count];
             for (int i = 0; i < count; i++)
             {
                 MessageIn<ReadResponse> msg = responses.get(i);
                 iters.add(msg.payload.makeIterator(command));
    -            sources[i] = msg.from;
    +
    +            Replica replica = replicaMap.get(msg.from);
    +            if (replica == null)
    --- End diff --
    
    It seems like we knew what kind of Replica it was when we sent the message. Could the kind of Replica be different by the time we get the response? Probably not, given how pending states work.
    
    I just don't see how we could fail to decorate it; that doesn't seem right. If we sent the message that elicited this response, then we must have known the Replica at the time.


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188058733
  
    --- Diff: src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java ---
    @@ -202,61 +204,63 @@ private Keyspace getKeyspace()
          *
          * @return the replication factor
          */
    -    public abstract int getReplicationFactor();
    +    public abstract ReplicationFactor getReplicationFactor();
     
         /*
          * NOTE: this is pretty inefficient. also the inverse (getRangeAddresses) below.
          * this is fine as long as we don't use this on any critical path.
          * (fixing this would probably require merging tokenmetadata into replicationstrategy,
          * so we could cache/invalidate cleanly.)
          */
    -    public Multimap<InetAddressAndPort, Range<Token>> getAddressRanges(TokenMetadata metadata)
    +    public ReplicaMultimap<InetAddressAndPort, ReplicaSet> getAddressReplicas(TokenMetadata metadata)
         {
    -        Multimap<InetAddressAndPort, Range<Token>> map = HashMultimap.create();
    +        ReplicaMultimap<InetAddressAndPort, ReplicaSet> map = ReplicaMultimap.set();
     
             for (Token token : metadata.sortedTokens())
             {
                 Range<Token> range = metadata.getPrimaryRangeFor(token);
    -            for (InetAddressAndPort ep : calculateNaturalEndpoints(token, metadata))
    +            for (Replica replica : calculateNaturalReplicas(token, metadata))
                 {
    -                map.put(ep, range);
    +                Preconditions.checkState(range.equals(replica.getRange()) || this instanceof LocalStrategy);
    --- End diff --
    
    Can you explain this check? Why would LocalStrategy not have the ranges match?


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189038875
  
    --- Diff: src/java/org/apache/cassandra/db/compaction/CompactionManager.java ---
    @@ -871,7 +868,7 @@ public void forceUserDefinedCleanup(String dataFiles)
             {
                 ColumnFamilyStore cfs = entry.getKey();
                 Keyspace keyspace = cfs.keyspace;
    -            Collection<Range<Token>> ranges = StorageService.instance.getLocalRanges(keyspace.getName());
    +            Collection<Range<Token>> ranges = StorageService.instance.getLocalReplicas(keyspace.getName()).asRangeSet();
    --- End diff --
    
    fixed


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187476379
  
    --- Diff: src/java/org/apache/cassandra/db/compaction/Verifier.java ---
    @@ -209,7 +208,9 @@ public void verify()
                         markAndThrow();
                 }
     
    -            List<Range<Token>> ownedRanges = isOffline ? Collections.emptyList() : Range.normalize(StorageService.instance.getLocalAndPendingRanges(cfs.metadata().keyspace));
    +            List<Range<Token>> ownedRanges = isOffline
    +                                             ? Collections.emptyList()
    +                                             : Range.normalize(StorageService.instance.getLocalAndPendingReplicas(cfs.metadata().keyspace).asRangeSet());
    --- End diff --
    
    Another candidate


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197130790
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,270 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        this(new ArrayList<>());
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        this(new ArrayList<>(capacity));
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        this(new ArrayList<>(from.replicaList));
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        this(new ArrayList<>(from.size()));
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        this(new ArrayList<>(from));
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        Preconditions.checkNotNull(replica);
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    protected Collection<Replica> getUnmodifiableCollection()
    +    {
    +        return Collections.unmodifiableCollection(replicaList);
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        for (int i=replicaList.size()-1; i>=0; i--)
    +        {
    +            if (replicaList.get(i).getEndpoint().equals(endpoint))
    +            {
    +                replicaList.remove(i);
    +            }
    +        }
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    @Override
    +    public boolean containsEndpoint(InetAddressAndPort endpoint)
    +    {
    +        for (int i=0; i<size(); i++)
    +        {
    +            if (replicaList.get(i).getEndpoint().equals(endpoint))
    +                return true;
    +        }
    +        return false;
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = size() < 10 ? new ArrayList<>(size()) : new ArrayList<>();
    +        for (int i=0; i<size(); i++)
    +        {
    +            Replica replica = replicaList.get(i);
    +            if (predicate.test(replica))
    +            {
    +                newReplicaList.add(replica);
    +            }
    +        }
    +        return new ReplicaList(newReplicaList);
    +    }
    +
    +    public void sort(Comparator<Replica> comparator)
    +    {
    +        replicaList.sort(comparator);
    +    }
    +
    +    public static ReplicaList intersectEndpoints(ReplicaList l1, ReplicaList l2)
    +    {
    +        Replicas.checkFull(l1);
    +        Replicas.checkFull(l2);
    +        // Note: we don't use Guava Sets.intersection() for 3 reasons:
    +        //   1) retainAll would be inefficient if l1 and l2 are large but in practice both are the replicas for a range and
    +        //   so will be very small (< RF). In that case, retainAll is in fact more efficient.
    +        //   2) we do ultimately need a list so converting everything to sets don't make sense
    +        //   3) l1 and l2 are sorted by proximity. The use of retainAll  maintain that sorting in the result, while using sets wouldn't.
    +        Collection<InetAddressAndPort> endpoints = l2.asEndpointList();
    +        return l1.filter(r -> endpoints.contains(r.getEndpoint()));
    +    }
    +
    +    public static ReplicaList of()
    +    {
    +        return new ReplicaList(0);
    +    }
    +
    +    public static ReplicaList of(Replica replica)
    +    {
    +        ReplicaList replicaList = new ReplicaList(1);
    +        replicaList.add(replica);
    +        return replicaList;
    +    }
    +
    +    public static ReplicaList of(Replica... replicas)
    +    {
    +        ReplicaList replicaList = new ReplicaList(replicas.length);
    +        for (Replica replica: replicas)
    +        {
    +            replicaList.add(replica);
    +        }
    +        return replicaList;
    +    }
    +
    +    public ReplicaList subList(int fromIndex, int toIndex)
    +    {
    +        return new ReplicaList(replicaList.subList(fromIndex, toIndex));
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        if (Iterables.all(this, Replica::isFull))
    +        {
    +            ReplicaList normalized = new ReplicaList(size());
    +            for (Replica replica: this)
    +            {
    +                replica.addNormalizeByRange(normalized);
    +            }
    +
    +            return normalized;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public static ReplicaList immutableCopyOf(Replicas replicas)
    +    {
    +        return new ReplicaList(ImmutableList.<Replica>builder().addAll(replicas).build());
    +    }
    +
    +    public static ReplicaList immutableCopyOf(ReplicaList replicas)
    +    {
    +        return new ReplicaList(ImmutableList.copyOf(replicas.replicaList));
    +    }
    +
    +    public static ReplicaList immutableCopyOf(Replica... replicas)
    +    {
    +        return new ReplicaList(ImmutableList.copyOf(replicas));
    +    }
    +
    +    public static ReplicaList empty()
    +    {
    +        return new ReplicaList();
    +    }
    +
    +    public static ReplicaList fullStandIns(Collection<InetAddressAndPort> endpoints)
    +    {
    +        ReplicaList replicaList = new ReplicaList(endpoints.size());
    +        for (InetAddressAndPort endpoint: endpoints)
    +        {
    +            replicaList.add(Replica.fullStandin(endpoint));
    +        }
    +        return replicaList;
    +    }
    +
    +    public static ReplicaList fullStandIns(Iterable<InetAddressAndPort> endpoints)
    +    {
    +        ReplicaList replicaList = new ReplicaList();
    +        for (InetAddressAndPort endpoint: endpoints)
    +        {
    +            replicaList.add(Replica.fullStandin(endpoint));
    +        }
    +        return replicaList;
    +    }
    +
    +    /**
    +     * For allocating ReplicaLists where the final size is unknown, but
    +     * should be less than the given size. Prevents overallocations in cases
    +     * where there are less than the default ArrayList size, and defers to the
    +     * ArrayList algorithm where there might be more
    +     */
    +    public static ReplicaList withMaxSize(int size)
    --- End diff --
    
    The name seems to suggest that the list can't grow beyond `size`. Isn't it really more of an `initialSize`?
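
To illustrate the naming concern: an `ArrayList` capacity is only a starting allocation, not an upper bound. The sketch below assumes the same sizing rule that `filter` uses in the diff (pre-size exactly below 10 elements, otherwise use the default constructor). The body of `withMaxSize` is not shown in this thread, so `withSizeHint` is a hypothetical stand-in rather than the patch's implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class InitialCapacitySketch
{
    // Hypothetical helper: a small hint pre-sizes exactly, a larger one defers to ArrayList's default growth.
    static <T> List<T> withSizeHint(int hint)
    {
        return hint < 10 ? new ArrayList<>(hint) : new ArrayList<>();
    }

    public static void main(String[] args)
    {
        List<Integer> list = withSizeHint(2);
        for (int i = 0; i < 5; i++)
            list.add(i); // grows past the hint without complaint

        System.out.println(list.size()); // 5
    }
}
```

Since nothing stops the list from growing past the hint, a name along the lines of `initialSize` or a size hint describes the behaviour more closely than `withMaxSize`.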


---


[GitHub] cassandra issue #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on the issue:

    https://github.com/apache/cassandra/pull/224
  
    > ReplicaSet can be a map from endpoint to replica, which might help with both iteration and access to endpoints
    
    Should it be keyed by endpoint or range? I don't think what we want is consistent.


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197154559
  
    --- Diff: src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java ---
    @@ -202,61 +204,65 @@ private Keyspace getKeyspace()
          *
          * @return the replication factor
          */
    -    public abstract int getReplicationFactor();
    +    public abstract ReplicationFactor getReplicationFactor();
     
         /*
          * NOTE: this is pretty inefficient. also the inverse (getRangeAddresses) below.
          * this is fine as long as we don't use this on any critical path.
          * (fixing this would probably require merging tokenmetadata into replicationstrategy,
          * so we could cache/invalidate cleanly.)
          */
    -    public Multimap<InetAddressAndPort, Range<Token>> getAddressRanges(TokenMetadata metadata)
    +    public ReplicaMultimap<InetAddressAndPort, ReplicaSet> getAddressReplicas(TokenMetadata metadata)
    --- End diff --
    
    Given how this is used, we could add a second, non-map variant: in the majority of usages we're just calling `get` on a freshly materialised map.
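
A rough sketch of the idea, with plain strings standing in for ranges and endpoints. All names here are hypothetical and none of this is the Cassandra API. `allEndpointRanges` builds the whole endpoint-to-ranges map, while `rangesFor` collects the ranges of a single endpoint directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SingleEndpointLookupSketch
{
    // Map-returning variant: materialises an entry for every endpoint.
    static Map<String, List<String>> allEndpointRanges(Map<String, List<String>> replicasByRange)
    {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : replicasByRange.entrySet())
            for (String endpoint : entry.getValue())
                result.computeIfAbsent(endpoint, k -> new ArrayList<>()).add(entry.getKey());
        return result;
    }

    // Non-map variant: only collects the ranges replicated by the requested endpoint.
    static List<String> rangesFor(String endpoint, Map<String, List<String>> replicasByRange)
    {
        List<String> ranges = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : replicasByRange.entrySet())
            if (entry.getValue().contains(endpoint))
                ranges.add(entry.getKey());
        return ranges;
    }

    public static void main(String[] args)
    {
        Map<String, List<String>> replicasByRange = new LinkedHashMap<>();
        replicasByRange.put("(0,10]", Arrays.asList("n1", "n2"));
        replicasByRange.put("(10,20]", Arrays.asList("n2", "n3"));

        System.out.println(allEndpointRanges(replicasByRange).get("n2")); // [(0,10], (10,20]]
        System.out.println(rangesFor("n2", replicasByRange));             // [(0,10], (10,20]]
    }
}
```

One difference to keep in mind: for an endpoint that owns nothing, this non-map variant returns an empty list, whereas `get` on the plain map used here returns `null`.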


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197156422
  
    --- Diff: src/java/org/apache/cassandra/service/StorageService.java ---
    @@ -4231,53 +4211,53 @@ private void calculateToFromStreams(Collection<Token> newTokens, List<String> ke
                 InetAddressAndPort localAddress = FBUtilities.getBroadcastAddressAndPort();
                 IEndpointSnitch snitch = DatabaseDescriptor.getEndpointSnitch();
                 TokenMetadata tokenMetaCloneAllSettled = tokenMetadata.cloneAfterAllSettled();
    -            // clone to avoid concurrent modification in calculateNaturalEndpoints
    +            // clone to avoid concurrent modification in calculateNaturalReplicas
                 TokenMetadata tokenMetaClone = tokenMetadata.cloneOnlyTokenMap();
     
                 for (String keyspace : keyspaceNames)
                 {
                     // replication strategy of the current keyspace
                     AbstractReplicationStrategy strategy = Keyspace.open(keyspace).getReplicationStrategy();
    -                Multimap<InetAddressAndPort, Range<Token>> endpointToRanges = strategy.getAddressRanges();
    +                ReplicaMultimap<InetAddressAndPort, ReplicaSet> endpointToRanges = strategy.getAddressReplicas();
    --- End diff --
    
    Add a `getAddressReplicas` variant that takes an `InetAddressAndPort`, to avoid materialising the map.


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197156892
  
    --- Diff: src/java/org/apache/cassandra/locator/TokenMetadata.java ---
    @@ -733,19 +733,19 @@ public InetAddressAndPort getEndpoint(Token token)
             return sortedTokens;
         }
     
    -    public Multimap<Range<Token>, InetAddressAndPort> getPendingRangesMM(String keyspaceName)
    +    public ReplicaMultimap<Range<Token>, ReplicaSet> getPendingRangesMM(String keyspaceName)
         {
    -        Multimap<Range<Token>, InetAddressAndPort> map = HashMultimap.create();
    +        ReplicaMultimap<Range<Token>, ReplicaSet> map = ReplicaMultimap.set();
             PendingRangeMaps pendingRangeMaps = this.pendingRanges.get(keyspaceName);
     
             if (pendingRangeMaps != null)
             {
    -            for (Map.Entry<Range<Token>, List<InetAddressAndPort>> entry : pendingRangeMaps)
    +            for (Map.Entry<Range<Token>, ReplicaList> entry : pendingRangeMaps)
                 {
                     Range<Token> range = entry.getKey();
    -                for (InetAddressAndPort address : entry.getValue())
    +                for (Replica replica : entry.getValue())
                     {
    -                    map.put(range, address);
    +                    map.put(range, replica);
    --- End diff --
    
    Since we're putting by `range` and iterating over `entry.getValue()` here, it seems like we can spare this inner iteration.
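
One way to read this suggestion is to add each range's replicas in a single bulk call rather than one `put` per replica. The sketch below uses plain `java.util` maps with strings for ranges and replicas; whether `ReplicaMultimap` offers a bulk method is not shown in this thread, so that part is an assumption and the class and method names are made up.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class BulkPutSketch
{
    // Per-element copy, shaped like the loop in the diff.
    static Map<String, Set<String>> copyPerElement(Map<String, List<String>> pending)
    {
        Map<String, Set<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : pending.entrySet())
            for (String replica : entry.getValue())
                out.computeIfAbsent(entry.getKey(), k -> new LinkedHashSet<>()).add(replica);
        return out;
    }

    // Bulk copy: one addAll per range, no explicit inner loop.
    static Map<String, Set<String>> copyPerRange(Map<String, List<String>> pending)
    {
        Map<String, Set<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : pending.entrySet())
        {
            // Skip empty values so both variants produce the same keys.
            if (!entry.getValue().isEmpty())
                out.computeIfAbsent(entry.getKey(), k -> new LinkedHashSet<>()).addAll(entry.getValue());
        }
        return out;
    }

    public static void main(String[] args)
    {
        Map<String, List<String>> pending = new LinkedHashMap<>();
        pending.put("(0,10]", Arrays.asList("n1", "n2"));
        pending.put("(10,20]", Arrays.asList("n3"));

        System.out.println(copyPerElement(pending).equals(copyPerRange(pending))); // true
    }
}
```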


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197129440
  
    --- Diff: src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java ---
    @@ -90,41 +98,53 @@ public NetworkTopologyStrategy(String keyspaceName, TokenMetadata tokenMetadata,
             /** Number of replicas left to fill from this DC. */
             int rfLeft;
             int acceptableRackRepeats;
    +        int transients;
     
    -        DatacenterEndpoints(int rf, int rackCount, int nodeCount, Set<InetAddressAndPort> endpoints, Set<Pair<String, String>> racks)
    +        DatacenterEndpoints(ReplicationFactor rf, int rackCount, int nodeCount, ReplicaSet replicas, Set<Pair<String, String>> racks)
    --- End diff --
    
    Should we call it `DatacenterReplicas` now?


---


[GitHub] cassandra issue #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on the issue:

    https://github.com/apache/cassandra/pull/224
  
    bq. You probably want an AbstractReplicaSet and AbstractReplicaList with no storage that define how equals (against any derived class of AbstractReplicaSet or List depending).
    
    fixed, moved equals and hashCode up to base class
    



---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188119835
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = new ArrayList<>(size());
    +        for (Replica replica: replicaList)
    +        {
    +            if (predicate.test(replica))
    +            {
    +                newReplicaList.add(replica);
    +            }
    +        }
    +        return new ReplicaList(newReplicaList);
    +    }
    +
    +    public void sort(Comparator<Replica> comparator)
    +    {
    +        replicaList.sort(comparator);
    +    }
    +
    +    public static ReplicaList intersectEndpoints(ReplicaList l1, ReplicaList l2)
    +    {
    +        Replicas.checkFull(l1);
    +        Replicas.checkFull(l2);
    +        // Note: we don't use Guava Sets.intersection() for 3 reasons:
    +        //   1) retainAll would be inefficient if l1 and l2 are large but in practice both are the replicas for a range and
    +        //   so will be very small (< RF). In that case, retainAll is in fact more efficient.
    +        //   2) we do ultimately need a list so converting everything to sets don't make sense
    +        //   3) l1 and l2 are sorted by proximity. The use of retainAll  maintain that sorting in the result, while using sets wouldn't.
    +        Collection<InetAddressAndPort> endpoints = l2.asEndpointList();
    +        return l1.filter(r -> endpoints.contains(r.getEndpoint()));
    +    }
    +
    +    public static ReplicaList of(Replica... replicas)
    +    {
    +        ReplicaList replicaList = new ReplicaList(replicas.length);
    +        for (Replica replica: replicas)
    +        {
    +            replicaList.add(replica);
    +        }
    +        return replicaList;
    +    }
    +
    +    public ReplicaList subList(int fromIndex, int toIndex)
    +    {
    +        return new ReplicaList(replicaList.subList(fromIndex, toIndex));
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        if (Iterables.all(this, Replica::isFull))
    +        {
    +            ReplicaList normalized = new ReplicaList(size());
    --- End diff --
    
    Maybe you want to invert things a bit and pass this into each call to replica.normalizeByRange?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188352053
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    --- End diff --
    
    This already exists in Replicas, which I think is the better place since it allows conversion from any Replicas to an endpoint list.
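
A minimal sketch of why the base class is the natural home, using hypothetical stand-in types (`ReplicasSketch`, `EndpointReplica`, and string endpoints) rather than the classes in this patch. The conversion only needs `size()` and iteration, so a single implementation can serve both the list-backed and the set-backed collection:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for Replica; only the endpoint matters here.
final class EndpointReplica
{
    final String endpoint;

    EndpointReplica(String endpoint)
    {
        this.endpoint = endpoint;
    }
}

// Stand-in for the abstract base: the conversion relies only on size() and iteration.
abstract class ReplicasSketch implements Iterable<EndpointReplica>
{
    abstract int size();

    List<String> asEndpointList()
    {
        List<String> endpoints = new ArrayList<>(size());
        for (EndpointReplica replica : this)
            endpoints.add(replica.endpoint);
        return endpoints;
    }
}

final class ReplicaListSketch extends ReplicasSketch
{
    private final List<EndpointReplica> replicas = new ArrayList<>();

    void add(EndpointReplica replica) { replicas.add(replica); }
    int size() { return replicas.size(); }
    public Iterator<EndpointReplica> iterator() { return replicas.iterator(); }
}

final class ReplicaSetSketch extends ReplicasSketch
{
    // LinkedHashSet keeps insertion order so the output of this sketch is predictable.
    private final Set<EndpointReplica> replicas = new LinkedHashSet<>();

    void add(EndpointReplica replica) { replicas.add(replica); }
    int size() { return replicas.size(); }
    public Iterator<EndpointReplica> iterator() { return replicas.iterator(); }
}
```

Neither subclass needs its own copy of `asEndpointList()`, which is the duplication the comment is pointing at.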


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189038385
  
    --- Diff: src/java/org/apache/cassandra/db/ColumnFamilyStore.java ---
    @@ -1868,7 +1866,7 @@ public void compactionDiskSpaceCheck(boolean enable)
     
         public void cleanupCache()
         {
    -        Collection<Range<Token>> ranges = StorageService.instance.getLocalRanges(keyspace.getName());
    +        Collection<Range<Token>> ranges = StorageService.instance.getLocalReplicas(keyspace.getName()).asRangeSet();
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188108224
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -0,0 +1,313 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Set;
    +
    +import com.google.common.base.Predicate;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.utils.FBUtilities;
    +
    +/**
    + * A collection like class for Replica objects. Since the Replica class contains inetaddress, range, and
    + * transient replication status, basic contains and remove methods can be ambiguous. Replicas forces you
    + * to be explicit about what you're checking the container for, or removing from it.
    + */
    +public abstract class Replicas implements Iterable<Replica>
    --- End diff --
    
    This is really ReplicaCollection. The XYZs idiom is usually an additional class that has additional utility methods.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188758082
  
    --- Diff: test/unit/org/apache/cassandra/locator/NetworkTopologyStrategyTest.java ---
    @@ -36,12 +37,17 @@
     
     import org.apache.cassandra.config.DatabaseDescriptor;
     import org.apache.cassandra.dht.Murmur3Partitioner;
    +import org.apache.cassandra.dht.Murmur3Partitioner.LongToken;
     import org.apache.cassandra.dht.OrderPreservingPartitioner.StringToken;
    +import org.apache.cassandra.dht.Range;
     import org.apache.cassandra.dht.Token;
     import org.apache.cassandra.exceptions.ConfigurationException;
     import org.apache.cassandra.locator.TokenMetadata.Topology;
     import org.apache.cassandra.service.StorageService;
     
    +import static org.apache.cassandra.locator.Replica.full;
    --- End diff --
    
    Generally not a fan of static importing a single method.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r198604608
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -78,7 +78,6 @@
     import org.apache.cassandra.service.paxos.ProposeVerbHandler;
     import org.apache.cassandra.net.MessagingService.Verb;
     import org.apache.cassandra.tracing.Tracing;
    -import org.apache.cassandra.transport.Server;
    --- End diff --
    
    Replica is also a range? Hints in progress isn't by range.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188446031
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -344,47 +343,43 @@ private static void recordCasContention(int contentions)
                 casWriteMetrics.contention.update(contentions);
         }
     
    -    private static Predicate<InetAddressAndPort> sameDCPredicateFor(final String dc)
    +    private static Predicate<Replica> sameDCPredicateFor(final String dc)
         {
             final IEndpointSnitch snitch = DatabaseDescriptor.getEndpointSnitch();
    -        return new Predicate<InetAddressAndPort>()
    -        {
    -            public boolean apply(InetAddressAndPort host)
    -            {
    -                return dc.equals(snitch.getDatacenter(host));
    -            }
    -        };
    +        return replica -> dc.equals(snitch.getDatacenter(replica));
         }
     
         private static PaxosParticipants getPaxosParticipants(TableMetadata metadata, DecoratedKey key, ConsistencyLevel consistencyForPaxos) throws UnavailableException
         {
             Token tk = key.getToken();
    -        List<InetAddressAndPort> naturalEndpoints = StorageService.instance.getNaturalEndpoints(metadata.keyspace, tk);
    -        Collection<InetAddressAndPort> pendingEndpoints = StorageService.instance.getTokenMetadata().pendingEndpointsFor(tk, metadata.keyspace);
    +        ReplicaList naturalReplicas = StorageService.instance.getNaturalReplicas(metadata.keyspace, tk);
    +        ReplicaList pendingReplicas = new ReplicaList(StorageService.instance.getTokenMetadata().pendingEndpointsFor(tk, metadata.keyspace));
    --- End diff --
    
    Interesting that you end up having to copy into a list to convert it from Replicas (actually a set if you get a non-empty value back).
    



---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189092468
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaSet.java ---
    @@ -0,0 +1,142 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.HashSet;
    +import java.util.Iterator;
    +import java.util.LinkedHashSet;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.collect.ImmutableSet;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +public class ReplicaSet extends Replicas
    +{
    +    static final ReplicaSet EMPTY = new ReplicaSet(ImmutableSet.of());
    +
    +    private final Set<Replica> replicaSet;
    +
    +    public ReplicaSet()
    +    {
    +        replicaSet = new HashSet<>();
    +    }
    +
    +    public ReplicaSet(int expectedSize)
    +    {
    +        replicaSet = Sets.newHashSetWithExpectedSize(expectedSize);
    +    }
    +
    +    public ReplicaSet(Replicas replicas)
    +    {
    +        replicaSet = Sets.newHashSetWithExpectedSize(replicas.size());
    +        Iterables.addAll(replicaSet, replicas);
    +    }
    +
    +    private ReplicaSet(Set<Replica> replicaSet)
    +    {
    +        this.replicaSet = replicaSet;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaSet that = (ReplicaSet) o;
    +        return Objects.equals(replicaSet, that.replicaSet);
    +    }
    +
    +    public int hashCode()
    +    {
    +        return Objects.hash(replicaSet);
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188117515
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        Replica replica = (Replica) o;
    +        return full == replica.full &&
    +               Objects.equals(endpoint, replica.endpoint) &&
    +               Objects.equals(range, replica.range);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(endpoint, range, full);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        StringBuilder sb = new StringBuilder();
    +        sb.append(full ? "Full" : "Transient");
    +        sb.append('(').append(getEndpoint()).append(',').append(range).append(')');
    +        return sb.toString();
    +    }
    +
    +    public final InetAddressAndPort getEndpoint()
    +    {
    +        return endpoint;
    +    }
    +
    +    public Range<Token> getRange()
    +    {
    +        return range;
    +    }
    +
    +    public boolean isFull()
    +    {
    +        return full;
    +    }
    +
    +    public final boolean isTransient()
    +    {
    +        return !isFull();
    +    }
    +
    +    public ReplicaSet subtract(Replica that)
    +    {
    +        assert isFull() && that.isFull();  // FIXME: this
    +        Set<Range<Token>> ranges = range.subtract(that.range);
    +        ReplicaSet replicatedRanges = new ReplicaSet(ranges.size());
    +        for (Range<Token> range: ranges)
    +        {
    +            replicatedRanges.add(new Replica(getEndpoint(), range, isFull()));
    +        }
    +        return replicatedRanges;
    +    }
    +
    +    /**
    +     * Subtract the ranges of the given replicas from the range of this replica,
    +     * returning a set of replicas with the endpoint and transient information of
    +     * this replica, and the ranges resulting from the subtraction.
    +     */
    +    public ReplicaSet subtractByRange(Replicas toSubtract)
    +    {
    +        if (isFull() && Iterables.all(toSubtract, Replica::isFull))
    +        {
    +            Set<Range<Token>> subtractedRanges = getRange().subtractAll(toSubtract.asRangeSet());
    +            ReplicaSet replicaSet = new ReplicaSet(subtractedRanges.size());
    +            for (Range<Token> range: subtractedRanges)
    +            {
    +                replicaSet.add(new Replica(getEndpoint(), range, isFull()));
    +            }
    +            return replicaSet;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        List<Range<Token>> normalized = Range.normalize(Collections.singleton(getRange()));
    --- End diff --
    
    This is a slightly odd way to normalize a single range. Isn't this basically unwrap, and most of the time the range won't need unwrapping?
    
    Maybe we can return Replicas.singleton most of the time?
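
A rough sketch of the fast path, assuming a range is `(left, right]` and wraps around when `left >= right`. `TokenRangeSketch`, its `long` tokens, and `MIN_TOKEN` are hypothetical simplifications for illustration, not the real `Range<Token>` API:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical simplified range over long tokens: (left, right], wrapping when left >= right.
final class TokenRangeSketch
{
    // Assumed minimum token, for this sketch only.
    static final long MIN_TOKEN = Long.MIN_VALUE;

    final long left;
    final long right;

    TokenRangeSketch(long left, long right)
    {
        this.left = left;
        this.right = right;
    }

    boolean isWrapAround()
    {
        return left >= right;
    }

    List<TokenRangeSketch> unwrap()
    {
        // Common case: nothing to split, so return a singleton holding this same instance.
        if (!isWrapAround() || right == MIN_TOKEN)
            return Collections.singletonList(this);

        // Wrapping case: split at the minimum token into two non-wrapping ranges.
        List<TokenRangeSketch> unwrapped = new ArrayList<>(2);
        unwrapped.add(new TokenRangeSketch(left, MIN_TOKEN));
        unwrapped.add(new TokenRangeSketch(MIN_TOKEN, right));
        return unwrapped;
    }
}
```

In the non-wrapping case no new range objects are created, which is what returning a singleton would buy for a single Replica.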


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189039085
  
    --- Diff: src/java/org/apache/cassandra/db/view/ViewBuilder.java ---
    @@ -135,14 +137,15 @@ private synchronized void build()
             }
     
             // Get the local ranges for which the view hasn't already been built nor it's building
    -        Set<Range<Token>> newRanges = StorageService.instance.getLocalRanges(ksName)
    -                                                             .stream()
    -                                                             .map(r -> r.subtractAll(builtRanges))
    -                                                             .flatMap(Set::stream)
    -                                                             .map(r -> r.subtractAll(pendingRanges.keySet()))
    -                                                             .flatMap(Set::stream)
    -                                                             .collect(Collectors.toSet());
    -
    +        ReplicaSet replicatedRanges = StorageService.instance.getLocalReplicas(ksName);
    +        Replicas.checkFull(StorageService.instance.getLocalReplicas(ksName));
    +        Set<Range<Token>> newRanges = replicatedRanges.asRangeSet()
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197127705
  
    --- Diff: src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java ---
    @@ -102,35 +104,35 @@ protected AbstractReplicationStrategy(String keyspaceName, TokenMetadata tokenMe
          * @param searchPosition the position the natural endpoints are requested for
          * @return a copy of the natural endpoints for the given token
          */
    -    public ArrayList<InetAddressAndPort> getNaturalEndpoints(RingPosition searchPosition)
    +    public ReplicaList getNaturalReplicas(RingPosition searchPosition)
         {
             Token searchToken = searchPosition.getToken();
             Token keyToken = TokenMetadata.firstToken(tokenMetadata.sortedTokens(), searchToken);
    -        ArrayList<InetAddressAndPort> endpoints = getCachedEndpoints(keyToken);
    +        ReplicaList endpoints = getCachedReplicas(keyToken);
             if (endpoints == null)
             {
                 TokenMetadata tm = tokenMetadata.cachedOnlyTokenMap();
                 // if our cache got invalidated, it's possible there is a new token to account for too
                 keyToken = TokenMetadata.firstToken(tm.sortedTokens(), searchToken);
    -            endpoints = new ArrayList<InetAddressAndPort>(calculateNaturalEndpoints(searchToken, tm));
    -            cachedEndpoints.put(keyToken, endpoints);
    +            endpoints = calculateNaturalReplicas(searchToken, tm);
    +            cachedReplicas.put(keyToken, endpoints);
    --- End diff --
    
    It seems that we only ever put here, so we could already store an immutable collection, which might spare `getCachedReplicas` consumers the need to copy.
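
A minimal sketch of the idea, with hypothetical names (`ReplicaCacheSketch`, string endpoints, `long` tokens) standing in for the real strategy and cache. The value is wrapped as unmodifiable once, at insertion time, so readers can share the cached instance safely:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache keyed by token; values are made unmodifiable before they are stored.
final class ReplicaCacheSketch
{
    private final Map<Long, List<String>> cache = new ConcurrentHashMap<>();
    private final Function<Long, List<String>> calculate;

    ReplicaCacheSketch(Function<Long, List<String>> calculate)
    {
        this.calculate = calculate;
    }

    List<String> getNaturalReplicas(long keyToken)
    {
        // Copy once on insert, then hand out the same read-only view to every caller.
        return cache.computeIfAbsent(keyToken,
                                     t -> Collections.unmodifiableList(new ArrayList<>(calculate.apply(t))));
    }
}
```

The trade-off is that any caller that wants to mutate the result now has to make its own copy explicitly, instead of every caller paying for a defensive copy.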


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r191569156
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaSet.java ---
    @@ -0,0 +1,156 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.HashSet;
    +import java.util.Iterator;
    +import java.util.LinkedHashSet;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.collect.ImmutableSet;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +public class ReplicaSet extends Replicas
    +{
    +    static final ReplicaSet EMPTY = new ReplicaSet(ImmutableSet.of());
    +
    +    private final Set<Replica> replicaSet;
    +
    +    public ReplicaSet()
    +    {
    +        this(new HashSet<>());
    +    }
    +
    +    public ReplicaSet(int expectedSize)
    +    {
    +        this(Sets.newHashSetWithExpectedSize(expectedSize));
    +    }
    +
    +    public ReplicaSet(Replicas replicas)
    +    {
    +        this(Sets.newHashSetWithExpectedSize(replicas.size()));
    +        Iterables.addAll(replicaSet, replicas);
    +    }
    +
    +    private ReplicaSet(Set<Replica> replicaSet)
    +    {
    +        this.replicaSet = replicaSet;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaSet that = (ReplicaSet) o;
    +        return Objects.equals(replicaSet, that.replicaSet);
    +    }
    +
    +    public int hashCode()
    +    {
    +        return replicaSet.hashCode();
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaSet.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaSet.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaSet, replicas);
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        for (Replica replica: replicaSet)
    --- End diff --
    
    This will throw a ConcurrentModificationException (CME) because it keeps iterating after it has removed the matching element. It does need to keep iterating to be correct, though.
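
Two ways to keep scanning the whole set without hitting the exception, sketched with a hypothetical `SketchReplica` type and string endpoints rather than the real classes. Both remove through the iterator instead of calling `remove` on the set inside a for-each loop:

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical stand-in for Replica: an endpoint plus a label for its range.
final class SketchReplica
{
    final String endpoint;
    final String range;

    SketchReplica(String endpoint, String range)
    {
        this.endpoint = endpoint;
        this.range = range;
    }
}

final class RemoveEndpointSketch
{
    // removeIf visits every element and removes each match via the collection's own iterator.
    static void removeEndpoint(Set<SketchReplica> replicas, String endpoint)
    {
        replicas.removeIf(r -> r.endpoint.equals(endpoint));
    }

    // Equivalent explicit form: Iterator.remove() is the safe way to remove while iterating.
    static void removeEndpointWithIterator(Set<SketchReplica> replicas, String endpoint)
    {
        Iterator<SketchReplica> iter = replicas.iterator();
        while (iter.hasNext())
        {
            if (iter.next().endpoint.equals(endpoint))
                iter.remove();
        }
    }
}
```

The `removeIf` form matches what the list-backed implementation in this patch already does.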


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197131513
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicationFactor.java ---
    @@ -0,0 +1,121 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.Objects;
    +
    +import com.google.common.base.Preconditions;
    +
    +import org.apache.cassandra.config.DatabaseDescriptor;
    +
    +public class ReplicationFactor
    +{
    +    public static final ReplicationFactor ZERO = new ReplicationFactor(0);
    +
    +    public final int trans;
    +    public final int replicas;
    +    public transient final int full;
    +
    +    private ReplicationFactor(int replicas, int trans)
    +    {
    +        validate(replicas, trans);
    +        this.replicas = replicas;
    +        this.trans = trans;
    +        this.full = replicas - trans;
    +    }
    +
    +    private ReplicationFactor(int replicas)
    +    {
    +        this(replicas, 0);
    +    }
    +
    +    static void validate(int replicas, int trans)
    +    {
    +        Preconditions.checkArgument(trans == 0 || DatabaseDescriptor.isTransientReplicationEnabled(),
    +                                    "Transient replication is not enabled on this node");
    +        Preconditions.checkArgument(replicas >= 0,
    +                                    "Replication factor must be non-negative, found %s", replicas);
    +        Preconditions.checkArgument(trans == 0 || trans < replicas,
    --- End diff --
    
    We should also check that `trans` is non-negative here, since `full` is calculated as `replicas - trans`.
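
A simplified sketch of the extra check, using a hypothetical `ReplicationFactorSketch` with plain `IllegalArgumentException`s in place of Guava `Preconditions`, and leaving out the transient-replication-enabled check. The messages are made up for illustration:

```java
// Hypothetical, simplified version of the validation; not the class in this patch.
final class ReplicationFactorSketch
{
    final int replicas;
    final int trans;
    final int full;

    ReplicationFactorSketch(int replicas, int trans)
    {
        validate(replicas, trans);
        this.replicas = replicas;
        this.trans = trans;
        this.full = replicas - trans;
    }

    static void validate(int replicas, int trans)
    {
        if (replicas < 0)
            throw new IllegalArgumentException("Replication factor must be non-negative, found " + replicas);
        // Without this check, trans = -1 passes "trans < replicas" and full ends up larger than replicas.
        if (trans < 0)
            throw new IllegalArgumentException("Transient replica count must be non-negative, found " + trans);
        if (trans != 0 && trans >= replicas)
            throw new IllegalArgumentException("Transient replica count must be lower than the replication factor");
    }
}
```

With the existing condition alone, `replicas = 3, trans = -1` would be accepted and produce `full = 4`.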


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188782098
  
    --- Diff: src/java/org/apache/cassandra/cql3/statements/AlterKeyspaceStatement.java ---
    @@ -96,7 +98,35 @@ private void warnIfIncreasingRF(KeyspaceMetadata ksm, KeyspaceParams params)
                                                                                                             StorageService.instance.getTokenMetadata(),
                                                                                                             DatabaseDescriptor.getEndpointSnitch(),
                                                                                                             params.replication.options);
    -        if (newStrategy.getReplicationFactor() > oldStrategy.getReplicationFactor())
    +
    +        validateTransientReplication(oldStrategy, newStrategy);
    +        warnIfIncreasingRF(oldStrategy, newStrategy);
    +    }
    +
    +    private void validateTransientReplication(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (oldStrategy.getReplicationFactor().trans == 0 && newStrategy.getReplicationFactor().trans > 0)
    +        {
    +            Keyspace ks = Keyspace.open(keyspace());
    +            for (ColumnFamilyStore cfs: ks.getColumnFamilyStores())
    +            {
    +                if (cfs.viewManager.hasViews())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using materialized views");
    +                }
    +
    +                if (cfs.indexManager.hasIndexes())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using secondary indexes");
    +                }
    +            }
    +
    +        }
    +    }
    --- End diff --
    
    If you're keeping the total number of replicas the same, I don't think there would be any difference between reducing the number of transient replicas and increasing the replication factor on a normally replicated keyspace. You'd just decrease the transient count one at a time and do a full repair between each step? If you were reducing both (i.e. 5/2 -> 4/1), that shouldn't cause any problems at all.
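
The one-at-a-time reduction described here can be sketched as a list of intermediate settings. This is only an illustration under the assumption that the procedure works as the comment proposes; `TransientStepPlanner` is a hypothetical helper, not part of the patch, and it uses the same `total/transient` notation as the comment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the settings to apply, one per step,
// when lowering the transient count while the total stays fixed.
final class TransientStepPlanner
{
    static List<String> steps(int total, int fromTransient, int toTransient)
    {
        if (toTransient < 0 || toTransient > fromTransient || fromTransient >= total)
            throw new IllegalArgumentException("expected 0 <= toTransient <= fromTransient < total");

        List<String> result = new ArrayList<>();
        // Lower the transient count by one per step.
        for (int t = fromTransient - 1; t >= toTransient; t--)
            result.add(total + "/" + t);
        return result;
    }
}
```

Going from 5/2 to 5/0 this way passes through 5/1, with a full repair assumed to run after each setting is applied.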


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197161771
  
    --- Diff: test/microbench/org/apache/cassandra/test/microbench/PendingRangesBench.java ---
    @@ -97,13 +103,13 @@ public void searchTokenForOldPendingRanges(final Blackhole bh)
         {
             int randomToken = ThreadLocalRandom.current().nextInt(maxToken * 10 + 5);
             Token searchToken = new RandomPartitioner.BigIntegerToken(Integer.toString(randomToken));
    -        Set<InetAddressAndPort> endpoints = new HashSet<>();
    -        for (Map.Entry<Range<Token>, Collection<InetAddressAndPort>> entry : oldPendingRanges.asMap().entrySet())
    +        Set<Replica> replicas = new HashSet<>();
    +        for (Map.Entry<Range<Token>, Collection<Replica>> entry : oldPendingRanges.asMap().entrySet())
    --- End diff --
    
    Since every call to `asMap` is immediately followed by `entrySet`, should we just expose `entrySet` instead and keep the map private?
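
A rough sketch of what that could look like. `RangeMultimapSketch` is a hypothetical stand-in for the real pending-ranges container, and the type parameters `K` and `V` stand in for `Range<Token>` and `Replica`.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical container; K and V stand in for Range<Token> and Replica.
final class RangeMultimapSketch<K, V>
{
    // The backing map is never handed out to callers.
    private final Map<K, Collection<V>> map = new LinkedHashMap<>();

    void put(K key, V value)
    {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Read-only view of the entries. Entries cannot be added, removed or
    // replaced through it, although the value collections stay mutable.
    Set<Map.Entry<K, Collection<V>>> entrySet()
    {
        return Collections.unmodifiableMap(map).entrySet();
    }
}
```

Callers such as the benchmark loop would then iterate `entrySet()` directly instead of going through `asMap().entrySet()`.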


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189116992
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -0,0 +1,313 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Set;
    +
    +import com.google.common.base.Predicate;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.utils.FBUtilities;
    +
    +/**
    + * A collection like class for Replica objects. Since the Replica class contains inetaddress, range, and
    + * transient replication status, basic contains and remove methods can be ambiguous. Replicas forces you
    + * to be explicit about what you're checking the container for, or removing from it.
    + */
    +public abstract class Replicas implements Iterable<Replica>
    +{
    +
    +    public abstract boolean add(Replica replica);
    +    public abstract void addAll(Iterable<Replica> replicas);
    +    public abstract void removeEndpoint(InetAddressAndPort endpoint);
    +    public abstract void removeReplica(Replica replica);
    +    public abstract int size();
    +
    +    public Iterable<InetAddressAndPort> asEndpoints()
    +    {
    +        return Iterables.transform(this, Replica::getEndpoint);
    +    }
    +
    +    public Set<InetAddressAndPort> asEndpointSet()
    +    {
    +        Set<InetAddressAndPort> result = Sets.newHashSetWithExpectedSize(size());
    +        for (Replica replica: this)
    +        {
    +            result.add(replica.getEndpoint());
    +        }
    +        return result;
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> result = new ArrayList<>(size());
    +        for (Replica replica: this)
    +        {
    +            result.add(replica.getEndpoint());
    +        }
    +        return result;
    +    }
    +
    +    public Iterable<Range<Token>> asRanges()
    +    {
    +        return Iterables.transform(this, Replica::getRange);
    +    }
    +
    +    public Set<Range<Token>> asRangeSet()
    +    {
    +        Set<Range<Token>> result = Sets.newHashSetWithExpectedSize(size());
    +        for (Replica replica: this)
    +        {
    +            result.add(replica.getRange());
    +        }
    +        return result;
    +    }
    +
    +    public Iterable<Range<Token>> fullRanges()
    +    {
    +        return Iterables.transform(Iterables.filter(this, Replica::isFull), Replica::getRange);
    +    }
    +
    +    public boolean containsEndpoint(InetAddressAndPort endpoint)
    +    {
    +        return Iterables.any(this, r -> r.getEndpoint().equals(endpoint));
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r198305783
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -50,6 +50,30 @@
         public abstract int size();
         protected abstract Collection<Replica> getUnmodifiableCollection();
     
    +
    +    public boolean equals(Object o)
    --- End diff --
    
    Sure, if they implement Iterable instead of Collection, I think it's fine for them to not be an implementation of a collection (list, set, multi-set).
    
    Interesting side note: java.util.AbstractQueue doesn't implement equals! Neither does, say, ArrayBlockingQueue or ArrayDeque.
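
The side note is easy to check with the JDK classes alone. In the sketch below, `QueueEqualsDemo` is just a hypothetical wrapper name; the collections are standard.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;

final class QueueEqualsDemo
{
    // Lists inherit an element-wise equals from AbstractList.
    static boolean listsEqual()
    {
        List<Integer> a = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> b = new ArrayList<>(Arrays.asList(1, 2, 3));
        return a.equals(b);
    }

    // ArrayDeque does not override equals, so this is an identity comparison.
    static boolean dequesEqual()
    {
        ArrayDeque<Integer> a = new ArrayDeque<>(Arrays.asList(1, 2, 3));
        ArrayDeque<Integer> b = new ArrayDeque<>(Arrays.asList(1, 2, 3));
        return a.equals(b);
    }

    // Same for ArrayBlockingQueue, which extends AbstractQueue.
    static boolean blockingQueuesEqual()
    {
        ArrayBlockingQueue<Integer> a = new ArrayBlockingQueue<>(3, false, Arrays.asList(1, 2, 3));
        ArrayBlockingQueue<Integer> b = new ArrayBlockingQueue<>(3, false, Arrays.asList(1, 2, 3));
        return a.equals(b);
    }
}
```

Two lists with the same elements compare equal, while two queues with the same elements do not unless they are the same object.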


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189397458
  
    --- Diff: src/java/org/apache/cassandra/locator/PendingRangeMaps.java ---
    @@ -23,196 +23,176 @@
     import com.google.common.collect.Iterators;
     import org.apache.cassandra.dht.Range;
     import org.apache.cassandra.dht.Token;
    -import org.slf4j.Logger;
    -import org.slf4j.LoggerFactory;
     
     import java.util.*;
     
    -public class PendingRangeMaps implements Iterable<Map.Entry<Range<Token>, List<InetAddressAndPort>>>
    +public class PendingRangeMaps implements Iterable<Map.Entry<Range<Token>, List<Replica>>>
     {
    -    private static final Logger logger = LoggerFactory.getLogger(PendingRangeMaps.class);
    -
         /**
          * We have for NavigableMap to be able to search for ranges containing a token efficiently.
          *
          * First two are for non-wrap-around ranges, and the last two are for wrap-around ranges.
          */
         // ascendingMap will sort the ranges by the ascending order of right token
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> ascendingMap;
    +    private final NavigableMap<Range<Token>, List<Replica>> ascendingMap;
    +
         /**
          * sorting end ascending, if ends are same, sorting begin descending, so that token (end, end) will
          * come before (begin, end] with the same end, and (begin, end) will be selected in the tailMap.
          */
    -    static final Comparator<Range<Token>> ascendingComparator = new Comparator<Range<Token>>()
    -        {
    -            @Override
    -            public int compare(Range<Token> o1, Range<Token> o2)
    -            {
    -                int res = o1.right.compareTo(o2.right);
    -                if (res != 0)
    -                    return res;
    +    private static final Comparator<Range<Token>> ascendingComparator = (o1, o2) -> {
    +        int res = o1.right.compareTo(o2.right);
    +        if (res != 0)
    +            return res;
     
    -                return o2.left.compareTo(o1.left);
    -            }
    -        };
    +        return o2.left.compareTo(o1.left);
    +    };
     
         // ascendingMap will sort the ranges by the descending order of left token
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> descendingMap;
    +    private final NavigableMap<Range<Token>, List<Replica>> descendingMap;
    +
         /**
          * sorting begin descending, if begins are same, sorting end descending, so that token (begin, begin) will
          * come after (begin, end] with the same begin, and (begin, end) won't be selected in the tailMap.
          */
    -    static final Comparator<Range<Token>> descendingComparator = new Comparator<Range<Token>>()
    -        {
    -            @Override
    -            public int compare(Range<Token> o1, Range<Token> o2)
    -            {
    -                int res = o2.left.compareTo(o1.left);
    -                if (res != 0)
    -                    return res;
    +    private static final Comparator<Range<Token>> descendingComparator = (o1, o2) -> {
    +        int res = o2.left.compareTo(o1.left);
    +        if (res != 0)
    +            return res;
     
    -                // if left tokens are same, sort by the descending of the right tokens.
    -                return o2.right.compareTo(o1.right);
    -            }
    -        };
    +        // if left tokens are same, sort by the descending of the right tokens.
    +        return o2.right.compareTo(o1.right);
    +    };
     
         // these two maps are for warp around ranges.
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> ascendingMapForWrapAround;
    +    private final NavigableMap<Range<Token>, List<Replica>> ascendingMapForWrapAround;
    +
         /**
          * for wrap around range (begin, end], which begin > end.
          * Sorting end ascending, if ends are same, sorting begin ascending,
          * so that token (end, end) will come before (begin, end] with the same end, and (begin, end] will be selected in
          * the tailMap.
          */
    -    static final Comparator<Range<Token>> ascendingComparatorForWrapAround = new Comparator<Range<Token>>()
    -    {
    -        @Override
    -        public int compare(Range<Token> o1, Range<Token> o2)
    -        {
    -            int res = o1.right.compareTo(o2.right);
    -            if (res != 0)
    -                return res;
    +    private static final Comparator<Range<Token>> ascendingComparatorForWrapAround = (o1, o2) -> {
    +        int res = o1.right.compareTo(o2.right);
    +        if (res != 0)
    +            return res;
     
    -            return o1.left.compareTo(o2.left);
    -        }
    +        return o1.left.compareTo(o2.left);
         };
     
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> descendingMapForWrapAround;
    +    private final NavigableMap<Range<Token>, List<Replica>> descendingMapForWrapAround;
    +
         /**
          * for wrap around ranges, which begin > end.
          * Sorting end ascending, so that token (begin, begin) will come after (begin, end] with the same begin,
          * and (begin, end) won't be selected in the tailMap.
          */
    -    static final Comparator<Range<Token>> descendingComparatorForWrapAround = new Comparator<Range<Token>>()
    -    {
    -        @Override
    -        public int compare(Range<Token> o1, Range<Token> o2)
    -        {
    -            int res = o2.left.compareTo(o1.left);
    -            if (res != 0)
    -                return res;
    -            return o1.right.compareTo(o2.right);
    -        }
    +    private static final Comparator<Range<Token>> descendingComparatorForWrapAround = (o1, o2) -> {
    --- End diff --
    
    :+1:
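
For readers following the comparator comments in the diff above, here is a simplified sketch of how the two sorted maps combine to find the ranges containing a token. It assumes `int` tokens, `String` owners, and non-wrap-around ranges only; `IntRangeIndex` and `IntRange` are hypothetical names, not classes from the patch. Unlike the real code, it always iterates the ascending tail map instead of picking the smaller of the two.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical, simplified stand-in for the non-wrap-around half of PendingRangeMaps.
final class IntRangeIndex
{
    // A range (left, right] with left < right.
    static final class IntRange
    {
        final int left;
        final int right;

        IntRange(int left, int right)
        {
            this.left = left;
            this.right = right;
        }
    }

    // Right ascending; on ties, left descending.
    private static final Comparator<IntRange> ASCENDING = (a, b) -> {
        int res = Integer.compare(a.right, b.right);
        return res != 0 ? res : Integer.compare(b.left, a.left);
    };

    // Left descending; on ties, right descending.
    private static final Comparator<IntRange> DESCENDING = (a, b) -> {
        int res = Integer.compare(b.left, a.left);
        return res != 0 ? res : Integer.compare(b.right, a.right);
    };

    private final NavigableMap<IntRange, List<String>> ascendingMap = new TreeMap<>(ASCENDING);
    private final NavigableMap<IntRange, List<String>> descendingMap = new TreeMap<>(DESCENDING);

    void add(int left, int right, String owner)
    {
        if (left >= right)
            throw new IllegalArgumentException("only non-wrap-around ranges are supported in this sketch");

        IntRange range = new IntRange(left, right);
        List<String> owners = ascendingMap.get(range);
        if (owners == null)
        {
            // Both maps share the same value list, as in addToMap.
            owners = new ArrayList<>(1);
            ascendingMap.put(range, owners);
            descendingMap.put(range, owners);
        }
        owners.add(owner);
    }

    List<String> ownersFor(int token)
    {
        IntRange search = new IntRange(token, token);
        // Every range whose right end is >= token.
        NavigableMap<IntRange, List<String>> rightAtOrAfter = ascendingMap.tailMap(search, true);
        // Every range whose left end is < token.
        NavigableMap<IntRange, List<String>> leftBefore = descendingMap.tailMap(search, false);

        // A range contains the token exactly when it appears in both tail maps.
        List<String> result = new ArrayList<>();
        for (IntRange range : rightAtOrAfter.keySet())
        {
            List<String> owners = leftBefore.get(range);
            if (owners != null)
                result.addAll(owners);
        }
        return result;
    }
}
```

With ranges (0, 10], (5, 15] and (10, 20], a lookup for token 10 matches the first two only, because the left bound is exclusive and the right bound inclusive.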


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188685441
  
    --- Diff: src/java/org/apache/cassandra/service/StorageService.java ---
    @@ -3845,16 +3852,9 @@ public void forceTerminateAllRepairSessions()
          * @param key key for which we need to find the endpoint
          * @return the endpoint responsible for this key
          */
    -    public List<InetAddressAndPort> getLiveNaturalEndpoints(Keyspace keyspace, ByteBuffer key)
    -    {
    -        return getLiveNaturalEndpoints(keyspace, tokenMetadata.decorateKey(key));
    -    }
    -
    -    public List<InetAddressAndPort> getLiveNaturalEndpoints(Keyspace keyspace, RingPosition pos)
    +    public ReplicaList getLiveNaturalReplicas(Keyspace keyspace, ByteBuffer key)
    --- End diff --
    
    Unused


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189117042
  
    --- Diff: src/java/org/apache/cassandra/service/StorageService.java ---
    @@ -3845,16 +3852,9 @@ public void forceTerminateAllRepairSessions()
          * @param key key for which we need to find the endpoint
          * @return the endpoint responsible for this key
          */
    -    public List<InetAddressAndPort> getLiveNaturalEndpoints(Keyspace keyspace, ByteBuffer key)
    -    {
    -        return getLiveNaturalEndpoints(keyspace, tokenMetadata.decorateKey(key));
    -    }
    -
    -    public List<InetAddressAndPort> getLiveNaturalEndpoints(Keyspace keyspace, RingPosition pos)
    +    public ReplicaList getLiveNaturalReplicas(Keyspace keyspace, ByteBuffer key)
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189092476
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaSet.java ---
    @@ -0,0 +1,142 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.HashSet;
    +import java.util.Iterator;
    +import java.util.LinkedHashSet;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.collect.ImmutableSet;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +public class ReplicaSet extends Replicas
    +{
    +    static final ReplicaSet EMPTY = new ReplicaSet(ImmutableSet.of());
    +
    +    private final Set<Replica> replicaSet;
    +
    +    public ReplicaSet()
    +    {
    +        replicaSet = new HashSet<>();
    +    }
    +
    +    public ReplicaSet(int expectedSize)
    +    {
    +        replicaSet = Sets.newHashSetWithExpectedSize(expectedSize);
    +    }
    +
    +    public ReplicaSet(Replicas replicas)
    +    {
    +        replicaSet = Sets.newHashSetWithExpectedSize(replicas.size());
    +        Iterables.addAll(replicaSet, replicas);
    +    }
    +
    +    private ReplicaSet(Set<Replica> replicaSet)
    +    {
    +        this.replicaSet = replicaSet;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaSet that = (ReplicaSet) o;
    +        return Objects.equals(replicaSet, that.replicaSet);
    +    }
    +
    +    public int hashCode()
    +    {
    +        return Objects.hash(replicaSet);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaSet.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaSet.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaSet, replicas);
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaSet.removeIf(r -> r.getEndpoint().equals(endpoint));
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188102161
  
    --- Diff: src/java/org/apache/cassandra/locator/PendingRangeMaps.java ---
    @@ -23,196 +23,176 @@
     import com.google.common.collect.Iterators;
     import org.apache.cassandra.dht.Range;
     import org.apache.cassandra.dht.Token;
    -import org.slf4j.Logger;
    -import org.slf4j.LoggerFactory;
     
     import java.util.*;
     
    -public class PendingRangeMaps implements Iterable<Map.Entry<Range<Token>, List<InetAddressAndPort>>>
    +public class PendingRangeMaps implements Iterable<Map.Entry<Range<Token>, List<Replica>>>
     {
    -    private static final Logger logger = LoggerFactory.getLogger(PendingRangeMaps.class);
    -
         /**
          * We have for NavigableMap to be able to search for ranges containing a token efficiently.
          *
          * First two are for non-wrap-around ranges, and the last two are for wrap-around ranges.
          */
         // ascendingMap will sort the ranges by the ascending order of right token
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> ascendingMap;
    +    private final NavigableMap<Range<Token>, List<Replica>> ascendingMap;
    +
         /**
          * sorting end ascending, if ends are same, sorting begin descending, so that token (end, end) will
          * come before (begin, end] with the same end, and (begin, end) will be selected in the tailMap.
          */
    -    static final Comparator<Range<Token>> ascendingComparator = new Comparator<Range<Token>>()
    -        {
    -            @Override
    -            public int compare(Range<Token> o1, Range<Token> o2)
    -            {
    -                int res = o1.right.compareTo(o2.right);
    -                if (res != 0)
    -                    return res;
    +    private static final Comparator<Range<Token>> ascendingComparator = (o1, o2) -> {
    +        int res = o1.right.compareTo(o2.right);
    +        if (res != 0)
    +            return res;
     
    -                return o2.left.compareTo(o1.left);
    -            }
    -        };
    +        return o2.left.compareTo(o1.left);
    +    };
     
         // ascendingMap will sort the ranges by the descending order of left token
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> descendingMap;
    +    private final NavigableMap<Range<Token>, List<Replica>> descendingMap;
    +
         /**
          * sorting begin descending, if begins are same, sorting end descending, so that token (begin, begin) will
          * come after (begin, end] with the same begin, and (begin, end) won't be selected in the tailMap.
          */
    -    static final Comparator<Range<Token>> descendingComparator = new Comparator<Range<Token>>()
    -        {
    -            @Override
    -            public int compare(Range<Token> o1, Range<Token> o2)
    -            {
    -                int res = o2.left.compareTo(o1.left);
    -                if (res != 0)
    -                    return res;
    +    private static final Comparator<Range<Token>> descendingComparator = (o1, o2) -> {
    +        int res = o2.left.compareTo(o1.left);
    +        if (res != 0)
    +            return res;
     
    -                // if left tokens are same, sort by the descending of the right tokens.
    -                return o2.right.compareTo(o1.right);
    -            }
    -        };
    +        // if left tokens are same, sort by the descending of the right tokens.
    +        return o2.right.compareTo(o1.right);
    +    };
     
         // these two maps are for warp around ranges.
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> ascendingMapForWrapAround;
    +    private final NavigableMap<Range<Token>, List<Replica>> ascendingMapForWrapAround;
    +
         /**
          * for wrap around range (begin, end], which begin > end.
          * Sorting end ascending, if ends are same, sorting begin ascending,
          * so that token (end, end) will come before (begin, end] with the same end, and (begin, end] will be selected in
          * the tailMap.
          */
    -    static final Comparator<Range<Token>> ascendingComparatorForWrapAround = new Comparator<Range<Token>>()
    -    {
    -        @Override
    -        public int compare(Range<Token> o1, Range<Token> o2)
    -        {
    -            int res = o1.right.compareTo(o2.right);
    -            if (res != 0)
    -                return res;
    +    private static final Comparator<Range<Token>> ascendingComparatorForWrapAround = (o1, o2) -> {
    +        int res = o1.right.compareTo(o2.right);
    +        if (res != 0)
    +            return res;
     
    -            return o1.left.compareTo(o2.left);
    -        }
    +        return o1.left.compareTo(o2.left);
         };
     
    -    final NavigableMap<Range<Token>, List<InetAddressAndPort>> descendingMapForWrapAround;
    +    private final NavigableMap<Range<Token>, List<Replica>> descendingMapForWrapAround;
    +
         /**
          * for wrap around ranges, which begin > end.
          * Sorting end ascending, so that token (begin, begin) will come after (begin, end] with the same begin,
          * and (begin, end) won't be selected in the tailMap.
          */
    -    static final Comparator<Range<Token>> descendingComparatorForWrapAround = new Comparator<Range<Token>>()
    -    {
    -        @Override
    -        public int compare(Range<Token> o1, Range<Token> o2)
    -        {
    -            int res = o2.left.compareTo(o1.left);
    -            if (res != 0)
    -                return res;
    -            return o1.right.compareTo(o2.right);
    -        }
    +    private static final Comparator<Range<Token>> descendingComparatorForWrapAround = (o1, o2) -> {
    +        int res = o2.left.compareTo(o1.left);
    +        if (res != 0)
    +            return res;
    +        return o1.right.compareTo(o2.right);
         };
     
         public PendingRangeMaps()
         {
    -        this.ascendingMap = new TreeMap<Range<Token>, List<InetAddressAndPort>>(ascendingComparator);
    -        this.descendingMap = new TreeMap<Range<Token>, List<InetAddressAndPort>>(descendingComparator);
    -        this.ascendingMapForWrapAround = new TreeMap<Range<Token>, List<InetAddressAndPort>>(ascendingComparatorForWrapAround);
    -        this.descendingMapForWrapAround = new TreeMap<Range<Token>, List<InetAddressAndPort>>(descendingComparatorForWrapAround);
    +        this.ascendingMap = new TreeMap<>(ascendingComparator);
    +        this.descendingMap = new TreeMap<>(descendingComparator);
    +        this.ascendingMapForWrapAround = new TreeMap<>(ascendingComparatorForWrapAround);
    +        this.descendingMapForWrapAround = new TreeMap<>(descendingComparatorForWrapAround);
         }
     
         static final void addToMap(Range<Token> range,
    -                               InetAddressAndPort address,
    -                               NavigableMap<Range<Token>, List<InetAddressAndPort>> ascendingMap,
    -                               NavigableMap<Range<Token>, List<InetAddressAndPort>> descendingMap)
    +                               Replica replica,
    +                               NavigableMap<Range<Token>, List<Replica>> ascendingMap,
    +                               NavigableMap<Range<Token>, List<Replica>> descendingMap)
         {
    -        List<InetAddressAndPort> addresses = ascendingMap.get(range);
    -        if (addresses == null)
    +        List<Replica> replicas = ascendingMap.get(range);
    +        if (replicas == null)
             {
    -            addresses = new ArrayList<>(1);
    -            ascendingMap.put(range, addresses);
    -            descendingMap.put(range, addresses);
    +            replicas = new ArrayList<>(1);
    +            ascendingMap.put(range, replicas);
    +            descendingMap.put(range, replicas);
             }
    -        addresses.add(address);
    +        replicas.add(replica);
         }
     
    -    public void addPendingRange(Range<Token> range, InetAddressAndPort address)
    +    public void addPendingRange(Range<Token> range, Replica replica)
         {
             if (Range.isWrapAround(range.left, range.right))
             {
    -            addToMap(range, address, ascendingMapForWrapAround, descendingMapForWrapAround);
    +            addToMap(range, replica, ascendingMapForWrapAround, descendingMapForWrapAround);
             }
             else
             {
    -            addToMap(range, address, ascendingMap, descendingMap);
    +            addToMap(range, replica, ascendingMap, descendingMap);
             }
         }
     
    -    static final void addIntersections(Set<InetAddressAndPort> endpointsToAdd,
    -                                       NavigableMap<Range<Token>, List<InetAddressAndPort>> smallerMap,
    -                                       NavigableMap<Range<Token>, List<InetAddressAndPort>> biggerMap)
    +    static final void addIntersections(ReplicaSet replicasToAdd,
    +                                       NavigableMap<Range<Token>, List<Replica>> smallerMap,
    +                                       NavigableMap<Range<Token>, List<Replica>> biggerMap)
         {
             // find the intersection of two sets
             for (Range<Token> range : smallerMap.keySet())
             {
    -            List<InetAddressAndPort> addresses = biggerMap.get(range);
    -            if (addresses != null)
    +            List<Replica> replicas = biggerMap.get(range);
    +            if (replicas != null)
                 {
    -                endpointsToAdd.addAll(addresses);
    +                replicasToAdd.addAll(replicas);
                 }
             }
         }
     
    -    public Collection<InetAddressAndPort> pendingEndpointsFor(Token token)
    +    public ReplicaSet pendingEndpointsFor(Token token)
         {
    -        Set<InetAddressAndPort> endpoints = new HashSet<>();
    +        ReplicaSet replicas = new ReplicaSet();
     
    -        Range searchRange = new Range(token, token);
    +        Range<Token> searchRange = new Range<>(token, token);
     
             // search for non-wrap-around maps
    -        NavigableMap<Range<Token>, List<InetAddressAndPort>> ascendingTailMap = ascendingMap.tailMap(searchRange, true);
    -        NavigableMap<Range<Token>, List<InetAddressAndPort>> descendingTailMap = descendingMap.tailMap(searchRange, false);
    +        NavigableMap<Range<Token>, List<Replica>> ascendingTailMap = ascendingMap.tailMap(searchRange, true);
    +        NavigableMap<Range<Token>, List<Replica>> descendingTailMap = descendingMap.tailMap(searchRange, false);
     
             // add intersections of two maps
             if (ascendingTailMap.size() < descendingTailMap.size())
             {
    -            addIntersections(endpoints, ascendingTailMap, descendingTailMap);
    +            addIntersections(replicas, ascendingTailMap, descendingTailMap);
             }
             else
             {
    -            addIntersections(endpoints, descendingTailMap, ascendingTailMap);
    +            addIntersections(replicas, descendingTailMap, ascendingTailMap);
             }
     
             // search for wrap-around sets
             ascendingTailMap = ascendingMapForWrapAround.tailMap(searchRange, true);
             descendingTailMap = descendingMapForWrapAround.tailMap(searchRange, false);
     
             // add them since they are all necessary.
    -        for (Map.Entry<Range<Token>, List<InetAddressAndPort>> entry : ascendingTailMap.entrySet())
    +        for (Map.Entry<Range<Token>, List<Replica>> entry : ascendingTailMap.entrySet())
             {
    -            endpoints.addAll(entry.getValue());
    +            replicas.addAll(entry.getValue());
             }
    -        for (Map.Entry<Range<Token>, List<InetAddressAndPort>> entry : descendingTailMap.entrySet())
    +        for (Map.Entry<Range<Token>, List<Replica>> entry : descendingTailMap.entrySet())
             {
    -            endpoints.addAll(entry.getValue());
    +            replicas.addAll(entry.getValue());
             }
     
    -        return endpoints;
    +        return replicas;
         }
     
         public String printPendingRanges()
         {
             StringBuilder sb = new StringBuilder();
     
    -        for (Map.Entry<Range<Token>, List<InetAddressAndPort>> entry : this)
    +        for (Map.Entry<Range<Token>, List<Replica>> entry : this)
             {
                 Range<Token> range = entry.getKey();
     
    -            for (InetAddressAndPort address : entry.getValue())
    +            for (Replica replica : entry.getValue())
                 {
    -                sb.append(address).append(':').append(range);
    +                sb.append(replica).append(':').append(range);
    --- End diff --
    
    What kind of compatibility issues are we creating here by changing the output? (And what kind did I create with InetAddressAndPort?)
    
    It doesn't look like an issue to me, since it's only used in a trace log statement and TokenMetadata.toString().


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189397262
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        Replica replica = (Replica) o;
    +        return full == replica.full &&
    +               Objects.equals(endpoint, replica.endpoint) &&
    +               Objects.equals(range, replica.range);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(endpoint, range, full);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        StringBuilder sb = new StringBuilder();
    +        sb.append(full ? "Full" : "Transient");
    +        sb.append('(').append(getEndpoint()).append(',').append(range).append(')');
    +        return sb.toString();
    +    }
    +
    +    public final InetAddressAndPort getEndpoint()
    +    {
    +        return endpoint;
    +    }
    +
    +    public Range<Token> getRange()
    +    {
    +        return range;
    +    }
    +
    +    public boolean isFull()
    +    {
    +        return full;
    +    }
    +
    +    public final boolean isTransient()
    +    {
    +        return !isFull();
    +    }
    +
    +    public ReplicaSet subtract(Replica that)
    +    {
    +        assert isFull() && that.isFull();  // FIXME: this
    +        Set<Range<Token>> ranges = range.subtract(that.range);
    +        ReplicaSet replicatedRanges = new ReplicaSet(ranges.size());
    +        for (Range<Token> range: ranges)
    +        {
    +            replicatedRanges.add(new Replica(getEndpoint(), range, isFull()));
    +        }
    +        return replicatedRanges;
    +    }
    +
    +    /**
    +     * Subtract the ranges of the given replicas from the range of this replica,
    +     * returning a set of replicas with the endpoint and transient information of
    +     * this replica, and the ranges resulting from the subtraction.
    +     */
    +    public ReplicaSet subtractByRange(Replicas toSubtract)
    +    {
    +        if (isFull() && Iterables.all(toSubtract, Replica::isFull))
    +        {
    +            Set<Range<Token>> subtractedRanges = getRange().subtractAll(toSubtract.asRangeSet());
    +            ReplicaSet replicaSet = new ReplicaSet(subtractedRanges.size());
    +            for (Range<Token> range: subtractedRanges)
    +            {
    +                replicaSet.add(new Replica(getEndpoint(), range, isFull()));
    +            }
    +            return replicaSet;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        List<Range<Token>> normalized = Range.normalize(Collections.singleton(getRange()));
    +        ReplicaList replicas = new ReplicaList(normalized.size());
    +        for (Range<Token> normalizedRange: normalized)
    +        {
    +            replicas.add(new Replica(getEndpoint(), normalizedRange, isFull()));
    +        }
    +        return replicas;
    +    }
    +
    +    public boolean contains(Range<Token> that)
    +    {
    +        return getRange().contains(that);
    +    }
    +
    +    public boolean intersectsOnRange(Replica replica)
    +    {
    +        return getRange().intersects(replica.getRange());
    +    }
    +
    +    public Replica decorateSubrange(Range<Token> subrange)
    +    {
    +        Preconditions.checkArgument(range.contains(subrange));
    +        return new Replica(getEndpoint(), subrange, isFull());
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, true);
    +    }
    +
    +    /**
    +     * We need to assume an endpoint is a full replica in a with unknown ranges in a
    +     * few cases, so this returns one that throw an exception if you try to get it's range
    +     */
    +    public static Replica fullStandin(InetAddressAndPort endpoint)
    --- End diff --
    
    I agree it sucks. The problem is that there are batchlog- and hint-related functions that don't work particularly well with the Replica idea, but they use the snitch and some write path code that was converted to use Replicas to prevent a lot of conversion/copying on the hot path. So using fake replicas solved that problem, but not that well. Let me take another look at the places that use stand-ins, though; there's probably something less crappy that can be done.


---



[GitHub] cassandra issue #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on the issue:

    https://github.com/apache/cassandra/pull/224
  
    Uh I may have brain farted there. I don't think there is an issue.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189117001
  
    --- Diff: src/java/org/apache/cassandra/locator/TokenMetadata.java ---
    @@ -1204,21 +1205,21 @@ private String printPendingRanges()
             return sb.toString();
         }
     
    -    public Collection<InetAddressAndPort> pendingEndpointsFor(Token token, String keyspaceName)
    +    public Replicas pendingEndpointsFor(Token token, String keyspaceName)
         {
             PendingRangeMaps pendingRangeMaps = this.pendingRanges.get(keyspaceName);
             if (pendingRangeMaps == null)
    -            return Collections.emptyList();
    +            return new ReplicaList(0);
    --- End diff --
    
    fixed


---



[GitHub] cassandra issue #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on the issue:

    https://github.com/apache/cassandra/pull/224
  
    Hey, one thing I found is that the various implementations of ReplicaSet and ReplicaList (immutable, singleton) don't handle equals and hashCode correctly.
    
    You probably want an AbstractReplicaSet and an AbstractReplicaList with no storage that define how equals and hashCode work (comparing against any derived class of AbstractReplicaSet or AbstractReplicaList, as appropriate).
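
A minimal sketch of the idea, not the PR's code: a storage-free abstract base defines equals and hashCode once, in terms of size() and get(), so a mutable list and a singleton with the same contents compare equal. All class names here are hypothetical, and String stands in for Replica to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: String stands in for Replica so the example is self-contained.
abstract class AbstractReplicaListSketch
{
    abstract int size();
    abstract String get(int idx);

    // Equality is defined once, on contents, against any subclass of this base.
    @Override
    public final boolean equals(Object o)
    {
        if (this == o) return true;
        if (!(o instanceof AbstractReplicaListSketch)) return false;
        AbstractReplicaListSketch that = (AbstractReplicaListSketch) o;
        if (size() != that.size()) return false;
        for (int i = 0; i < size(); i++)
        {
            if (!get(i).equals(that.get(i))) return false;
        }
        return true;
    }

    // Same formula as java.util.List#hashCode, so every implementation agrees.
    @Override
    public final int hashCode()
    {
        int hash = 1;
        for (int i = 0; i < size(); i++)
        {
            hash = 31 * hash + get(i).hashCode();
        }
        return hash;
    }
}

// Mutable implementation backed by an ArrayList.
final class MutableReplicaListSketch extends AbstractReplicaListSketch
{
    private final List<String> items = new ArrayList<>();

    void add(String item) { items.add(item); }
    @Override int size() { return items.size(); }
    @Override String get(int idx) { return items.get(idx); }
}

// Single-element implementation with no backing collection.
final class SingletonReplicaListSketch extends AbstractReplicaListSketch
{
    private final String item;

    SingletonReplicaListSketch(String item) { this.item = item; }
    @Override int size() { return 1; }
    @Override String get(int idx)
    {
        if (idx != 0) throw new IndexOutOfBoundsException("index: " + idx);
        return item;
    }
}
```

By contrast, the getClass() check in the current equals makes two collections of different concrete types unequal even when their contents match. A set-based base would presumably need an order-independent comparison instead of the positional one shown here.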


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188119101
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = new ArrayList<>(size());
    +        for (Replica replica: replicaList)
    --- End diff --
    
    Iterate by index rather than using an iterator.
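
A rough sketch of what index-based iteration in filter could look like, presumably to avoid allocating an Iterator on a hot path (that motivation is an assumption, the comment does not state it). IndexedFilterSketch and its generic filter method are hypothetical stand-ins for ReplicaList.filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical generic stand-in for ReplicaList.filter.
final class IndexedFilterSketch
{
    static <T> List<T> filter(List<T> source, Predicate<? super T> predicate)
    {
        List<T> result = new ArrayList<>(source.size());
        // Indexed loop: no Iterator object is created. This is only cheap for
        // RandomAccess lists such as ArrayList, where get(i) is constant time.
        for (int i = 0, size = source.size(); i < size; i++)
        {
            T item = source.get(i);
            if (predicate.test(item))
            {
                result.add(item);
            }
        }
        return result;
    }
}
```

Since ReplicaList wraps an ArrayList in every public constructor shown in the diff, get(i) should be constant time there; a linked list would make this loop quadratic.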


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188444331
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = new ArrayList<>(size());
    +        for (Replica replica: replicaList)
    +        {
    +            if (predicate.test(replica))
    +            {
    +                newReplicaList.add(replica);
    +            }
    +        }
    +        return new ReplicaList(newReplicaList);
    +    }
    +
    +    public void sort(Comparator<Replica> comparator)
    +    {
    +        replicaList.sort(comparator);
    +    }
    +
    +    public static ReplicaList intersectEndpoints(ReplicaList l1, ReplicaList l2)
    +    {
    +        Replicas.checkFull(l1);
    +        Replicas.checkFull(l2);
    +        // Note: we don't use Guava Sets.intersection() for 3 reasons:
    +        //   1) retainAll would be inefficient if l1 and l2 are large but in practice both are the replicas for a range and
    +        //   so will be very small (< RF). In that case, retainAll is in fact more efficient.
    +        //   2) we do ultimately need a list so converting everything to sets don't make sense
    +        //   3) l1 and l2 are sorted by proximity. The use of retainAll  maintain that sorting in the result, while using sets wouldn't.
    +        Collection<InetAddressAndPort> endpoints = l2.asEndpointList();
    +        return l1.filter(r -> endpoints.contains(r.getEndpoint()));
    +    }
    +
    +    public static ReplicaList of(Replica... replicas)
    +    {
    +        ReplicaList replicaList = new ReplicaList(replicas.length);
    +        for (Replica replica: replicas)
    +        {
    +            replicaList.add(replica);
    +        }
    +        return replicaList;
    +    }
    +
    +    public ReplicaList subList(int fromIndex, int toIndex)
    +    {
    +        return new ReplicaList(replicaList.subList(fromIndex, toIndex));
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        if (Iterables.all(this, Replica::isFull))
    +        {
    +            ReplicaList normalized = new ReplicaList(size());
    +            for (Replica replica: this)
    +            {
    +                normalized.addAll(replica.normalizeByRange());
    +            }
    +
    +            return normalized;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public static ReplicaList immutableCopyOf(Replicas replicas)
    --- End diff --
    
    You don't need to allocate a ReplicaList if replicas is already a ReplicaList backed by an ImmutableList. You can just return what was passed to you. This is an optimization the Immutable* collections perform so that asking for a defensive copy has low overhead.
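
The optimization can be sketched roughly as follows, using only JDK collections. ImmutableCopySketch, its immutable flag, and mutableOf are hypothetical names for illustration; the PR's ReplicaList would presumably check whether its backing list is a Guava ImmutableList instead of tracking a flag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ReplicaList; String replaces Replica.
final class ImmutableCopySketch
{
    private final List<String> items;
    private final boolean immutable;

    private ImmutableCopySketch(List<String> items, boolean immutable)
    {
        this.items = items;
        this.immutable = immutable;
    }

    static ImmutableCopySketch mutableOf(String... values)
    {
        return new ImmutableCopySketch(new ArrayList<>(Arrays.asList(values)), false);
    }

    static ImmutableCopySketch immutableCopyOf(ImmutableCopySketch source)
    {
        // Already immutable: nobody can change it, so handing back the same
        // instance is as safe as a copy and allocates nothing.
        if (source.immutable)
        {
            return source;
        }
        List<String> copy = Collections.unmodifiableList(new ArrayList<>(source.items));
        return new ImmutableCopySketch(copy, true);
    }

    // Throws UnsupportedOperationException on an immutable instance.
    boolean add(String value) { return items.add(value); }

    int size() { return items.size(); }
}
```

With this shape, callers can request a defensive copy unconditionally: the first call pays for one copy, and any later call on the result is just a reference return.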


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r194581549
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaSet.java ---
    @@ -0,0 +1,156 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.HashSet;
    +import java.util.Iterator;
    +import java.util.LinkedHashSet;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.collect.ImmutableSet;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +public class ReplicaSet extends Replicas
    +{
    +    static final ReplicaSet EMPTY = new ReplicaSet(ImmutableSet.of());
    +
    +    private final Set<Replica> replicaSet;
    +
    +    public ReplicaSet()
    +    {
    +        this(new HashSet<>());
    +    }
    +
    +    public ReplicaSet(int expectedSize)
    +    {
    +        this(Sets.newHashSetWithExpectedSize(expectedSize));
    +    }
    +
    +    public ReplicaSet(Replicas replicas)
    +    {
    +        this(Sets.newHashSetWithExpectedSize(replicas.size()));
    +        Iterables.addAll(replicaSet, replicas);
    +    }
    +
    +    private ReplicaSet(Set<Replica> replicaSet)
    +    {
    +        this.replicaSet = replicaSet;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaSet that = (ReplicaSet) o;
    +        return Objects.equals(replicaSet, that.replicaSet);
    +    }
    +
    +    public int hashCode()
    +    {
    +        return replicaSet.hashCode();
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaSet.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaSet.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaSet, replicas);
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        for (Replica replica: replicaSet)
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197124802
  
    --- Diff: src/java/org/apache/cassandra/db/ConsistencyLevel.java ---
    @@ -148,40 +150,45 @@ public boolean isLocal(InetAddressAndPort endpoint)
             return DatabaseDescriptor.getLocalDataCenter().equals(DatabaseDescriptor.getEndpointSnitch().getDatacenter(endpoint));
         }
     
    -    public int countLocalEndpoints(Iterable<InetAddressAndPort> liveEndpoints)
    +    public boolean isLocal(Replica replica)
    +    {
    +        return isLocal(replica.getEndpoint());
    +    }
    +
    +    public int countLocalEndpoints(Iterable<Replica> liveReplicas)
    --- End diff --
    
    This can be just `Replicas`


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197128191
  
    --- Diff: src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java ---
    @@ -102,35 +104,35 @@ protected AbstractReplicationStrategy(String keyspaceName, TokenMetadata tokenMe
          * @param searchPosition the position the natural endpoints are requested for
          * @return a copy of the natural endpoints for the given token
          */
    -    public ArrayList<InetAddressAndPort> getNaturalEndpoints(RingPosition searchPosition)
    +    public ReplicaList getNaturalReplicas(RingPosition searchPosition)
         {
             Token searchToken = searchPosition.getToken();
             Token keyToken = TokenMetadata.firstToken(tokenMetadata.sortedTokens(), searchToken);
    -        ArrayList<InetAddressAndPort> endpoints = getCachedEndpoints(keyToken);
    +        ReplicaList endpoints = getCachedReplicas(keyToken);
             if (endpoints == null)
             {
                 TokenMetadata tm = tokenMetadata.cachedOnlyTokenMap();
                 // if our cache got invalidated, it's possible there is a new token to account for too
                 keyToken = TokenMetadata.firstToken(tm.sortedTokens(), searchToken);
    -            endpoints = new ArrayList<InetAddressAndPort>(calculateNaturalEndpoints(searchToken, tm));
    -            cachedEndpoints.put(keyToken, endpoints);
    +            endpoints = calculateNaturalReplicas(searchToken, tm);
    +            cachedReplicas.put(keyToken, endpoints);
             }
     
    -        return new ArrayList<InetAddressAndPort>(endpoints);
    +        return new ReplicaList(endpoints);
    --- End diff --
    
    E.g. this copy could be spared if we knew that the cached endpoints were held in an immutable collection.
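
A minimal sketch of that idea, not the code in this PR: if the cache only ever stores unmodifiable snapshots, a cache hit can return the stored instance directly. `ImmutableReplicaCache`, the `long` token key, and the `String` endpoints are hypothetical stand-ins for the real strategy, `Token`, and `Replica` types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for the strategy's replica cache; not Cassandra's API.
class ImmutableReplicaCache
{
    private final Map<Long, List<String>> cache = new ConcurrentHashMap<>();

    // Each entry is copied and wrapped once, when it is first computed.
    // Later hits return the same unmodifiable list, so no per-call copy is needed.
    List<String> get(long keyToken, Function<Long, List<String>> calculate)
    {
        return cache.computeIfAbsent(keyToken,
            t -> Collections.unmodifiableList(new ArrayList<>(calculate.apply(t))));
    }
}
```

The trade-off is that callers which currently mutate the returned list would have to make their own copy first.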


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197125767
  
    --- Diff: src/java/org/apache/cassandra/db/ConsistencyLevel.java ---
    @@ -190,50 +197,50 @@ public int countLocalEndpoints(Iterable<InetAddressAndPort> liveEndpoints)
              * the blockFor first ones).
              */
             if (isDCLocal)
    -            liveEndpoints.sort(DatabaseDescriptor.getLocalComparator());
    +            liveReplicas.sort(DatabaseDescriptor.getLocalComparator());
     
    -        return liveEndpoints.subList(0, Math.min(liveEndpoints.size(), blockFor(keyspace)));
    +        return liveReplicas.subList(0, Math.min(liveReplicas.size(), blockFor(keyspace)));
         }
     
    -    private List<InetAddressAndPort> filterForEachQuorum(Keyspace keyspace, List<InetAddressAndPort> liveEndpoints)
    +    private ReplicaList filterForEachQuorum(Keyspace keyspace, ReplicaList liveReplicas)
         {
             NetworkTopologyStrategy strategy = (NetworkTopologyStrategy) keyspace.getReplicationStrategy();
     
    -        Map<String, List<InetAddressAndPort>> dcsEndpoints = new HashMap<>();
    +        Map<String, ReplicaList> dcsReplicas = new HashMap<>();
             for (String dc: strategy.getDatacenters())
    -            dcsEndpoints.put(dc, new ArrayList<>());
    +            dcsReplicas.put(dc, ReplicaList.withMaxSize(liveReplicas.size()));
     
    -        for (InetAddressAndPort add : liveEndpoints)
    +        for (Replica replica : liveReplicas)
             {
    -            String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(add);
    -            dcsEndpoints.get(dc).add(add);
    +            String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(replica);
    +            dcsReplicas.get(dc).add(replica);
             }
     
    -        List<InetAddressAndPort> waitSet = new ArrayList<>();
    -        for (Map.Entry<String, List<InetAddressAndPort>> dcEndpoints : dcsEndpoints.entrySet())
    +        ReplicaList waitSet = new ReplicaList(ReplicaList.withMaxSize(liveReplicas.size()));
    +        for (Map.Entry<String, ReplicaList> dcEndpoints : dcsReplicas.entrySet())
             {
    -            List<InetAddressAndPort> dcEndpoint = dcEndpoints.getValue();
    +            ReplicaList dcEndpoint = dcEndpoints.getValue();
                 waitSet.addAll(dcEndpoint.subList(0, Math.min(localQuorumFor(keyspace, dcEndpoints.getKey()), dcEndpoint.size())));
             }
     
             return waitSet;
         }
     
    -    public boolean isSufficientLiveNodes(Keyspace keyspace, Iterable<InetAddressAndPort> liveEndpoints)
    +    public boolean isSufficientLiveNodes(Keyspace keyspace, Iterable<Replica> liveReplicas)
    --- End diff --
    
    We can use `Replicas` here


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189092384
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188436562
  
    --- Diff: src/java/org/apache/cassandra/locator/TokenMetadata.java ---
    @@ -1204,21 +1205,21 @@ private String printPendingRanges()
             return sb.toString();
         }
     
    -    public Collection<InetAddressAndPort> pendingEndpointsFor(Token token, String keyspaceName)
    +    public Replicas pendingEndpointsFor(Token token, String keyspaceName)
         {
             PendingRangeMaps pendingRangeMaps = this.pendingRanges.get(keyspaceName);
             if (pendingRangeMaps == null)
    -            return Collections.emptyList();
    +            return new ReplicaList(0);
    --- End diff --
    
    Use a singleton immutable empty list?
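
One way this could look, as a sketch only: a single shared instance backed by an immutable list, similar in spirit to the `EMPTY` constant visible in the `ReplicaList` diff elsewhere in this thread. `EndpointList` and its `String` elements are hypothetical simplifications, not the real class.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for ReplicaList; the real class differs.
final class EndpointList
{
    // One shared instance; safe to share because its backing list rejects mutation.
    static final EndpointList EMPTY = new EndpointList(Collections.emptyList());

    private final List<String> endpoints;

    private EndpointList(List<String> endpoints)
    {
        this.endpoints = endpoints;
    }

    // Returns the shared instance instead of allocating a new empty list per call.
    static EndpointList empty()
    {
        return EMPTY;
    }

    int size()
    {
        return endpoints.size();
    }

    // Throws UnsupportedOperationException on the shared empty instance.
    boolean add(String endpoint)
    {
        return endpoints.add(endpoint);
    }
}
```

Any caller that expects to add to the returned value would fail at runtime, so this only fits call sites that treat the result as read-only.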


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197129915
  
    --- Diff: src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java ---
    @@ -90,41 +98,53 @@ public NetworkTopologyStrategy(String keyspaceName, TokenMetadata tokenMetadata,
             /** Number of replicas left to fill from this DC. */
             int rfLeft;
             int acceptableRackRepeats;
    +        int transients;
     
    -        DatacenterEndpoints(int rf, int rackCount, int nodeCount, Set<InetAddressAndPort> endpoints, Set<Pair<String, String>> racks)
    +        DatacenterEndpoints(ReplicationFactor rf, int rackCount, int nodeCount, ReplicaSet replicas, Set<Pair<String, String>> racks)
             {
    -            this.endpoints = endpoints;
    +            this.replicas = replicas;
                 this.racks = racks;
                 // If there aren't enough nodes in this DC to fill the RF, the number of nodes is the effective RF.
    -            this.rfLeft = Math.min(rf, nodeCount);
    +            this.rfLeft = Math.min(rf.replicas, nodeCount);
                 // If there aren't enough racks in this DC to fill the RF, we'll still use at least one node from each rack,
                 // and the difference is to be filled by the first encountered nodes.
    -            acceptableRackRepeats = rf - rackCount;
    +            acceptableRackRepeats = rf.replicas - rackCount;
    +
    +            // if we have fewer replicas than rf calls for, reduce transients accordingly
    +            int reduceTransients = rf.replicas - this.rfLeft;
    +            transients = Math.max(rf.trans - reduceTransients, 0);
    +            ReplicationFactor.validate(rfLeft, transients);
             }
     
             /**
    -         * Attempts to add an endpoint to the replicas for this datacenter, adding to the endpoints set if successful.
    +         * Attempts to add an endpoint to the replicas for this datacenter, adding to the replicas set if successful.
              * Returns true if the endpoint was added, and this datacenter does not require further replicas.
              */
    -        boolean addEndpointAndCheckIfDone(InetAddressAndPort ep, Pair<String,String> location)
    +        boolean addEndpointAndCheckIfDone(InetAddressAndPort ep, Pair<String,String> location, Range<Token> replicatedRange)
             {
                 if (done())
                     return false;
     
    +            if (replicas.containsEndpoint(ep))
    --- End diff --
    
    Previously this was handled by `endpoints.add`, which was also backed by a Set, and it threw via `assert` rather than skipping.
    
    Did we change some subtle semantics here?
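
To make the difference in question concrete, here is a small sketch of the two behaviours. It is an illustration under assumptions, not the PR's code: the names are hypothetical, `String` stands in for the endpoint type, and an explicit `AssertionError` replaces `assert` so the behaviour does not depend on the `-ea` JVM flag.

```java
import java.util.Set;

final class DuplicateHandling
{
    // Old behaviour as described in the comment: a duplicate is treated as a bug.
    static void addOrFail(Set<String> endpoints, String ep)
    {
        boolean added = endpoints.add(ep);
        if (!added)
            throw new AssertionError("duplicate endpoint " + ep);
    }

    // Behaviour the new check appears to introduce: a duplicate is quietly skipped.
    static boolean addOrSkip(Set<String> endpoints, String ep)
    {
        if (endpoints.contains(ep))
            return false;
        return endpoints.add(ep);
    }
}
```

With the first form a repeated endpoint surfaces immediately in tests; with the second it is tolerated, which matters if duplicates are now expected (for example, one endpoint seen for several ranges).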


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188800538
  
    --- Diff: src/java/org/apache/cassandra/batchlog/BatchlogManager.java ---
    @@ -490,16 +497,16 @@ private static int gcgs(Collection<Mutation> mutations)
             {
                 private final Set<InetAddressAndPort> undelivered = Collections.newSetFromMap(new ConcurrentHashMap<>());
     
    -            ReplayWriteResponseHandler(Collection<InetAddressAndPort> writeEndpoints, long queryStartNanoTime)
    +            ReplayWriteResponseHandler(Replicas writeReplicas, long queryStartNanoTime)
                 {
    -                super(writeEndpoints, Collections.<InetAddressAndPort>emptySet(), null, null, null, WriteType.UNLOGGED_BATCH, queryStartNanoTime);
    -                undelivered.addAll(writeEndpoints);
    +                super(writeReplicas, ReplicaList.of(), null, null, null, WriteType.UNLOGGED_BATCH, queryStartNanoTime);
    +                Iterables.addAll(undelivered, writeReplicas.asEndpoints());
    --- End diff --
    
    `asEndpoints` / `asRanges` don't allocate a new list; they just transform the iterable
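
For readers unfamiliar with lazy views, a rough stdlib-only sketch of what such a transformation does. The PR itself uses Guava's `Iterables.transform`; the `LazyViews` helper below is a hypothetical equivalent written only for illustration.

```java
import java.util.Iterator;
import java.util.function.Function;

final class LazyViews
{
    // Returns a view that applies fn as elements are pulled; nothing is copied up front.
    static <A, B> Iterable<B> transform(Iterable<A> source, Function<? super A, ? extends B> fn)
    {
        return () -> new Iterator<B>()
        {
            private final Iterator<A> it = source.iterator();

            public boolean hasNext()
            {
                return it.hasNext();
            }

            public B next()
            {
                return fn.apply(it.next());
            }
        };
    }
}
```

Because the view reads through to the source, elements added to the source after the view is created are still visible when the view is iterated.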


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by jeffjirsa <gi...@git.apache.org>.
Github user jeffjirsa commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187779954
  
    --- Diff: conf/cassandra.yaml ---
    @@ -1032,6 +1032,10 @@ enable_scripted_user_defined_functions: false
     # Materialized views are considered experimental and are not recommended for production use.
     enable_materialized_views: true
     
    +# Enables creation of transiently replicated keyspaces on this node.
    +# Transient replication is experimental and is not recommended for production use.
    --- End diff --
    
    Thanks for the experimental line. 


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187475433
  
    --- Diff: src/java/org/apache/cassandra/db/compaction/CompactionManager.java ---
    @@ -468,7 +465,7 @@ public AllSSTableOpStatus performCleanup(final ColumnFamilyStore cfStore, int jo
                 return AllSSTableOpStatus.ABORTED;
             }
             // if local ranges is empty, it means no data should remain
    -        final Collection<Range<Token>> ranges = StorageService.instance.getLocalRanges(keyspace.getName());
    +        final Collection<Range<Token>> ranges = StorageService.instance.getLocalReplicas(keyspace.getName()).asRangeSet();
    --- End diff --
    
    This could use a `Collections2.transform` version. It's just iterated.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188116936
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        Replica replica = (Replica) o;
    +        return full == replica.full &&
    +               Objects.equals(endpoint, replica.endpoint) &&
    +               Objects.equals(range, replica.range);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(endpoint, range, full);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        StringBuilder sb = new StringBuilder();
    +        sb.append(full ? "Full" : "Transient");
    +        sb.append('(').append(getEndpoint()).append(',').append(range).append(')');
    +        return sb.toString();
    +    }
    +
    +    public final InetAddressAndPort getEndpoint()
    +    {
    +        return endpoint;
    +    }
    +
    +    public Range<Token> getRange()
    +    {
    +        return range;
    +    }
    +
    +    public boolean isFull()
    +    {
    +        return full;
    +    }
    +
    +    public final boolean isTransient()
    +    {
    +        return !isFull();
    +    }
    +
    +    public ReplicaSet subtract(Replica that)
    +    {
    +        assert isFull() && that.isFull();  // FIXME: this
    +        Set<Range<Token>> ranges = range.subtract(that.range);
    +        ReplicaSet replicatedRanges = new ReplicaSet(ranges.size());
    +        for (Range<Token> range: ranges)
    +        {
    +            replicatedRanges.add(new Replica(getEndpoint(), range, isFull()));
    +        }
    +        return replicatedRanges;
    +    }
    +
    +    /**
    +     * Subtract the ranges of the given replicas from the range of this replica,
    +     * returning a set of replicas with the endpoint and transient information of
    +     * this replica, and the ranges resulting from the subtraction.
    +     */
    +    public ReplicaSet subtractByRange(Replicas toSubtract)
    +    {
    +        if (isFull() && Iterables.all(toSubtract, Replica::isFull))
    +        {
    +            Set<Range<Token>> subtractedRanges = getRange().subtractAll(toSubtract.asRangeSet());
    --- End diff --
    
    Range.subtractAll could take an iterable here and then we could use a lazy transformation.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188115552
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = new ArrayList<>(size());
    +        for (Replica replica: replicaList)
    +        {
    +            if (predicate.test(replica))
    +            {
    +                newReplicaList.add(replica);
    +            }
    +        }
    +        return new ReplicaList(newReplicaList);
    +    }
    +
    +    public void sort(Comparator<Replica> comparator)
    +    {
    +        replicaList.sort(comparator);
    +    }
    +
    +    public static ReplicaList intersectEndpoints(ReplicaList l1, ReplicaList l2)
    +    {
    +        Replicas.checkFull(l1);
    +        Replicas.checkFull(l2);
    +        // Note: we don't use Guava Sets.intersection() for 3 reasons:
    +        //   1) retainAll would be inefficient if l1 and l2 are large but in practice both are the replicas for a range and
    +        //   so will be very small (< RF). In that case, retainAll is in fact more efficient.
    +        //   2) we do ultimately need a list so converting everything to sets don't make sense
    +        //   3) l1 and l2 are sorted by proximity. The use of retainAll  maintain that sorting in the result, while using sets wouldn't.
    +        Collection<InetAddressAndPort> endpoints = l2.asEndpointList();
    --- End diff --
    
    You can do this without converting or wrapping because l2 implements containsEndpoint.
    
    It might make sense to implement `containsEndpoint` here in `ReplicaList` so that it doesn't have to allocate an iterator. You can't do a similar thing with `ReplicaSet`, since you can't access the Set implementation's underlying storage for iteration.
    
    The disadvantage is that it makes `containsEndpoint` bimorphic, which is probably fine.
    
    Granted the iterator is probably allocated on the stack since escape analysis should work in this instance and its methods get inlined so??? 
    
    Also Iterables.any requires allocating the predicate since it's binding a value. Probably worth avoiding that.
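
A sketch of the allocation-free variant being suggested, assuming the backing storage is a random-access list such as `ArrayList`. `IndexedContains` and the `String` endpoints are hypothetical; the real method would compare `Replica.getEndpoint()` values.

```java
import java.util.List;

final class IndexedContains
{
    // Indexed loop over a random-access list: no Iterator object and no
    // capturing lambda are created, unlike a for-each loop or a predicate
    // that binds `endpoint`.
    static boolean containsEndpoint(List<String> endpoints, String endpoint)
    {
        for (int i = 0, size = endpoints.size(); i < size; i++)
        {
            if (endpoints.get(i).equals(endpoint))
                return true;
        }
        return false;
    }
}
```

Whether this is measurably faster is uncertain, as the comment itself notes: escape analysis may already remove the iterator allocation in the for-each form.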


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189030217
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicationFactor.java ---
    @@ -0,0 +1,121 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.Objects;
    +
    +import com.google.common.base.Preconditions;
    +
    +import org.apache.cassandra.config.DatabaseDescriptor;
    +
    +public class ReplicationFactor
    +{
    +    public static final ReplicationFactor ZERO = new ReplicationFactor(0);
    +
    +    public final int trans;
    +    public final int replicas;
    +    public transient final int full;
    --- End diff --
    
    I think I did that as sort of a hint that it's always calculated in the ctor as a convenience, since rf is defined as `all_replicas/trans_replicas` and the `full` value is never provided by the caller.
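
For context, Java's `transient` modifier only affects default Java serialization, so a plain `final` field with a comment can carry the same hint. A minimal sketch of a derived field follows; `RfSketch`, the `full = replicas - trans` formula, and the bounds check are assumptions made for illustration and may not match `ReplicationFactor` in this PR.

```java
// Hypothetical sketch; not the ReplicationFactor class under review.
final class RfSketch
{
    final int replicas; // all replicas, supplied by the caller
    final int trans;    // transient replicas, supplied by the caller
    final int full;     // derived in the constructor, never supplied by the caller

    RfSketch(int replicas, int trans)
    {
        // Illustrative validation only; the real rules may be stricter.
        if (trans < 0 || trans > replicas)
            throw new IllegalArgumentException("invalid rf " + replicas + "/" + trans);
        this.replicas = replicas;
        this.trans = trans;
        this.full = replicas - trans; // assumed relationship between the three values
    }
}
```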


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187710293
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -0,0 +1,313 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Set;
    +
    +import com.google.common.base.Predicate;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.utils.FBUtilities;
    +
    +/**
    + * A collection like class for Replica objects. Since the Replica class contains inetaddress, range, and
    + * transient replication status, basic contains and remove methods can be ambiguous. Replicas forces you
    + * to be explicit about what you're checking the container for, or removing from it.
    + */
    +public abstract class Replicas implements Iterable<Replica>
    +{
    +
    +    public abstract boolean add(Replica replica);
    +    public abstract void addAll(Iterable<Replica> replicas);
    +    public abstract void removeEndpoint(InetAddressAndPort endpoint);
    +    public abstract void removeReplica(Replica replica);
    +    public abstract int size();
    +
    +    public Iterable<InetAddressAndPort> asEndpoints()
    +    {
    +        return Iterables.transform(this, Replica::getEndpoint);
    +    }
    +
    +    public Set<InetAddressAndPort> asEndpointSet()
    +    {
    +        Set<InetAddressAndPort> result = Sets.newHashSetWithExpectedSize(size());
    +        for (Replica replica: this)
    +        {
    +            result.add(replica.getEndpoint());
    +        }
    +        return result;
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> result = new ArrayList<>(size());
    +        for (Replica replica: this)
    +        {
    +            result.add(replica.getEndpoint());
    +        }
    +        return result;
    +    }
    +
    +    public Iterable<Range<Token>> asRanges()
    +    {
    +        return Iterables.transform(this, Replica::getRange);
    +    }
    +
    +    public Set<Range<Token>> asRangeSet()
    +    {
    +        Set<Range<Token>> result = Sets.newHashSetWithExpectedSize(size());
    +        for (Replica replica: this)
    +        {
    +            result.add(replica.getRange());
    +        }
    +        return result;
    +    }
    +
    +    public Iterable<Range<Token>> fullRanges()
    --- End diff --
    
    Is this really a good thing to have around? The one use I see of this doesn't look appropriate.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r194581551
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,283 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        this(new ArrayList<>());
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        this(new ArrayList<>(capacity));
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        this(new ArrayList<>(from.replicaList));
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        this(new ArrayList<>(from.size()));
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        this(new ArrayList<>(from));
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +        return replicaList.hashCode();
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        Preconditions.checkNotNull(replica);
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    protected Collection<Replica> getUnmodifiableCollection()
    +    {
    +        return Collections.unmodifiableCollection(replicaList);
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        for (int i=replicaList.size()-1; i>=0; i--)
    +        {
    +            if (replicaList.get(i).getEndpoint().equals(endpoint))
    +            {
    +                replicaList.remove(i);
    +            }
    +        }
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    @Override
    +    public boolean containsEndpoint(InetAddressAndPort endpoint)
    +    {
    +        for (int i=0; i<size(); i++)
    +        {
    +            if (replicaList.get(i).getEndpoint().equals(endpoint))
    +                return true;
    +        }
    +        return false;
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = size() < 10 ? new ArrayList<>(size()) : new ArrayList<>();
    +        for (int i=0; i<size(); i++)
    +        {
    +            Replica replica = replicaList.get(i);
    +            if (predicate.test(replica))
    +            {
    +                newReplicaList.add(replica);
    +            }
    +        }
    +        return new ReplicaList(newReplicaList);
    +    }
    +
    +    public void sort(Comparator<Replica> comparator)
    +    {
    +        replicaList.sort(comparator);
    +    }
    +
    +    public static ReplicaList intersectEndpoints(ReplicaList l1, ReplicaList l2)
    +    {
    +        Replicas.checkFull(l1);
    +        Replicas.checkFull(l2);
    +        // Note: we don't use Guava Sets.intersection() for 3 reasons:
    +        //   1) retainAll would be inefficient if l1 and l2 are large but in practice both are the replicas for a range and
    +        //   so will be very small (< RF). In that case, retainAll is in fact more efficient.
    +        //   2) we do ultimately need a list so converting everything to sets don't make sense
    +        //   3) l1 and l2 are sorted by proximity. The use of retainAll  maintain that sorting in the result, while using sets wouldn't.
    +        Collection<InetAddressAndPort> endpoints = l2.asEndpointList();
    +        return l1.filter(r -> endpoints.contains(r.getEndpoint()));
    +    }
    +
    +    public static ReplicaList of()
    +    {
    +        return new ReplicaList(0);
    +    }
    +
    +    public static ReplicaList of(Replica replica)
    +    {
    +        ReplicaList replicaList = new ReplicaList(1);
    +        replicaList.add(replica);
    +        return replicaList;
    +    }
    +
    +    public static ReplicaList of(Replica... replicas)
    +    {
    +        ReplicaList replicaList = new ReplicaList(replicas.length);
    +        for (Replica replica: replicas)
    +        {
    +            replicaList.add(replica);
    +        }
    +        return replicaList;
    +    }
    +
    +    public ReplicaList subList(int fromIndex, int toIndex)
    +    {
    +        return new ReplicaList(replicaList.subList(fromIndex, toIndex));
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        if (Iterables.all(this, Replica::isFull))
    +        {
    +            ReplicaList normalized = new ReplicaList(size());
    +            for (Replica replica: this)
    +            {
    +                replica.addNormalizeByRange(normalized);
    +            }
    +
    +            return normalized;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public static ReplicaList immutableCopyOf(Replicas replicas)
    +    {
    +        return new ReplicaList(ImmutableList.<Replica>builder().addAll(replicas).build());
    +    }
    +
    +    public static ReplicaList immutableCopyOf(ReplicaList replicas)
    +    {
    +        return new ReplicaList(ImmutableList.copyOf(replicas.replicaList));
    +    }
    +
    +    public static ReplicaList immutableCopyOf(Replica... replicas)
    +    {
    +        return new ReplicaList(ImmutableList.copyOf(replicas));
    +    }
    +
    +    public static ReplicaList empty()
    +    {
    +        return new ReplicaList();
    +    }
    +
    +    public static ReplicaList fullStandIns(Collection<InetAddressAndPort> endpoints)
    +    {
    +        ReplicaList replicaList = new ReplicaList(endpoints.size());
    +        for (InetAddressAndPort endpoint: endpoints)
    +        {
    +            replicaList.add(Replica.fullStandin(endpoint));
    +        }
    +        return replicaList;
    +    }
    +
    +    public static ReplicaList fullStandIns(Iterable<InetAddressAndPort> endpoints)
    +    {
    +        ReplicaList replicaList = new ReplicaList();
    +        for (InetAddressAndPort endpoint: endpoints)
    +        {
    +            replicaList.add(Replica.fullStandin(endpoint));
    +        }
    +        return replicaList;
    +    }
    +
    +    /**
    +     * For allocating ReplicaLists where the final size is unknown, but
    +     * should be less than the given size. Prevents overallocations in cases
    +     * where there are less than the default ArrayList size, and defers to the
    +     * ArrayList algorithem where there might be more
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197125691
  
    --- Diff: src/java/org/apache/cassandra/db/ConsistencyLevel.java ---
    @@ -190,50 +197,50 @@ public int countLocalEndpoints(Iterable<InetAddressAndPort> liveEndpoints)
              * the blockFor first ones).
              */
             if (isDCLocal)
    -            liveEndpoints.sort(DatabaseDescriptor.getLocalComparator());
    +            liveReplicas.sort(DatabaseDescriptor.getLocalComparator());
     
    -        return liveEndpoints.subList(0, Math.min(liveEndpoints.size(), blockFor(keyspace)));
    +        return liveReplicas.subList(0, Math.min(liveReplicas.size(), blockFor(keyspace)));
         }
     
    -    private List<InetAddressAndPort> filterForEachQuorum(Keyspace keyspace, List<InetAddressAndPort> liveEndpoints)
    +    private ReplicaList filterForEachQuorum(Keyspace keyspace, ReplicaList liveReplicas)
    --- End diff --
    
    Here, the sorted invariant is not preserved (even though `filterForQuery` returns a sorted collection).
    
    We could retain the sorted invariant if we used `filter`, maybe something like:
    
    ```
        private ReplicaList filterForEachQuorum(Keyspace keyspace, ReplicaList liveReplicas)
        {
            NetworkTopologyStrategy strategy = (NetworkTopologyStrategy) keyspace.getReplicationStrategy();
            Map<String, Integer> dcsReplicas = new HashMap<>();
            for (String dc : strategy.getDatacenters())
            {
                // we put _up to_ dc replicas only
                dcsReplicas.put(dc, localQuorumFor(keyspace, dc));
            }
    
            return liveReplicas.filter((replica) -> {
                String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(replica);
                int replicas = dcsReplicas.get(dc);
                if (replicas > 0)
                {
                    dcsReplicas.put(dc, --replicas);
                    return true;
                }
                return false;
            });
        }
    ```
    
    (This would also need only two iterations instead of three.)
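
The order-preserving filter suggested above can be tried outside Cassandra with a small, self-contained sketch. Everything here is a hypothetical stand-in: `Node` replaces `Replica` plus the snitch lookup, and the quota map replaces the per-datacenter `localQuorumFor` values. It is not the PR's code. The sketch also uses `getOrDefault`, on the assumption that a replica from a datacenter missing from the map should be skipped rather than trigger a `NullPointerException` on unboxing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EachQuorumFilterSketch
{
    /** Hypothetical stand-in for Replica plus the snitch's datacenter lookup. */
    static final class Node
    {
        final String endpoint;
        final String dc;

        Node(String endpoint, String dc)
        {
            this.endpoint = endpoint;
            this.dc = dc;
        }
    }

    /**
     * Keeps at most quota.get(dc) nodes per datacenter, in the order they
     * appear in live, so a proximity-sorted input stays proximity-sorted.
     */
    static List<String> filterForEachQuorum(List<Node> live, Map<String, Integer> quota)
    {
        // Copy the quotas so the caller's map is not mutated while counting down
        Map<String, Integer> remaining = new HashMap<>(quota);
        List<String> kept = new ArrayList<>(live.size());
        for (Node node : live)
        {
            // Assumption: nodes from a datacenter without a quota are dropped
            int left = remaining.getOrDefault(node.dc, 0);
            if (left > 0)
            {
                remaining.put(node.dc, left - 1);
                kept.add(node.endpoint);
            }
        }
        return kept;
    }

    public static void main(String[] args)
    {
        Map<String, Integer> quota = new HashMap<>();
        quota.put("dc1", 2);
        quota.put("dc2", 1);

        List<Node> live = Arrays.asList(new Node("a1", "dc1"),
                                        new Node("b1", "dc2"),
                                        new Node("a2", "dc1"),
                                        new Node("a3", "dc1"),
                                        new Node("b2", "dc2"));

        System.out.println(filterForEachQuorum(live, quota)); // prints [a1, b1, a2]
    }
}
```

The result interleaves datacenters exactly as the input did, which is the property that grouping by datacenter first and concatenating afterwards would lose.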


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187710393
  
    --- Diff: src/java/org/apache/cassandra/dht/RangeStreamer.java ---
    @@ -176,25 +179,28 @@ public void addSourceFilter(ISourceFilter filter)
          * Add ranges to be streamed for given keyspace.
          *
          * @param keyspaceName keyspace name
    -     * @param ranges ranges to be streamed
    +     * @param replicas ranges to be streamed
          */
    -    public void addRanges(String keyspaceName, Collection<Range<Token>> ranges)
    +    public void addRanges(String keyspaceName, Replicas replicas)
    --- End diff --
    
    Should this be `addReplicas`? I would also change the comment to "ranges to be fetched". "Streaming" gets used to mean both sending and fetching, whereas "fetching" always means fetching.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187440113
  
    --- Diff: src/java/org/apache/cassandra/db/ConsistencyLevel.java ---
    @@ -190,50 +197,50 @@ public int countLocalEndpoints(Iterable<InetAddressAndPort> liveEndpoints)
              * the blockFor first ones).
              */
             if (isDCLocal)
    -            liveEndpoints.sort(DatabaseDescriptor.getLocalComparator());
    +            liveReplicas.sort(DatabaseDescriptor.getLocalComparator());
     
    -        return liveEndpoints.subList(0, Math.min(liveEndpoints.size(), blockFor(keyspace)));
    +        return liveReplicas.subList(0, Math.min(liveReplicas.size(), blockFor(keyspace)));
         }
     
    -    private List<InetAddressAndPort> filterForEachQuorum(Keyspace keyspace, List<InetAddressAndPort> liveEndpoints)
    +    private ReplicaList filterForEachQuorum(Keyspace keyspace, ReplicaList liveReplicas)
         {
             NetworkTopologyStrategy strategy = (NetworkTopologyStrategy) keyspace.getReplicationStrategy();
     
    -        Map<String, List<InetAddressAndPort>> dcsEndpoints = new HashMap<>();
    +        Map<String, ReplicaList> dcsReplicas = new HashMap<>();
             for (String dc: strategy.getDatacenters())
    -            dcsEndpoints.put(dc, new ArrayList<>());
    +            dcsReplicas.put(dc, new ReplicaList());
     
    -        for (InetAddressAndPort add : liveEndpoints)
    +        for (Replica replica : liveReplicas)
             {
    -            String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(add);
    -            dcsEndpoints.get(dc).add(add);
    +            String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(replica);
    +            dcsReplicas.get(dc).add(replica);
             }
     
    -        List<InetAddressAndPort> waitSet = new ArrayList<>();
    -        for (Map.Entry<String, List<InetAddressAndPort>> dcEndpoints : dcsEndpoints.entrySet())
    +        ReplicaList waitSet = new ReplicaList();
    --- End diff --
    
    Another potential over-allocation.
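
For context on the over-allocation concern: `new ArrayList<>()` is documented to have an initial capacity of ten, while a wait set is normally bounded by the replication factor. The sketch below shows one way to pre-size a list when only an upper bound is known, mirroring the `size() < 10 ? ... : ...` check in `ReplicaList.filter` from this diff. The class and method names (`PresizedListSketch`, `withMaximumSize`, `initialCapacityFor`) are hypothetical, not part of the PR.

```java
import java.util.ArrayList;
import java.util.List;

class PresizedListSketch
{
    // Documented initial capacity of a list created with the no-arg ArrayList constructor
    static final int ARRAY_LIST_DEFAULT_CAPACITY = 10;

    /** Capacity to request for a list that will hold at most maxSize elements. */
    static int initialCapacityFor(int maxSize)
    {
        if (maxSize < 0)
            throw new IllegalArgumentException("maxSize must be non-negative: " + maxSize);
        // Small bound: allocate exactly that much. Large bound: start at the default and let ArrayList grow.
        return Math.min(maxSize, ARRAY_LIST_DEFAULT_CAPACITY);
    }

    /** Creates a list sized for an unknown final length that will not exceed maxSize. */
    static <T> List<T> withMaximumSize(int maxSize)
    {
        return new ArrayList<>(initialCapacityFor(maxSize));
    }

    public static void main(String[] args)
    {
        List<String> waitSet = withMaximumSize(3); // e.g. a replication factor of 3
        waitSet.add("10.0.0.1");
        waitSet.add("10.0.0.2");
        System.out.println(waitSet.size() + " of at most 3 slots used");
    }
}
```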


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197133241
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -344,47 +343,43 @@ private static void recordCasContention(int contentions)
                 casWriteMetrics.contention.update(contentions);
         }
     
    -    private static Predicate<InetAddressAndPort> sameDCPredicateFor(final String dc)
    +    private static Predicate<Replica> sameDCPredicateFor(final String dc)
         {
             final IEndpointSnitch snitch = DatabaseDescriptor.getEndpointSnitch();
    -        return new Predicate<InetAddressAndPort>()
    -        {
    -            public boolean apply(InetAddressAndPort host)
    -            {
    -                return dc.equals(snitch.getDatacenter(host));
    -            }
    -        };
    +        return replica -> dc.equals(snitch.getDatacenter(replica));
         }
     
         private static PaxosParticipants getPaxosParticipants(TableMetadata metadata, DecoratedKey key, ConsistencyLevel consistencyForPaxos) throws UnavailableException
         {
             Token tk = key.getToken();
    -        List<InetAddressAndPort> naturalEndpoints = StorageService.instance.getNaturalEndpoints(metadata.keyspace, tk);
    -        Collection<InetAddressAndPort> pendingEndpoints = StorageService.instance.getTokenMetadata().pendingEndpointsFor(tk, metadata.keyspace);
    +        ReplicaList naturalReplicas = StorageService.instance.getNaturalReplicas(metadata.keyspace, tk);
    +        ReplicaList pendingReplicas = new ReplicaList(StorageService.instance.getTokenMetadata().pendingEndpointsFor(tk, metadata.keyspace));
             if (consistencyForPaxos == ConsistencyLevel.LOCAL_SERIAL)
             {
    -            // Restrict naturalEndpoints and pendingEndpoints to node in the local DC only
    +            // Restrict naturalReplicas and pendingReplicas to node in the local DC only
                 String localDc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(FBUtilities.getBroadcastAddressAndPort());
    -            Predicate<InetAddressAndPort> isLocalDc = sameDCPredicateFor(localDc);
    -            naturalEndpoints = ImmutableList.copyOf(Iterables.filter(naturalEndpoints, isLocalDc));
    -            pendingEndpoints = ImmutableList.copyOf(Iterables.filter(pendingEndpoints, isLocalDc));
    +            Predicate<Replica> isLocalDc = sameDCPredicateFor(localDc);
    +            naturalReplicas = ReplicaList.immutableCopyOf(naturalReplicas.filter(isLocalDc));
    +            pendingReplicas = ReplicaList.immutableCopyOf(pendingReplicas.filter(isLocalDc));
             }
    -        int participants = pendingEndpoints.size() + naturalEndpoints.size();
    +        int participants = pendingReplicas.size() + naturalReplicas.size();
             int requiredParticipants = participants / 2 + 1; // See CASSANDRA-8346, CASSANDRA-833
    -        List<InetAddressAndPort> liveEndpoints = ImmutableList.copyOf(Iterables.filter(Iterables.concat(naturalEndpoints, pendingEndpoints), IAsyncCallback.isAlive));
    -        if (liveEndpoints.size() < requiredParticipants)
    -            throw new UnavailableException(consistencyForPaxos, requiredParticipants, liveEndpoints.size());
    +
    +        Replicas concatenated = Replicas.concatNaturalAndPending(naturalReplicas, pendingReplicas);
    +        ReplicaList liveReplicas = ReplicaList.immutableCopyOf(Replicas.filter(concatenated, IAsyncCallback.isReplicaAlive));
    --- End diff --
    
    Same here (with Immutable).


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189049318
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    --- End diff --
    
    yep, auto generated


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189038958
  
    --- Diff: src/java/org/apache/cassandra/db/compaction/Verifier.java ---
    @@ -209,7 +208,9 @@ public void verify()
                         markAndThrow();
                 }
     
    -            List<Range<Token>> ownedRanges = isOffline ? Collections.emptyList() : Range.normalize(StorageService.instance.getLocalAndPendingRanges(cfs.metadata().keyspace));
    +            List<Range<Token>> ownedRanges = isOffline
    +                                             ? Collections.emptyList()
    +                                             : Range.normalize(StorageService.instance.getLocalAndPendingReplicas(cfs.metadata().keyspace).asRangeSet());
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188081346
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaSet.java ---
    @@ -0,0 +1,142 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.HashSet;
    +import java.util.Iterator;
    +import java.util.LinkedHashSet;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.collect.ImmutableSet;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +public class ReplicaSet extends Replicas
    +{
    +    static final ReplicaSet EMPTY = new ReplicaSet(ImmutableSet.of());
    +
    +    private final Set<Replica> replicaSet;
    +
    +    public ReplicaSet()
    +    {
    +        replicaSet = new HashSet<>();
    +    }
    +
    +    public ReplicaSet(int expectedSize)
    +    {
    +        replicaSet = Sets.newHashSetWithExpectedSize(expectedSize);
    +    }
    +
    +    public ReplicaSet(Replicas replicas)
    +    {
    +        replicaSet = Sets.newHashSetWithExpectedSize(replicas.size());
    +        Iterables.addAll(replicaSet, replicas);
    +    }
    +
    +    private ReplicaSet(Set<Replica> replicaSet)
    +    {
    +        this.replicaSet = replicaSet;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaSet that = (ReplicaSet) o;
    +        return Objects.equals(replicaSet, that.replicaSet);
    +    }
    +
    +    public int hashCode()
    +    {
    +        return Objects.hash(replicaSet);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaSet.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaSet.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaSet, replicas);
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaSet.removeIf(r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaSet.remove(replica);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaSet.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaSet.iterator();
    +    }
    +
    +    public ReplicaSet differenceOnEndpoint(Replicas differenceOn)
    +    {
    +        if (Iterables.all(this, Replica::isFull) && Iterables.all(differenceOn, Replica::isFull))
    +        {
    +            Set<InetAddressAndPort> diffEndpoints = differenceOn.asEndpointSet();
    +            return new ReplicaSet(Replicas.filterOnEndpoints(this, e -> !diffEndpoints.contains(e)));
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +
    +    }
    +
    +    public static ReplicaSet immutableCopyOf(ReplicaSet from)
    +    {
    +        return new ReplicaSet(ImmutableSet.copyOf(from.replicaSet));
    +    }
    +
    +    public static ReplicaSet immutableCopyOf(Replicas from)
    +    {
    +        return new ReplicaSet(ImmutableSet.<Replica>builder().addAll(from).build());
    +    }
    +
    +    public static ReplicaSet ordered()
    --- End diff --
    
    I had to look this up to realize that "ordered" doesn't mean "sorted". It's the correct name: they are ordered, just not according to their natural order. Maybe the right shade for the bike shed here is "order preserving"?
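
The ordered-versus-sorted distinction can be seen with the two standard `Set` implementations. This is only an illustration: the body of `ReplicaSet.ordered()` is not shown in this excerpt, and the assumption that it is backed by the `LinkedHashSet` the file imports is a guess. The class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class OrderedVersusSortedSketch
{
    /** "Ordered": duplicates dropped, first-insertion order preserved. */
    static List<String> insertionOrder(List<String> endpoints)
    {
        return new ArrayList<>(new LinkedHashSet<>(endpoints));
    }

    /** "Sorted": duplicates dropped, elements arranged by their natural order. */
    static List<String> naturalOrder(List<String> endpoints)
    {
        return new ArrayList<>(new TreeSet<>(endpoints));
    }

    public static void main(String[] args)
    {
        List<String> endpoints = Arrays.asList("c", "a", "b", "a");
        System.out.println(insertionOrder(endpoints)); // prints [c, a, b]
        System.out.println(naturalOrder(endpoints));   // prints [a, b, c]
    }
}
```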


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188685170
  
    --- Diff: src/java/org/apache/cassandra/service/StorageService.java ---
    @@ -3863,17 +3863,12 @@ public void forceTerminateAllRepairSessions()
          *
          * @param keyspace keyspace name also known as keyspace
          * @param pos position for which we need to find the endpoint
    -     * @param liveEps the list of endpoints to mutate
          */
    -    public void getLiveNaturalEndpoints(Keyspace keyspace, RingPosition pos, List<InetAddressAndPort> liveEps)
    +    public ReplicaList getLiveNaturalReplicas(Keyspace keyspace, RingPosition pos)
         {
    -        List<InetAddressAndPort> endpoints = keyspace.getReplicationStrategy().getNaturalEndpoints(pos);
    +        ReplicaList replicas = keyspace.getReplicationStrategy().getNaturalReplicas(pos);
     
    -        for (InetAddressAndPort endpoint : endpoints)
    -        {
    -            if (FailureDetector.instance.isAlive(endpoint))
    -                liveEps.add(endpoint);
    -        }
    +        return replicas.filter(r -> FailureDetector.instance.isAlive(r.getEndpoint()));
    --- End diff --
    
    If `isAlive` were static, you wouldn't have to allocate a lambda here.
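
Whether a lambda is allocated on each call depends on what it captures and on the JVM, so one way to sidestep the question is to create the predicate once and keep it in a `static final` field, which is what `IAsyncCallback.isReplicaAlive` in the `StorageProxy` diff appears to be. A minimal sketch of that pattern follows, with a plain set of strings standing in for `FailureDetector.instance`. All names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

class SharedPredicateSketch
{
    /** Hypothetical stand-in for the failure detector's view of live endpoints. */
    static final Set<String> LIVE = new HashSet<>(Arrays.asList("10.0.0.1", "10.0.0.3"));

    /** Created once during class initialization; every caller reuses this object. */
    static final Predicate<String> IS_ALIVE = endpoint -> LIVE.contains(endpoint);

    /** Returns the live endpoints in their original order, without creating a predicate per call. */
    static List<String> liveOnly(List<String> endpoints)
    {
        List<String> result = new ArrayList<>(endpoints.size());
        for (String endpoint : endpoints)
        {
            if (IS_ALIVE.test(endpoint))
                result.add(endpoint);
        }
        return result;
    }

    public static void main(String[] args)
    {
        System.out.println(liveOnly(Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3")));
        // prints [10.0.0.1, 10.0.0.3]
    }
}
```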


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188747102
  
    --- Diff: src/java/org/apache/cassandra/locator/PendingRangeMaps.java ---
    @@ -23,196 +23,176 @@
     import com.google.common.collect.Iterators;
     import org.apache.cassandra.dht.Range;
     import org.apache.cassandra.dht.Token;
    -import org.slf4j.Logger;
    -import org.slf4j.LoggerFactory;
     
     import java.util.*;
     
    -public class PendingRangeMaps implements Iterable<Map.Entry<Range<Token>, List<InetAddressAndPort>>>
    +public class PendingRangeMaps implements Iterable<Map.Entry<Range<Token>, List<Replica>>>
    --- End diff --
    
    Should this be consistent and stick to ReplicaList so we don't accidentally emit List<Replica> into the rest of Cassandra?


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187429963
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicationFactor.java ---
    @@ -0,0 +1,121 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.Objects;
    +
    +import com.google.common.base.Preconditions;
    +
    +import org.apache.cassandra.config.DatabaseDescriptor;
    +
    +public class ReplicationFactor
    +{
    +    public static final ReplicationFactor ZERO = new ReplicationFactor(0);
    +
    +    public final int trans;
    +    public final int replicas;
    +    public transient final int full;
    --- End diff --
    
    Why is this field transient?
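For context on the question above: the `transient` modifier only affects Java serialization, so on a class that is never serialized it has no observable effect. The sketch below is a hypothetical `Rf` class, not the `ReplicationFactor` from the diff, and the `full = replicas - trans` derivation is an assumption made for the example. It shows that a `transient` field comes back as its default value after a serialization round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class TransientFieldSketch
{
    // Hypothetical class; field names loosely mirror the diff.
    static final class Rf implements Serializable
    {
        private static final long serialVersionUID = 1L;

        final int trans;
        final int replicas;
        transient final int full; // skipped by Java serialization

        Rf(int replicas, int trans)
        {
            this.replicas = replicas;
            this.trans = trans;
            this.full = replicas - trans; // assumed derivation, for illustration
        }
    }

    // Serializes and deserializes the value in memory.
    static Rf roundTrip(Rf rf)
    {
        try
        {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes))
            {
                out.writeObject(rf);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray())))
            {
                return (Rf) in.readObject();
            }
        }
        catch (IOException | ClassNotFoundException e)
        {
            throw new IllegalStateException(e);
        }
    }
}
```

After `roundTrip(new Rf(3, 1))`, `replicas` and `trans` keep their values while `full` is reset to 0, because deserialization does not run the constructor of a `Serializable` class.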


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189117060
  
    --- Diff: src/java/org/apache/cassandra/service/reads/DataResolver.java ---
    @@ -30,21 +36,23 @@
     import org.apache.cassandra.db.rows.UnfilteredRowIterator;
     import org.apache.cassandra.db.rows.UnfilteredRowIterators;
     import org.apache.cassandra.db.transform.*;
    -import org.apache.cassandra.locator.InetAddressAndPort;
     import org.apache.cassandra.net.*;
     import org.apache.cassandra.schema.TableMetadata;
    -import org.apache.cassandra.service.reads.repair.ReadRepair;
     
     public class DataResolver extends ResponseResolver
     {
         private final long queryStartNanoTime;
         private final boolean enforceStrictLiveness;
    +    private final Map<InetAddressAndPort, Replica> replicaMap;
     
    -    public DataResolver(Keyspace keyspace, ReadCommand command, ConsistencyLevel consistency, int maxResponseCount, long queryStartNanoTime, ReadRepair readRepair)
    +    public DataResolver(Keyspace keyspace, ReadCommand command, ConsistencyLevel consistency, Replicas replicas, int maxResponseCount, long queryStartNanoTime, ReadRepair readRepair)
         {
             super(keyspace, command, consistency, readRepair, maxResponseCount);
             this.queryStartNanoTime = queryStartNanoTime;
             this.enforceStrictLiveness = command.metadata().enforceStrictLiveness();
    +
    +        replicaMap = Maps.newHashMapWithExpectedSize(replicas.size());
    +        replicas.forEach(r -> replicaMap.put(r.getEndpoint(), r));
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189091069
  
    --- Diff: src/java/org/apache/cassandra/db/ConsistencyLevel.java ---
    @@ -190,50 +197,50 @@ public int countLocalEndpoints(Iterable<InetAddressAndPort> liveEndpoints)
              * the blockFor first ones).
              */
             if (isDCLocal)
    -            liveEndpoints.sort(DatabaseDescriptor.getLocalComparator());
    +            liveReplicas.sort(DatabaseDescriptor.getLocalComparator());
     
    -        return liveEndpoints.subList(0, Math.min(liveEndpoints.size(), blockFor(keyspace)));
    +        return liveReplicas.subList(0, Math.min(liveReplicas.size(), blockFor(keyspace)));
         }
     
    -    private List<InetAddressAndPort> filterForEachQuorum(Keyspace keyspace, List<InetAddressAndPort> liveEndpoints)
    +    private ReplicaList filterForEachQuorum(Keyspace keyspace, ReplicaList liveReplicas)
         {
             NetworkTopologyStrategy strategy = (NetworkTopologyStrategy) keyspace.getReplicationStrategy();
     
    -        Map<String, List<InetAddressAndPort>> dcsEndpoints = new HashMap<>();
    +        Map<String, ReplicaList> dcsReplicas = new HashMap<>();
             for (String dc: strategy.getDatacenters())
    -            dcsEndpoints.put(dc, new ArrayList<>());
    +            dcsReplicas.put(dc, new ReplicaList());
     
    -        for (InetAddressAndPort add : liveEndpoints)
    +        for (Replica replica : liveReplicas)
             {
    -            String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(add);
    -            dcsEndpoints.get(dc).add(add);
    +            String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(replica);
    +            dcsReplicas.get(dc).add(replica);
             }
     
    -        List<InetAddressAndPort> waitSet = new ArrayList<>();
    -        for (Map.Entry<String, List<InetAddressAndPort>> dcEndpoints : dcsEndpoints.entrySet())
    +        ReplicaList waitSet = new ReplicaList();
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189092447
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = new ArrayList<>(size());
    +        for (Replica replica: replicaList)
    +        {
    +            if (predicate.test(replica))
    +            {
    +                newReplicaList.add(replica);
    +            }
    +        }
    +        return new ReplicaList(newReplicaList);
    +    }
    +
    +    public void sort(Comparator<Replica> comparator)
    +    {
    +        replicaList.sort(comparator);
    +    }
    +
    +    public static ReplicaList intersectEndpoints(ReplicaList l1, ReplicaList l2)
    +    {
    +        Replicas.checkFull(l1);
    +        Replicas.checkFull(l2);
    +        // Note: we don't use Guava Sets.intersection() for 3 reasons:
    +        //   1) retainAll would be inefficient if l1 and l2 are large but in practice both are the replicas for a range and
    +        //   so will be very small (< RF). In that case, retainAll is in fact more efficient.
    +        //   2) we do ultimately need a list so converting everything to sets don't make sense
    +        //   3) l1 and l2 are sorted by proximity. The use of retainAll  maintain that sorting in the result, while using sets wouldn't.
    +        Collection<InetAddressAndPort> endpoints = l2.asEndpointList();
    +        return l1.filter(r -> endpoints.contains(r.getEndpoint()));
    +    }
    +
    +    public static ReplicaList of(Replica... replicas)
    +    {
    +        ReplicaList replicaList = new ReplicaList(replicas.length);
    +        for (Replica replica: replicas)
    +        {
    +            replicaList.add(replica);
    +        }
    +        return replicaList;
    +    }
    +
    +    public ReplicaList subList(int fromIndex, int toIndex)
    +    {
    +        return new ReplicaList(replicaList.subList(fromIndex, toIndex));
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        if (Iterables.all(this, Replica::isFull))
    +        {
    +            ReplicaList normalized = new ReplicaList(size());
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197136141
  
    --- Diff: src/java/org/apache/cassandra/service/StorageService.java ---
    @@ -4231,53 +4211,53 @@ private void calculateToFromStreams(Collection<Token> newTokens, List<String> ke
                 InetAddressAndPort localAddress = FBUtilities.getBroadcastAddressAndPort();
                 IEndpointSnitch snitch = DatabaseDescriptor.getEndpointSnitch();
                 TokenMetadata tokenMetaCloneAllSettled = tokenMetadata.cloneAfterAllSettled();
    -            // clone to avoid concurrent modification in calculateNaturalEndpoints
    +            // clone to avoid concurrent modification in calculateNaturalReplicas
                 TokenMetadata tokenMetaClone = tokenMetadata.cloneOnlyTokenMap();
     
                 for (String keyspace : keyspaceNames)
                 {
                     // replication strategy of the current keyspace
                     AbstractReplicationStrategy strategy = Keyspace.open(keyspace).getReplicationStrategy();
    -                Multimap<InetAddressAndPort, Range<Token>> endpointToRanges = strategy.getAddressRanges();
    +                ReplicaMultimap<InetAddressAndPort, ReplicaSet> endpointToRanges = strategy.getAddressReplicas();
     
                     logger.debug("Calculating ranges to stream and request for keyspace {}", keyspace);
                     for (Token newToken : newTokens)
                     {
                         // getting collection of the currently used ranges by this keyspace
    -                    Collection<Range<Token>> currentRanges = endpointToRanges.get(localAddress);
    +                    ReplicaSet currentReplicas = endpointToRanges.get(localAddress);
                         // collection of ranges which this node will serve after move to the new token
    -                    Collection<Range<Token>> updatedRanges = strategy.getPendingAddressRanges(tokenMetaClone, newToken, localAddress);
    +                    ReplicaSet updatedReplicas = strategy.getPendingAddressRanges(tokenMetaClone, newToken, localAddress);
     
                         // ring ranges and endpoints associated with them
                         // this used to determine what nodes should we ping about range data
    -                    Multimap<Range<Token>, InetAddressAndPort> rangeAddresses = strategy.getRangeAddresses(tokenMetaClone);
    +                    ReplicaMultimap<Range<Token>, ReplicaSet> rangeAddresses = strategy.getRangeAddresses(tokenMetaClone);
     
                         // calculated parts of the ranges to request/stream from/to nodes in the ring
    -                    Pair<Set<Range<Token>>, Set<Range<Token>>> rangesPerKeyspace = calculateStreamAndFetchRanges(currentRanges, updatedRanges);
    +                    Pair<Set<Range<Token>>, Set<Range<Token>>> rangesPerKeyspace = calculateStreamAndFetchRanges(currentReplicas, updatedReplicas);
     
                         /**
                          * In this loop we are going through all ranges "to fetch" and determining
                          * nodes in the ring responsible for data we are interested in
                          */
    -                    Multimap<Range<Token>, InetAddressAndPort> rangesToFetchWithPreferredEndpoints = ArrayListMultimap.create();
    +                    ReplicaMultimap<Range<Token>, ReplicaList> rangesToFetchWithPreferredEndpoints = ReplicaMultimap.list();
    --- End diff --
    
    We can iterate over `entrySet` rather than call `get` below.
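A small sketch of the pattern suggested above, using plain `java.util` maps rather than the `ReplicaMultimap` from the diff (whose API is not shown in this thread). The map contents and method names are made up for illustration. Iterating `entrySet()` yields the key and value together, so the per-key `get` lookup goes away.

```java
import java.util.List;
import java.util.Map;

final class EntrySetIterationSketch
{
    // keySet() plus get(): one extra lookup per key.
    static int countViaGet(Map<String, List<String>> rangeToEndpoints)
    {
        int total = 0;
        for (String range : rangeToEndpoints.keySet())
            total += rangeToEndpoints.get(range).size();
        return total;
    }

    // entrySet(): key and value arrive together, no second lookup.
    static int countViaEntrySet(Map<String, List<String>> rangeToEndpoints)
    {
        int total = 0;
        for (Map.Entry<String, List<String>> entry : rangeToEndpoints.entrySet())
            total += entry.getValue().size();
        return total;
    }
}
```

Both methods return the same total; the second simply avoids hashing each key a second time.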


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197136712
  
    --- Diff: src/java/org/apache/cassandra/service/reads/DataResolver.java ---
    @@ -64,12 +75,19 @@ public PartitionIterator resolve()
             // at the beginning of this method), so grab the response count once and use that through the method.
             int count = responses.size();
             List<UnfilteredPartitionIterator> iters = new ArrayList<>(count);
    -        InetAddressAndPort[] sources = new InetAddressAndPort[count];
    +        Replica[] sources = new Replica[count];
    --- End diff --
    
    This is the only use of an array (i.e. here and in the code that depends on it), and it seems it could be avoided in favour of a list.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197159962
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,270 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        this(new ArrayList<>());
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        this(new ArrayList<>(capacity));
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        this(new ArrayList<>(from.replicaList));
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        this(new ArrayList<>(from.size()));
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        this(new ArrayList<>(from));
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        Preconditions.checkNotNull(replica);
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    protected Collection<Replica> getUnmodifiableCollection()
    +    {
    +        return Collections.unmodifiableCollection(replicaList);
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        for (int i=replicaList.size()-1; i>=0; i--)
    +        {
    +            if (replicaList.get(i).getEndpoint().equals(endpoint))
    +            {
    +                replicaList.remove(i);
    +            }
    +        }
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    @Override
    +    public boolean containsEndpoint(InetAddressAndPort endpoint)
    +    {
    +        for (int i=0; i<size(); i++)
    +        {
    +            if (replicaList.get(i).getEndpoint().equals(endpoint))
    +                return true;
    +        }
    +        return false;
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = size() < 10 ? new ArrayList<>(size()) : new ArrayList<>();
    +        for (int i=0; i<size(); i++)
    +        {
    +            Replica replica = replicaList.get(i);
    +            if (predicate.test(replica))
    +            {
    +                newReplicaList.add(replica);
    +            }
    +        }
    +        return new ReplicaList(newReplicaList);
    +    }
    +
    +    public void sort(Comparator<Replica> comparator)
    +    {
    +        replicaList.sort(comparator);
    +    }
    +
    +    public static ReplicaList intersectEndpoints(ReplicaList l1, ReplicaList l2)
    +    {
    +        Replicas.checkFull(l1);
    +        Replicas.checkFull(l2);
    +        // Note: we don't use Guava Sets.intersection() for 3 reasons:
    +        //   1) retainAll would be inefficient if l1 and l2 are large but in practice both are the replicas for a range and
    +        //   so will be very small (< RF). In that case, retainAll is in fact more efficient.
    +        //   2) we do ultimately need a list so converting everything to sets don't make sense
    +        //   3) l1 and l2 are sorted by proximity. The use of retainAll  maintain that sorting in the result, while using sets wouldn't.
    +        Collection<InetAddressAndPort> endpoints = l2.asEndpointList();
    +        return l1.filter(r -> endpoints.contains(r.getEndpoint()));
    +    }
    +
    +    public static ReplicaList of()
    +    {
    +        return new ReplicaList(0);
    +    }
    +
    +    public static ReplicaList of(Replica replica)
    +    {
    +        ReplicaList replicaList = new ReplicaList(1);
    +        replicaList.add(replica);
    +        return replicaList;
    +    }
    +
    +    public static ReplicaList of(Replica... replicas)
    +    {
    +        ReplicaList replicaList = new ReplicaList(replicas.length);
    +        for (Replica replica: replicas)
    +        {
    +            replicaList.add(replica);
    +        }
    +        return replicaList;
    +    }
    +
    +    public ReplicaList subList(int fromIndex, int toIndex)
    +    {
    +        return new ReplicaList(replicaList.subList(fromIndex, toIndex));
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        if (Iterables.all(this, Replica::isFull))
    +        {
    +            ReplicaList normalized = new ReplicaList(size());
    +            for (Replica replica: this)
    +            {
    +                replica.addNormalizeByRange(normalized);
    --- End diff --
    
    I might be misunderstanding something, but wouldn't it be simpler to read if we called `replica.normalize()`, which would return a new `Replica` that we would then add to `normalized` ourselves here?
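To make the two API shapes in the question above concrete, here is a toy sketch. Everything in it is hypothetical: `Span` is an integer range invented for the example, and the assumption that normalizing a wrap-around range yields two pieces is mine, since the body of `addNormalizeByRange` is not shown in this thread. If normalization can produce more than one element, a "return a value" variant would need to return a collection rather than a single object.

```java
import java.util.ArrayList;
import java.util.List;

final class NormalizeShapeSketch
{
    // Toy (left, right] range over ints; left >= right is treated as wrap-around.
    static final class Span
    {
        final int left;
        final int right;

        Span(int left, int right)
        {
            this.left = left;
            this.right = right;
        }

        boolean wraps()
        {
            return left >= right;
        }

        // Shape 1: the callee appends its normalized pieces to the caller's list.
        void addNormalized(List<Span> out)
        {
            out.addAll(normalized());
        }

        // Shape 2: the callee returns the pieces and the caller adds them.
        List<Span> normalized()
        {
            List<Span> pieces = new ArrayList<>(2);
            if (wraps())
            {
                // Split the wrap-around range at the ends of the toy ring.
                pieces.add(new Span(left, Integer.MAX_VALUE));
                pieces.add(new Span(Integer.MIN_VALUE, right));
            }
            else
            {
                pieces.add(this);
            }
            return pieces;
        }
    }
}
```

Shape 2 tends to read more directly at the call site, at the cost of a small temporary list per element; shape 1 avoids that allocation but hides the mutation inside the callee.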


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187476007
  
    --- Diff: src/java/org/apache/cassandra/db/compaction/CompactionManager.java ---
    @@ -871,7 +868,7 @@ public void forceUserDefinedCleanup(String dataFiles)
             {
                 ColumnFamilyStore cfs = entry.getKey();
                 Keyspace keyspace = cfs.keyspace;
    -            Collection<Range<Token>> ranges = StorageService.instance.getLocalRanges(keyspace.getName());
    +            Collection<Range<Token>> ranges = StorageService.instance.getLocalReplicas(keyspace.getName()).asRangeSet();
    --- End diff --
    
    This one is also just iterated over.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197153506
  
    --- Diff: src/java/org/apache/cassandra/db/view/ViewUtils.java ---
    @@ -58,46 +57,55 @@ private ViewUtils()
          *
          * @return Optional.empty() if this method is called using a base token which does not belong to this replica
          */
    -    public static Optional<InetAddressAndPort> getViewNaturalEndpoint(String keyspaceName, Token baseToken, Token viewToken)
    +    public static Optional<Replica> getViewNaturalEndpoint(String keyspaceName, Token baseToken, Token viewToken)
         {
             AbstractReplicationStrategy replicationStrategy = Keyspace.open(keyspaceName).getReplicationStrategy();
     
             String localDataCenter = DatabaseDescriptor.getEndpointSnitch().getDatacenter(FBUtilities.getBroadcastAddressAndPort());
    -        List<InetAddressAndPort> baseEndpoints = new ArrayList<>();
    -        List<InetAddressAndPort> viewEndpoints = new ArrayList<>();
    -        for (InetAddressAndPort baseEndpoint : replicationStrategy.getNaturalEndpoints(baseToken))
    +        ReplicaList baseReplicas = new ReplicaList();
    +        ReplicaList viewReplicas = new ReplicaList();
    +        for (Replica baseEndpoint : replicationStrategy.getNaturalReplicas(baseToken))
             {
                 // An endpoint is local if we're not using Net
                 if (!(replicationStrategy instanceof NetworkTopologyStrategy) ||
                     DatabaseDescriptor.getEndpointSnitch().getDatacenter(baseEndpoint).equals(localDataCenter))
    -                baseEndpoints.add(baseEndpoint);
    +                baseReplicas.add(baseEndpoint);
             }
     
    -        for (InetAddressAndPort viewEndpoint : replicationStrategy.getNaturalEndpoints(viewToken))
    +        for (Replica viewEndpoint : replicationStrategy.getNaturalReplicas(viewToken))
             {
                 // If we are a base endpoint which is also a view replica, we use ourselves as our view replica
    -            if (viewEndpoint.equals(FBUtilities.getBroadcastAddressAndPort()))
    +            if (viewEndpoint.isLocal())
                     return Optional.of(viewEndpoint);
     
                 // We have to remove any endpoint which is shared between the base and the view, as it will select itself
                 // and throw off the counts otherwise.
    -            if (baseEndpoints.contains(viewEndpoint))
    -                baseEndpoints.remove(viewEndpoint);
    +            if (baseReplicas.containsEndpoint(viewEndpoint.getEndpoint()))
    +                baseReplicas.removeEndpoint(viewEndpoint.getEndpoint());
                 else if (!(replicationStrategy instanceof NetworkTopologyStrategy) ||
                          DatabaseDescriptor.getEndpointSnitch().getDatacenter(viewEndpoint).equals(localDataCenter))
    -                viewEndpoints.add(viewEndpoint);
    +                viewReplicas.add(viewEndpoint);
             }
     
             // The replication strategy will be the same for the base and the view, as they must belong to the same keyspace.
             // Since the same replication strategy is used, the same placement should be used and we should get the same
             // number of replicas for all of the tokens in the ring.
    -        assert baseEndpoints.size() == viewEndpoints.size() : "Replication strategy should have the same number of endpoints for the base and the view";
    -        int baseIdx = baseEndpoints.indexOf(FBUtilities.getBroadcastAddressAndPort());
    +        assert baseReplicas.size() == viewReplicas.size() : "Replication strategy should have the same number of endpoints for the base and the view";
    --- End diff --
    
    We could simplify this slightly by using set + iterator instead of `get` here: 
    
    ```diff
             String localDataCenter = DatabaseDescriptor.getEndpointSnitch().getDatacenter(FBUtilities.getBroadcastAddressAndPort());
    -        ReplicaList baseReplicas = new ReplicaList();
    -        ReplicaList viewReplicas = new ReplicaList();
    +        ReplicaSet baseReplicas = new ReplicaSet();
    +        ReplicaSet viewReplicas = new ReplicaSet();
    +        // We might add a method that filters natural endpoints by dc
             for (Replica baseEndpoint : replicationStrategy.getNaturalReplicas(baseToken))
             {
                 // An endpoint is local if we're not using Net
    @@ -92,20 +95,18 @@ public final class ViewUtils
             // number of replicas for all of the tokens in the ring.
             assert baseReplicas.size() == viewReplicas.size() : "Replication strategy should have the same number of endpoints for the base and the view";
    
    -        int baseIdx = -1;
    -        for (int i=0; i<baseReplicas.size(); i++)
    +        // we don't need "get" here; we can just walk both collections with iterators
    +        Iterator<Replica> baseReplicaIterator = baseReplicas.iterator();
    +        Iterator<Replica> viewReplicaIterator = viewReplicas.iterator();
    +        while (baseReplicaIterator.hasNext() && viewReplicaIterator.hasNext())
             {
    -            if (baseReplicas.get(i).isLocal())
    -            {
    -                baseIdx = i;
    -                break;
    -            }
    +            Replica baseReplica = baseReplicaIterator.next();
    +            Replica viewReplica = viewReplicaIterator.next();
    +            if (baseReplica.isLocal())
    +                return Optional.of(viewReplica);
             }
    
    -        if (baseIdx < 0)
    -            //This node is not a base replica of this key, so we return empty
    -            return Optional.empty();
    -
    -        return Optional.of(viewReplicas.get(baseIdx));
    +        //This node is not a base replica of this key, so we return empty
    +        return Optional.empty();
         }
     }
    ```
    
    One more reason not to use `Set` here is that `contains` would use iteration internally.
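
The paired-iterator idea from the suggestion above can be tried in isolation. The following is only a sketch under stated assumptions: `Node` is a hypothetical stand-in for `Replica`, and both collections are assumed to iterate in insertion order (a `LinkedHashSet` here). Whether `ReplicaSet` gives that ordering guarantee is not shown in this thread, and without it the positional pairing would not be meaningful.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

public class PairedLookup
{
    // Hypothetical stand-in for Replica: an endpoint name plus a "local" flag.
    static final class Node
    {
        final String endpoint;
        final boolean local;

        Node(String endpoint, boolean local)
        {
            this.endpoint = endpoint;
            this.local = local;
        }
    }

    // Walk both collections in lockstep and return the view entry that sits
    // at the same position as the local base entry, if there is one.
    static Optional<Node> pairedViewReplica(Set<Node> base, Set<Node> view)
    {
        Iterator<Node> baseIt = base.iterator();
        Iterator<Node> viewIt = view.iterator();
        while (baseIt.hasNext() && viewIt.hasNext())
        {
            Node b = baseIt.next();
            Node v = viewIt.next();
            if (b.local)
                return Optional.of(v);
        }
        // This node is not a base replica, so there is nothing to pair with.
        return Optional.empty();
    }

    static String demo()
    {
        // LinkedHashSet keeps insertion order, which the pairing relies on.
        Set<Node> base = new LinkedHashSet<>();
        base.add(new Node("base-a", false));
        base.add(new Node("base-b", true));
        Set<Node> view = new LinkedHashSet<>();
        view.add(new Node("view-a", false));
        view.add(new Node("view-b", false));
        return pairedViewReplica(base, view).map(n -> n.endpoint).orElse("none");
    }

    public static void main(String[] args)
    {
        System.out.println(demo());
    }
}
```

The early return removes the need for an index variable and a second lookup, which is the simplification the suggested diff is after.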


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187430210
  
    --- Diff: src/java/org/apache/cassandra/cql3/statements/AlterKeyspaceStatement.java ---
    @@ -96,7 +98,35 @@ private void warnIfIncreasingRF(KeyspaceMetadata ksm, KeyspaceParams params)
                                                                                                             StorageService.instance.getTokenMetadata(),
                                                                                                             DatabaseDescriptor.getEndpointSnitch(),
                                                                                                             params.replication.options);
    -        if (newStrategy.getReplicationFactor() > oldStrategy.getReplicationFactor())
    +
    +        validateTransientReplication(oldStrategy, newStrategy);
    +        warnIfIncreasingRF(oldStrategy, newStrategy);
    +    }
    +
    +    private void validateTransientReplication(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (oldStrategy.getReplicationFactor().trans == 0 && newStrategy.getReplicationFactor().trans > 0)
    +        {
    +            Keyspace ks = Keyspace.open(keyspace());
    +            for (ColumnFamilyStore cfs: ks.getColumnFamilyStores())
    +            {
    +                if (cfs.viewManager.hasViews())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using materialized views");
    +                }
    +
    +                if (cfs.indexManager.hasIndexes())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using secondary indexes");
    +                }
    +            }
    +
    +        }
    +    }
    +
    +    private void warnIfIncreasingRF(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (newStrategy.getReplicationFactor().full > oldStrategy.getReplicationFactor().full)
    --- End diff --
    
    Should this be the number of full replicas or the total number of replicas?
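
To make the question concrete, here is a minimal sketch of how the two checks can disagree. `Rf` is a hypothetical stand-in for whatever `getReplicationFactor()` returns in the diff; only the `full` and `trans` fields come from the quoted code, and `total()` is an assumed helper.

```java
public class RfCheck
{
    // Hypothetical stand-in for the replication factor type in the diff.
    static final class Rf
    {
        final int full;
        final int trans;

        Rf(int full, int trans)
        {
            this.full = full;
            this.trans = trans;
        }

        // Assumed helper: full plus transient replicas.
        int total()
        {
            return full + trans;
        }
    }

    // The check as written in the diff: compares full replicas only.
    static boolean increasesFull(Rf oldRf, Rf newRf)
    {
        return newRf.full > oldRf.full;
    }

    // The alternative raised in the question: compares all replicas.
    static boolean increasesTotal(Rf oldRf, Rf newRf)
    {
        return newRf.total() > oldRf.total();
    }

    public static void main(String[] args)
    {
        Rf before = new Rf(3, 0);
        Rf after = new Rf(3, 1);
        System.out.println(increasesFull(before, after) + " " + increasesTotal(before, after));
    }
}
```

Adding one transient replica while keeping the full count fixed trips the total-based check but not the full-based one, so the choice decides whether such a change produces the warning.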


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189117005
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -503,12 +498,12 @@ private static void sendCommit(Commit commit, Iterable<InetAddressAndPort> repli
                 MessagingService.instance().sendOneWay(message, target);
         }
     
    -    private static PrepareCallback preparePaxos(Commit toPrepare, List<InetAddressAndPort> endpoints, int requiredParticipants, ConsistencyLevel consistencyForPaxos, long queryStartNanoTime)
    +    private static PrepareCallback preparePaxos(Commit toPrepare, ReplicaList replicas, int requiredParticipants, ConsistencyLevel consistencyForPaxos, long queryStartNanoTime)
         throws WriteTimeoutException
         {
             PrepareCallback callback = new PrepareCallback(toPrepare.update.partitionKey(), toPrepare.update.metadata(), requiredParticipants, consistencyForPaxos, queryStartNanoTime);
             MessageOut<Commit> message = new MessageOut<Commit>(MessagingService.Verb.PAXOS_PREPARE, toPrepare, Commit.serializer);
    -        for (InetAddressAndPort target : endpoints)
    +        for (InetAddressAndPort target : replicas.asEndpoints())
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197135878
  
    --- Diff: src/java/org/apache/cassandra/service/StorageService.java ---
    @@ -3788,7 +3789,10 @@ public void forceTerminateAllRepairSessions()
             if (metadata == null)
                 throw new IllegalArgumentException("Unknown table '" + cf + "' in keyspace '" + keyspaceName + "'");
     
    -        return getNaturalEndpoints(keyspaceName, tokenMetadata.partitioner.getToken(metadata.partitionKeyType.fromString(key))).stream().map(i -> i.address).collect(toList());
    +        ReplicaList replicas = getNaturalReplicas(keyspaceName, tokenMetadata.partitioner.getToken(metadata.partitionKeyType.fromString(key)));
    --- End diff --
    
    We could use `replicas.asEndpointsList` and return a list of `InetAddressAndPort`.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197126709
  
    --- Diff: src/java/org/apache/cassandra/dht/RangeStreamer.java ---
    @@ -259,36 +266,36 @@ private boolean useStrictSourcesForRanges(String keyspaceName)
          *
          * @throws java.lang.IllegalStateException when there is no source to get data streamed, or more than 1 source found.
          */
    -    private Multimap<Range<Token>, InetAddressAndPort> getAllRangesWithStrictSourcesFor(String keyspace, Collection<Range<Token>> desiredRanges)
    +    private ReplicaMultimap<Range<Token>, ReplicaList> getAllRangesWithStrictSourcesFor(String keyspace, Iterable<Range<Token>> desiredRanges)
         {
             assert tokens != null;
             AbstractReplicationStrategy strat = Keyspace.open(keyspace).getReplicationStrategy();
     
             // Active ranges
             TokenMetadata metadataClone = metadata.cloneOnlyTokenMap();
    -        Multimap<Range<Token>, InetAddressAndPort> addressRanges = strat.getRangeAddresses(metadataClone);
    +        ReplicaMultimap<Range<Token>, ReplicaSet> addressRanges = strat.getRangeAddresses(metadataClone);
     
             // Pending ranges
             metadataClone.updateNormalTokens(tokens, address);
    -        Multimap<Range<Token>, InetAddressAndPort> pendingRangeAddresses = strat.getRangeAddresses(metadataClone);
    +        ReplicaMultimap<Range<Token>, ReplicaSet> pendingRangeAddresses = strat.getRangeAddresses(metadataClone);
     
             // Collects the source that will have its range moved to the new node
    -        Multimap<Range<Token>, InetAddressAndPort> rangeSources = ArrayListMultimap.create();
    +        ReplicaMultimap<Range<Token>, ReplicaList> rangeSources = ReplicaMultimap.list();
     
             for (Range<Token> desiredRange : desiredRanges)
             {
    -            for (Map.Entry<Range<Token>, Collection<InetAddressAndPort>> preEntry : addressRanges.asMap().entrySet())
    +            for (Map.Entry<Range<Token>, ReplicaSet> preEntry : addressRanges.asMap().entrySet())
                 {
                     if (preEntry.getKey().contains(desiredRange))
                     {
    -                    Set<InetAddressAndPort> oldEndpoints = Sets.newHashSet(preEntry.getValue());
    --- End diff --
    
    Here, we're creating a new `Set`, then removing things from it. We could do the equivalent (and avoid the copy of `newEndpoints`) by just running `filter` on `preEntry.getValue()`.
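
A rough illustration of the difference, using plain strings in place of endpoints. The removal step itself is not part of the quoted hunk, so this sketch assumes that "removing things" means dropping every member of `newEndpoints`; the method names are hypothetical.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class FilterInsteadOfCopy
{
    // Copy-then-remove: two temporary sets are built before anything is dropped.
    static Set<String> copyThenRemove(Collection<String> oldEndpoints, Collection<String> newEndpoints)
    {
        Set<String> result = new HashSet<>(oldEndpoints);
        result.removeAll(new HashSet<>(newEndpoints));
        return result;
    }

    // Filter: a single pass over the old endpoints, with no copy of newEndpoints.
    static Set<String> filtered(Collection<String> oldEndpoints, Collection<String> newEndpoints)
    {
        return oldEndpoints.stream()
                           .filter(e -> !newEndpoints.contains(e))
                           .collect(Collectors.toSet());
    }

    public static void main(String[] args)
    {
        Set<String> oldEndpoints = new HashSet<>();
        oldEndpoints.add("a");
        oldEndpoints.add("b");
        oldEndpoints.add("c");
        Set<String> newEndpoints = new HashSet<>();
        newEndpoints.add("b");
        System.out.println(filtered(oldEndpoints, newEndpoints).equals(copyThenRemove(oldEndpoints, newEndpoints)));
    }
}
```

Both methods return the same set. The filtered form calls `contains` on `newEndpoints` once per element, so it is only cheaper when that lookup is inexpensive for the collection in question.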


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187443155
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        Replica replica = (Replica) o;
    +        return full == replica.full &&
    +               Objects.equals(endpoint, replica.endpoint) &&
    +               Objects.equals(range, replica.range);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(endpoint, range, full);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        StringBuilder sb = new StringBuilder();
    +        sb.append(full ? "Full" : "Transient");
    +        sb.append('(').append(getEndpoint()).append(',').append(range).append(')');
    +        return sb.toString();
    +    }
    +
    +    public final InetAddressAndPort getEndpoint()
    +    {
    +        return endpoint;
    +    }
    +
    +    public Range<Token> getRange()
    +    {
    +        return range;
    +    }
    +
    +    public boolean isFull()
    +    {
    +        return full;
    +    }
    +
    +    public final boolean isTransient()
    +    {
    +        return !isFull();
    +    }
    +
    +    public ReplicaSet subtract(Replica that)
    +    {
    +        assert isFull() && that.isFull();  // FIXME: this
    +        Set<Range<Token>> ranges = range.subtract(that.range);
    +        ReplicaSet replicatedRanges = new ReplicaSet(ranges.size());
    +        for (Range<Token> range: ranges)
    +        {
    +            replicatedRanges.add(new Replica(getEndpoint(), range, isFull()));
    +        }
    +        return replicatedRanges;
    +    }
    +
    +    /**
    +     * Subtract the ranges of the given replicas from the range of this replica,
    +     * returning a set of replicas with the endpoint and transient information of
    +     * this replica, and the ranges resulting from the subtraction.
    +     */
    +    public ReplicaSet subtractByRange(Replicas toSubtract)
    +    {
    +        if (isFull() && Iterables.all(toSubtract, Replica::isFull))
    +        {
    +            Set<Range<Token>> subtractedRanges = getRange().subtractAll(toSubtract.asRangeSet());
    +            ReplicaSet replicaSet = new ReplicaSet(subtractedRanges.size());
    +            for (Range<Token> range: subtractedRanges)
    +            {
    +                replicaSet.add(new Replica(getEndpoint(), range, isFull()));
    +            }
    +            return replicaSet;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        List<Range<Token>> normalized = Range.normalize(Collections.singleton(getRange()));
    +        ReplicaList replicas = new ReplicaList(normalized.size());
    +        for (Range<Token> normalizedRange: normalized)
    +        {
    +            replicas.add(new Replica(getEndpoint(), normalizedRange, isFull()));
    +        }
    +        return replicas;
    +    }
    +
    +    public boolean contains(Range<Token> that)
    +    {
    +        return getRange().contains(that);
    +    }
    +
    +    public boolean intersectsOnRange(Replica replica)
    +    {
    +        return getRange().intersects(replica.getRange());
    +    }
    +
    +    public Replica decorateSubrange(Range<Token> subrange)
    +    {
    +        Preconditions.checkArgument(range.contains(subrange));
    +        return new Replica(getEndpoint(), subrange, isFull());
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, true);
    +    }
    +
    +    /**
    +     * We need to assume an endpoint is a full replica with unknown ranges in a
    +     * few cases, so this returns one that throws an exception if you try to get its range
    +     */
    +    public static Replica fullStandin(InetAddressAndPort endpoint)
    --- End diff --
    
    This feels so unsafe. At the first level it's turning a compile time error into a runtime error. But then it's turning the fullness or not of the replica into a silent error. Do things work if you have isFull and isTransient throw as well?
    
    Another issue is this is going to make isFull() and isTransient() bi-morphic which isn't the best. Feels like it might be premature optimization to worry about that right now though.
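
The concern can be reproduced with a small stand-alone sketch. `Rep` is a hypothetical, simplified stand-in for `Replica` (strings instead of `InetAddressAndPort` and `Range<Token>`); the anonymous-subclass pattern mirrors the one used by `fullStandin` in this pull request.

```java
public class StandinDemo
{
    // Hypothetical, simplified stand-in for Replica.
    static class Rep
    {
        private final String endpoint;
        private final String range;
        private final boolean full;

        Rep(String endpoint, String range, boolean full)
        {
            this.endpoint = endpoint;
            this.range = range;
            this.full = full;
        }

        String getRange()
        {
            return range;
        }

        boolean isFull()
        {
            return full;
        }
    }

    // Same shape as fullStandin: the range is unknown, so reading it throws.
    static Rep fullStandin(String endpoint)
    {
        return new Rep(endpoint, null, true)
        {
            @Override
            String getRange()
            {
                throw new UnsupportedOperationException("Can't get range on standin replicas");
            }
        };
    }

    // Helper for the demonstration: reports whether reading the range fails.
    static boolean rangeAccessThrows(Rep rep)
    {
        try
        {
            rep.getRange();
            return false;
        }
        catch (UnsupportedOperationException e)
        {
            return true;
        }
    }

    public static void main(String[] args)
    {
        Rep standin = fullStandin("10.0.0.1");
        System.out.println(rangeAccessThrows(standin) + " " + standin.isFull());
    }
}
```

The range access fails only when the code path is actually exercised, while `isFull()` quietly answers `true` even though the fullness was assumed rather than known. Those are the two failure modes the comment describes.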


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187668202
  
    --- Diff: src/java/org/apache/cassandra/dht/RangeFetchMapCalculator.java ---
    @@ -158,14 +159,15 @@ static boolean isTrivial(Range<Token> range)
                 boolean localDCCheck = true;
                 while (!added)
                 {
    -                List<InetAddressAndPort> srcs = new ArrayList<>(rangesWithSources.get(trivialRange));
    +                ReplicaList replicas = new ReplicaList(rangesWithSources.get(trivialRange));
                     // sort with the endpoint having the least number of streams first:
    -                srcs.sort(Comparator.comparingInt(o -> optimisedMap.get(o).size()));
    -                for (InetAddressAndPort src : srcs)
    +                replicas.sort(Comparator.comparingInt(o -> optimisedMap.get(o.getEndpoint()).size()));
    +                Replicas.checkFull(replicas);
    +                for (Replica replica : replicas)
                     {
    -                    if (passFilters(src, localDCCheck))
    +                    if (passFilters(replica, localDCCheck))
                         {
    -                        fetchMap.put(src, trivialRange);
    +                        fetchMap.put(replica.getEndpoint(), trivialRange);
    --- End diff --
    
    So this is something I have been running into on my end, but why unwrap here? I have needed to fix that because I need the transientness of what I am fetching for to determine whether the remote side is going to send me the transient data or the full data.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188452249
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -1364,68 +1363,72 @@ public static void sendToHintedEndpoints(final Mutation mutation,
                 submitHint(mutation, endpointsToHint, responseHandler);
     
             if (insertLocal)
    -            performLocally(stage, Optional.of(mutation), mutation::apply, responseHandler);
    +        {
    +            Preconditions.checkNotNull(localReplica);
    +            performLocally(stage, localReplica, Optional.of(mutation), mutation::apply, responseHandler);
    +        }
     
             if (localDc != null)
             {
    -            for (InetAddressAndPort destination : localDc)
    -                MessagingService.instance().sendRR(message, destination, responseHandler, true);
    +            for (Replica destination : localDc)
    +                MessagingService.instance().sendWriteRR(message, destination, responseHandler, true);
             }
             if (dcGroups != null)
             {
                 // for each datacenter, send the message to one node to relay the write to other replicas
    -            for (Collection<InetAddressAndPort> dcTargets : dcGroups.values())
    +            for (Replicas dcTargets : dcGroups.values())
                     sendMessagesToNonlocalDC(message, dcTargets, responseHandler);
             }
         }
     
    -    private static void checkHintOverload(InetAddressAndPort destination)
    +    private static void checkHintOverload(Replica destination)
         {
             // avoid OOMing due to excess hints.  we need to do this check even for "live" nodes, since we can
             // still generate hints for those if it's overloaded or simply dead but not yet known-to-be-dead.
             // The idea is that if we have over maxHintsInProgress hints in flight, this is probably due to
             // a small number of nodes causing problems, so we should avoid shutting down writes completely to
             // healthy nodes.  Any node with no hintsInProgress is considered healthy.
             if (StorageMetrics.totalHintsInProgress.getCount() > maxHintsInProgress
    -                && (getHintsInProgressFor(destination).get() > 0 && shouldHint(destination)))
    +                && (getHintsInProgressFor(destination.getEndpoint()).get() > 0 && shouldHint(destination)))
             {
                 throw new OverloadedException("Too many in flight hints: " + StorageMetrics.totalHintsInProgress.getCount() +
                                               " destination: " + destination +
    -                                          " destination hints: " + getHintsInProgressFor(destination).get());
    +                                          " destination hints: " + getHintsInProgressFor(destination.getEndpoint()).get());
             }
         }
     
         private static void sendMessagesToNonlocalDC(MessageOut<? extends IMutation> message,
    -                                                 Collection<InetAddressAndPort> targets,
    +                                                 Replicas targets,
                                                      AbstractWriteResponseHandler<IMutation> handler)
         {
    -        Iterator<InetAddressAndPort> iter = targets.iterator();
    +        Iterator<Replica> iter = targets.iterator();
             int[] messageIds = new int[targets.size()];
    -        InetAddressAndPort target = iter.next();
    +        Replica target = iter.next();
     
             int idIdx = 0;
             // Add the other destinations of the same message as a FORWARD_HEADER entry
             while (iter.hasNext())
             {
    -            InetAddressAndPort destination = iter.next();
    -            int id = MessagingService.instance().addCallback(handler,
    -                                                             message,
    -                                                             destination,
    -                                                             message.getTimeout(),
    -                                                             handler.consistencyLevel,
    -                                                             true);
    +            Replica destination = iter.next();
    +            int id = MessagingService.instance().addWriteCallback(handler,
    +                                                                  message,
    +                                                                  destination,
    +                                                                  message.getTimeout(),
    +                                                                  handler.consistencyLevel,
    +                                                                  true);
                 messageIds[idIdx++] = id;
                 logger.trace("Adding FWD message to {}@{}", id, destination);
             }
    -        message = message.withParameter(ParameterType.FORWARD_TO.FORWARD_TO, new ForwardToContainer(targets, messageIds));
    +        Replicas.checkFull(targets);
    +        message = message.withParameter(ParameterType.FORWARD_TO.FORWARD_TO, new ForwardToContainer(targets.asEndpointList(), messageIds));
    --- End diff --
    
    A candidate for lazily converting.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188107704
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        Replica replica = (Replica) o;
    +        return full == replica.full &&
    +               Objects.equals(endpoint, replica.endpoint) &&
    +               Objects.equals(range, replica.range);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(endpoint, range, full);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        StringBuilder sb = new StringBuilder();
    +        sb.append(full ? "Full" : "Transient");
    +        sb.append('(').append(getEndpoint()).append(',').append(range).append(')');
    +        return sb.toString();
    +    }
    +
    +    public final InetAddressAndPort getEndpoint()
    +    {
    +        return endpoint;
    +    }
    +
    +    public Range<Token> getRange()
    +    {
    +        return range;
    +    }
    +
    +    public boolean isFull()
    +    {
    +        return full;
    +    }
    +
    +    public final boolean isTransient()
    +    {
    +        return !isFull();
    +    }
    +
    +    public ReplicaSet subtract(Replica that)
    +    {
    +        assert isFull() && that.isFull();  // FIXME: this
    +        Set<Range<Token>> ranges = range.subtract(that.range);
    +        ReplicaSet replicatedRanges = new ReplicaSet(ranges.size());
    +        for (Range<Token> range: ranges)
    +        {
    +            replicatedRanges.add(new Replica(getEndpoint(), range, isFull()));
    +        }
    +        return replicatedRanges;
    +    }
    +
    +    /**
    +     * Subtract the ranges of the given replicas from the range of this replica,
    +     * returning a set of replicas with the endpoint and transient information of
    +     * this replica, and the ranges resulting from the subtraction.
    +     */
    +    public ReplicaSet subtractByRange(Replicas toSubtract)
    +    {
    +        if (isFull() && Iterables.all(toSubtract, Replica::isFull))
    +        {
    +            Set<Range<Token>> subtractedRanges = getRange().subtractAll(toSubtract.asRangeSet());
    +            ReplicaSet replicaSet = new ReplicaSet(subtractedRanges.size());
    +            for (Range<Token> range: subtractedRanges)
    +            {
    +                replicaSet.add(new Replica(getEndpoint(), range, isFull()));
    +            }
    +            return replicaSet;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        List<Range<Token>> normalized = Range.normalize(Collections.singleton(getRange()));
    +        ReplicaList replicas = new ReplicaList(normalized.size());
    +        for (Range<Token> normalizedRange: normalized)
    +        {
    +            replicas.add(new Replica(getEndpoint(), normalizedRange, isFull()));
    +        }
    +        return replicas;
    +    }
    +
    +    public boolean contains(Range<Token> that)
    +    {
    +        return getRange().contains(that);
    +    }
    +
    +    public boolean intersectsOnRange(Replica replica)
    +    {
    +        return getRange().intersects(replica.getRange());
    +    }
    +
    +    public Replica decorateSubrange(Range<Token> subrange)
    +    {
    +        Preconditions.checkArgument(range.contains(subrange));
    +        return new Replica(getEndpoint(), subrange, isFull());
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, true);
    +    }
    +
    +    /**
    +     * We need to assume an endpoint is a full replica with unknown ranges in a
    +     * few cases, so this returns one that throws an exception if you try to get its range
    +     */
    +    public static Replica fullStandin(InetAddressAndPort endpoint)
    +    {
    +        return new Replica(endpoint, null, true) {
    +            @Override
    +            public Range<Token> getRange()
    +            {
    +                throw new UnsupportedOperationException("Can't get range on standin replicas");
    +            }
    +        };
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Token start, Token end)
    +    {
    +        return full(endpoint, new Range<>(start, end));
    +    }
    +
    +    public static Replica trans(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, false);
    +    }
    +
    +    public static Replica trans(InetAddressAndPort endpoint, Token start, Token end)
    +    {
    +        return trans(endpoint, new Range<>(start, end));
    +    }
    +
    +    public static void assureSufficientFullReplica(Collection<Replica> replicas, ConsistencyLevel cl) throws UnavailableException
    +    {
    +        if (!Iterables.any(replicas, Replica::isFull))
    +        {
    +            throw new UnavailableException(cl, "At least one full replica required", 1, 0);
    +        }
    +    }
    +
    +    public static Iterable<InetAddressAndPort> toEndpoints(Iterable<Replica> replicas)
    --- End diff --
    
    I think all of these collection-oriented methods should move into Replicas. I also think we should make the lazy vs. eager transformation options explicit, and generally offer both, so that callers can pick the right one.
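
    A minimal sketch of the lazy vs. eager distinction, using only the JDK. The class and method names (`Transformations`, `lazyTransform`, `eagerTransform`) are hypothetical and are not part of the patch; the real methods would operate on Replica and endpoint types.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not code from the patch.
final class Transformations
{
    // Lazy: nothing is copied. Each element is mapped while iterating,
    // so later changes to the source are visible through the view.
    static <A, B> Iterable<B> lazyTransform(Iterable<A> source, Function<A, B> fn)
    {
        return () -> new Iterator<B>()
        {
            private final Iterator<A> it = source.iterator();

            @Override
            public boolean hasNext()
            {
                return it.hasNext();
            }

            @Override
            public B next()
            {
                return fn.apply(it.next());
            }
        };
    }

    // Eager: every element is mapped now and stored in an independent list.
    static <A, B> List<B> eagerTransform(Iterable<A> source, Function<A, B> fn)
    {
        List<B> out = new ArrayList<>();
        for (A a : source)
            out.add(fn.apply(a));
        return out;
    }
}
```

    The lazy form avoids an allocation when the result is iterated once; the eager form is the safer choice when the source may change or the result is traversed repeatedly.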


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r192505398
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = new ArrayList<>(size());
    +        for (Replica replica: replicaList)
    +        {
    +            if (predicate.test(replica))
    +            {
    +                newReplicaList.add(replica);
    +            }
    +        }
    +        return new ReplicaList(newReplicaList);
    +    }
    +
    +    public void sort(Comparator<Replica> comparator)
    +    {
    +        replicaList.sort(comparator);
    +    }
    +
    +    public static ReplicaList intersectEndpoints(ReplicaList l1, ReplicaList l2)
    +    {
    +        Replicas.checkFull(l1);
    +        Replicas.checkFull(l2);
    +        // Note: we don't use Guava Sets.intersection() for 3 reasons:
    +        //   1) retainAll would be inefficient if l1 and l2 are large but in practice both are the replicas for a range and
    +        //   so will be very small (< RF). In that case, retainAll is in fact more efficient.
    +        //   2) we do ultimately need a list so converting everything to sets doesn't make sense
    +        //   3) l1 and l2 are sorted by proximity. The use of retainAll maintains that sorting in the result, while using sets wouldn't.
    +        Collection<InetAddressAndPort> endpoints = l2.asEndpointList();
    +        return l1.filter(r -> endpoints.contains(r.getEndpoint()));
    +    }
    +
    +    public static ReplicaList of(Replica... replicas)
    --- End diff --
    
    Hmm, so I thought a specialization would be possible, but since it's mutable you have to allocate a real list. So really I think it's fine to have of() allocate the wrapper array. So of() could just return a global immutable one, and the singleton one could use a singleton list wrapper.
    
    TL;DR this is not important enough to use up more of your time. You can delete the specializations.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189116971
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197130101
  
    --- Diff: src/java/org/apache/cassandra/locator/OldNetworkTopologyStrategy.java ---
    @@ -36,27 +36,30 @@
      */
     public class OldNetworkTopologyStrategy extends AbstractReplicationStrategy
     {
    +    private final ReplicationFactor rf;
         public OldNetworkTopologyStrategy(String keyspaceName, TokenMetadata tokenMetadata, IEndpointSnitch snitch, Map<String, String> configOptions)
         {
             super(keyspaceName, tokenMetadata, snitch, configOptions);
    +        this.rf = ReplicationFactor.fromString(this.configOptions.get("replication_factor"));
         }
     
    -    public List<InetAddressAndPort> calculateNaturalEndpoints(Token token, TokenMetadata metadata)
    +    public ReplicaList calculateNaturalReplicas(Token token, TokenMetadata metadata)
         {
    -        int replicas = getReplicationFactor();
    -        List<InetAddressAndPort> endpoints = new ArrayList<>(replicas);
    +        ReplicaList replicas = new ReplicaList(rf.replicas);
             ArrayList<Token> tokens = metadata.sortedTokens();
     
             if (tokens.isEmpty())
    -            return endpoints;
    +            return replicas;
     
             Iterator<Token> iter = TokenMetadata.ringIterator(tokens, token, false);
             Token primaryToken = iter.next();
    -        endpoints.add(metadata.getEndpoint(primaryToken));
    +        Token previousToken = metadata.getPredecessor(primaryToken);
    +        assert rf.trans == 0: "support transient replicas";
    --- End diff --
    
    Should we move the assert to the top?


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187474479
  
    --- Diff: src/java/org/apache/cassandra/db/ReadCommand.java ---
    @@ -128,6 +130,7 @@ protected ReadCommand(Kind kind,
     
         protected abstract void serializeSelection(DataOutputPlus out, int version) throws IOException;
         protected abstract long selectionSerializedSize(int version);
    +    public abstract Replica decorateEndpoint(InetAddressAndPort endpoint);
    --- End diff --
    
    I'm not sure this is a good idea, since it is used after we have already committed to that endpoint replicating a specific range in a specific way. It seems like it races with ring changes.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197131214
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicationFactor.java ---
    @@ -0,0 +1,121 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.Objects;
    +
    +import com.google.common.base.Preconditions;
    +
    +import org.apache.cassandra.config.DatabaseDescriptor;
    +
    +public class ReplicationFactor
    +{
    +    public static final ReplicationFactor ZERO = new ReplicationFactor(0);
    +
    +    public final int trans;
    +    public final int replicas;
    +    public transient final int full;
    +
    +    private ReplicationFactor(int replicas, int trans)
    +    {
    +        validate(replicas, trans);
    +        this.replicas = replicas;
    +        this.trans = trans;
    +        this.full = replicas - trans;
    +    }
    +
    +    private ReplicationFactor(int replicas)
    +    {
    +        this(replicas, 0);
    +    }
    +
    +    static void validate(int replicas, int trans)
    +    {
    +        Preconditions.checkArgument(trans == 0 || DatabaseDescriptor.isTransientReplicationEnabled(),
    +                                    "Transient replication is not enabled on this node");
    +        Preconditions.checkArgument(replicas >= 0,
    --- End diff --
    
    To my understanding, the replication factor has to be strictly positive (i.e. a minimum of 1).


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r198265430
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -50,6 +50,30 @@
         public abstract int size();
         protected abstract Collection<Replica> getUnmodifiableCollection();
     
    +
    +    public boolean equals(Object o)
    --- End diff --
    
    > If you look at how java.util.Collections and Guava do it you can't actually construct something that isn't either a set or a list.
    
    The objects returned by com.google.common.collect.Iterables (which these methods are imitating) don't define equals or hashCode. 
    
    The issue here might be that I'm thinking of these objects as iterables, and you seem to be thinking of them as collections. Maybe this is more of a naming issue?
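
    A small sketch of the difference being discussed, using only the JDK. `IterableEquality.view` is a hypothetical stand-in for an Iterables-style view: it inherits `Object.equals`, so two views over equal lists are not equal, while the lists themselves compare by content.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration, not code from the patch.
final class IterableEquality
{
    // A bare Iterable view over a list. It does not override equals or
    // hashCode, so comparison falls back to object identity.
    static <T> Iterable<T> view(List<T> backing)
    {
        return new Iterable<T>()
        {
            @Override
            public Iterator<T> iterator()
            {
                return backing.iterator();
            }
        };
    }
}
```

    If Replicas is meant to behave like a collection, it needs content-based equals and hashCode as List and Set have; if it is meant to behave like an iterable view, identity equality is the conventional choice.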


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189091280
  
    --- Diff: src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java ---
    @@ -202,61 +204,63 @@ private Keyspace getKeyspace()
          *
          * @return the replication factor
          */
    -    public abstract int getReplicationFactor();
    +    public abstract ReplicationFactor getReplicationFactor();
     
         /*
          * NOTE: this is pretty inefficient. also the inverse (getRangeAddresses) below.
          * this is fine as long as we don't use this on any critical path.
          * (fixing this would probably require merging tokenmetadata into replicationstrategy,
          * so we could cache/invalidate cleanly.)
          */
    -    public Multimap<InetAddressAndPort, Range<Token>> getAddressRanges(TokenMetadata metadata)
    +    public ReplicaMultimap<InetAddressAndPort, ReplicaSet> getAddressReplicas(TokenMetadata metadata)
         {
    -        Multimap<InetAddressAndPort, Range<Token>> map = HashMultimap.create();
    +        ReplicaMultimap<InetAddressAndPort, ReplicaSet> map = ReplicaMultimap.set();
     
             for (Token token : metadata.sortedTokens())
             {
                 Range<Token> range = metadata.getPrimaryRangeFor(token);
    -            for (InetAddressAndPort ep : calculateNaturalEndpoints(token, metadata))
    +            for (Replica replica : calculateNaturalReplicas(token, metadata))
                 {
    -                map.put(ep, range);
    +                Preconditions.checkState(range.equals(replica.getRange()) || this instanceof LocalStrategy);
    --- End diff --
    
    added comment


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r198592967
  
    --- Diff: src/java/org/apache/cassandra/locator/OldNetworkTopologyStrategy.java ---
    @@ -36,27 +36,30 @@
      */
     public class OldNetworkTopologyStrategy extends AbstractReplicationStrategy
     {
    +    private final ReplicationFactor rf;
         public OldNetworkTopologyStrategy(String keyspaceName, TokenMetadata tokenMetadata, IEndpointSnitch snitch, Map<String, String> configOptions)
         {
             super(keyspaceName, tokenMetadata, snitch, configOptions);
    +        this.rf = ReplicationFactor.fromString(this.configOptions.get("replication_factor"));
         }
     
    -    public List<InetAddressAndPort> calculateNaturalEndpoints(Token token, TokenMetadata metadata)
    +    public ReplicaList calculateNaturalReplicas(Token token, TokenMetadata metadata)
         {
    -        int replicas = getReplicationFactor();
    -        List<InetAddressAndPort> endpoints = new ArrayList<>(replicas);
    +        ReplicaList replicas = new ReplicaList(rf.replicas);
             ArrayList<Token> tokens = metadata.sortedTokens();
     
             if (tokens.isEmpty())
    -            return endpoints;
    +            return replicas;
     
             Iterator<Token> iter = TokenMetadata.ringIterator(tokens, token, false);
             Token primaryToken = iter.next();
    -        endpoints.add(metadata.getEndpoint(primaryToken));
    +        Token previousToken = metadata.getPredecessor(primaryToken);
    +        assert rf.trans == 0: "support transient replicas";
    --- End diff --
    
    Also, if it's not supported, should it be possible to disable the check?


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187442479
  
    --- Diff: src/java/org/apache/cassandra/service/reads/DataResolver.java ---
    @@ -64,12 +72,19 @@ public PartitionIterator resolve()
             // at the beginning of this method), so grab the response count once and use that through the method.
             int count = responses.size();
             List<UnfilteredPartitionIterator> iters = new ArrayList<>(count);
    -        InetAddressAndPort[] sources = new InetAddressAndPort[count];
    +        Replica[] sources = new Replica[count];
             for (int i = 0; i < count; i++)
             {
                 MessageIn<ReadResponse> msg = responses.get(i);
                 iters.add(msg.payload.makeIterator(command));
    -            sources[i] = msg.from;
    +
    +            Replica replica = replicaMap.get(msg.from);
    +            if (replica == null)
    +                replica = command.decorateEndpoint(msg.from);
    +            if (replica == null)
    +                replica = Replica.fullStandin(msg.from);
    +
    +            sources[i] = replica != null ? replica : Replica.fullStandin(msg.from);
    --- End diff --
    
    Seems like you doubled up on the stand-in?
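
    For illustration, the fallback chain in the diff can be written so that each fallback is tried exactly once. This is a generic sketch with hypothetical names (`ReplicaLookup.resolve`, `decorate`, `standin`); it is not the patch's code and uses plain type parameters instead of Replica and InetAddressAndPort.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration, not code from the patch.
final class ReplicaLookup
{
    // Look up the known value first, then try the decorator, then fall back
    // to the stand-in. No trailing null check is needed after the last step,
    // assuming the stand-in factory never returns null.
    static <K, V> V resolve(K from, Map<K, V> known, Function<K, V> decorate, Function<K, V> standin)
    {
        V value = known.get(from);
        if (value == null)
            value = decorate.apply(from);
        if (value == null)
            value = standin.apply(from);
        return value;
    }
}
```

    With this shape, the assignment in the diff would reduce to `sources[i] = replica;` with no ternary.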


---



[GitHub] cassandra issue #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on the issue:

    https://github.com/apache/cassandra/pull/224
  
    Also I haven't reviewed the test changes yet.


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188770789
  
    --- Diff: doc/source/architecture/dynamo.rst ---
    @@ -74,6 +74,26 @@ nodes in each rack, the data load on the smallest rack may be much higher.  Simi
     into a new rack, it will be considered a replica for the entire ring.  For this reason, many operators choose to
     configure all nodes on a single "rack".
     
    +.. _transient-replication:
    +
    +Transient Replication
    +~~~~~~~~~~~~~~~~~~~~~
    +
    +Transient replication allows you to configure a subset of replicas to only replicate data that hasn't been incrementally
    +repaired. This allows you to trade data redundancy for storage usage, and increased read and write throughput. For instance,
    +if you have a replication factor of 3, with 1 transient replica, 2 replicas will replicate all data for a given token
    --- End diff --
    
    fixed
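
    The arithmetic in the quoted paragraph follows `this.full = replicas - trans` from the ReplicationFactor diff in this thread. A minimal sketch; the class name `RfArithmetic` and the bounds check are assumptions for illustration, not the patch's validation rules.

```java
// Hypothetical illustration, not code from the patch.
final class RfArithmetic
{
    // Number of full replicas for a given total replica count and transient
    // count. The range check is an assumption made for this sketch.
    static int fullReplicas(int replicas, int trans)
    {
        if (replicas < 0 || trans < 0 || trans > replicas)
            throw new IllegalArgumentException("invalid replication factor " + replicas + "/" + trans);
        return replicas - trans;
    }
}
```

    For the example in the documentation, `fullReplicas(3, 1)` gives 2 full replicas and leaves 1 transient replica.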


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189091062
  
    --- Diff: src/java/org/apache/cassandra/db/ConsistencyLevel.java ---
    @@ -190,50 +197,50 @@ public int countLocalEndpoints(Iterable<InetAddressAndPort> liveEndpoints)
              * the blockFor first ones).
              */
             if (isDCLocal)
    -            liveEndpoints.sort(DatabaseDescriptor.getLocalComparator());
    +            liveReplicas.sort(DatabaseDescriptor.getLocalComparator());
     
    -        return liveEndpoints.subList(0, Math.min(liveEndpoints.size(), blockFor(keyspace)));
    +        return liveReplicas.subList(0, Math.min(liveReplicas.size(), blockFor(keyspace)));
         }
     
    -    private List<InetAddressAndPort> filterForEachQuorum(Keyspace keyspace, List<InetAddressAndPort> liveEndpoints)
    +    private ReplicaList filterForEachQuorum(Keyspace keyspace, ReplicaList liveReplicas)
         {
             NetworkTopologyStrategy strategy = (NetworkTopologyStrategy) keyspace.getReplicationStrategy();
     
    -        Map<String, List<InetAddressAndPort>> dcsEndpoints = new HashMap<>();
    +        Map<String, ReplicaList> dcsReplicas = new HashMap<>();
             for (String dc: strategy.getDatacenters())
    -            dcsEndpoints.put(dc, new ArrayList<>());
    +            dcsReplicas.put(dc, new ReplicaList());
    --- End diff --
    
    why not? fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197126256
  
    --- Diff: src/java/org/apache/cassandra/db/ConsistencyLevel.java ---
    @@ -242,11 +249,11 @@ public boolean isSufficientLiveNodes(Keyspace keyspace, Iterable<InetAddressAndP
                     }
                     // Fallthough on purpose for SimpleStrategy
                 default:
    -                return Iterables.size(liveEndpoints) >= blockFor(keyspace);
    +                return Iterables.size(liveReplicas) >= blockFor(keyspace);
             }
         }
     
    -    public void assureSufficientLiveNodes(Keyspace keyspace, Iterable<InetAddressAndPort> liveEndpoints) throws UnavailableException
    +    public void assureSufficientLiveNodes(Keyspace keyspace, Iterable<Replica> liveReplicas) throws UnavailableException
    --- End diff --
    
    We can use `Replicas` here.
    
    Another question: should there be a distinction between sufficient live nodes for the read path and for the write path? Do we want to make sure there is a sufficient number of live full nodes here, or later?


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188443108
  
    --- Diff: src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java ---
    @@ -225,7 +223,7 @@ protected boolean waitingFor(InetAddressAndPort from)
     
         public void assureSufficientLiveNodes() throws UnavailableException
         {
    -        consistencyLevel.assureSufficientLiveNodes(keyspace, Iterables.filter(Iterables.concat(naturalEndpoints, pendingEndpoints), isAlive));
    +        consistencyLevel.assureSufficientLiveNodes(keyspace, Replicas.filter(Replicas.concatNaturalAndPending(naturalReplicas, pendingReplicas), isReplicaAlive));
    --- End diff --
    
    This ends up allocating two containers when all that is wanted is Iterables.
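
    A sketch of the non-allocating alternative, assuming the caller only needs to traverse the live entries once. It uses `java.util.stream` rather than the Guava `Iterables` calls in the removed line, and `LiveCount.countLive` is a hypothetical name; the point is only that concatenation and filtering can be composed lazily without building two intermediate containers.

```java
import java.util.Collection;
import java.util.function.Predicate;
import java.util.stream.Stream;

// Hypothetical illustration, not code from the patch.
final class LiveCount
{
    // Concatenate and filter lazily; elements are tested one at a time as the
    // terminal count() pulls them, and no intermediate collection is built.
    static <T> long countLive(Collection<T> natural, Collection<T> pending, Predicate<T> isAlive)
    {
        return Stream.concat(natural.stream(), pending.stream())
                     .filter(isAlive)
                     .count();
    }
}
```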


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187726766
  
    --- Diff: src/java/org/apache/cassandra/locator/AbstractEndpointSnitch.java ---
    @@ -23,17 +23,17 @@
     
     public abstract class AbstractEndpointSnitch implements IEndpointSnitch
     {
    -    public abstract int compareEndpoints(InetAddressAndPort target, InetAddressAndPort a1, InetAddressAndPort a2);
    +    public abstract int compareEndpoints(InetAddressAndPort target, Replica r1, Replica r2);
    --- End diff --
    
    Are they endpoints or replicas now? I think maybe this should just be endpoints to make explicit the snitch isn't going to use the range or transient information. It doesn't have many usages so it's not a big deal to manually


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188345245
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaSet.java ---
    @@ -0,0 +1,142 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.HashSet;
    +import java.util.Iterator;
    +import java.util.LinkedHashSet;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.collect.ImmutableSet;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +public class ReplicaSet extends Replicas
    +{
    +    static final ReplicaSet EMPTY = new ReplicaSet(ImmutableSet.of());
    +
    +    private final Set<Replica> replicaSet;
    +
    +    public ReplicaSet()
    +    {
    +        replicaSet = new HashSet<>();
    +    }
    +
    +    public ReplicaSet(int expectedSize)
    +    {
    +        replicaSet = Sets.newHashSetWithExpectedSize(expectedSize);
    +    }
    +
    +    public ReplicaSet(Replicas replicas)
    +    {
    +        replicaSet = Sets.newHashSetWithExpectedSize(replicas.size());
    +        Iterables.addAll(replicaSet, replicas);
    +    }
    +
    +    private ReplicaSet(Set<Replica> replicaSet)
    +    {
    +        this.replicaSet = replicaSet;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaSet that = (ReplicaSet) o;
    +        return Objects.equals(replicaSet, that.replicaSet);
    +    }
    +
    +    public int hashCode()
    +    {
    +        return Objects.hash(replicaSet);
    --- End diff --
    
    This uses the varargs version of Objects.hash, which allocates an array even for a single argument.
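
A small sketch of the difference, assuming only the JDK. Objects.hash(Object...) wraps its arguments in an array and delegates to Arrays.hashCode, so for a single argument it returns 31 plus that argument's hash code; Objects.hashCode(Object) delegates directly. The class name is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch: not part of the Cassandra code base.
final class HashVarargsSketch
{
    // Varargs overload: allocates a one-element Object[] on each call.
    static int viaVarargs(Set<String> set)
    {
        return Objects.hash(set);
    }

    // Single-argument helper: no temporary array.
    static int direct(Set<String> set)
    {
        return Objects.hashCode(set);
    }

    public static void main(String[] args)
    {
        Set<String> set = new HashSet<>(Arrays.asList("a", "b"));
        // The two values differ by the constant 31 contributed by Arrays.hashCode
        System.out.println(viaVarargs(set) - direct(set));
    }
}
```

Either form is a valid hashCode as long as it is used consistently; the direct form just skips the allocation.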


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188119369
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = new ArrayList<>(size());
    --- End diff --
    
    This only makes sense if size() is smaller than the default allocation size of ArrayList? Otherwise you don't know what will come out of filter and probably want to use the usual algorithm?
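
One possible reading of this suggestion, sketched with plain JDK collections: presize the result only up to ArrayList's documented default capacity of ten, and let the normal growth policy handle anything larger. The class name and the DEFAULT_CAPACITY constant are hypothetical and are not part of this patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: not part of the Cassandra code base.
final class CappedFilterSketch
{
    // ArrayList's documented default initial capacity.
    private static final int DEFAULT_CAPACITY = 10;

    static <T> List<T> filter(List<T> source, Predicate<? super T> predicate)
    {
        // Small inputs get an exact-size backing array; larger inputs start
        // at the default and grow only if enough elements pass the predicate.
        List<T> result = new ArrayList<>(Math.min(source.size(), DEFAULT_CAPACITY));
        for (T item : source)
        {
            if (predicate.test(item))
                result.add(item);
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<Integer> evens = filter(Arrays.asList(1, 2, 3, 4), i -> i % 2 == 0);
        System.out.println(evens); // prints [2, 4]
    }
}
```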


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188686055
  
    --- Diff: src/java/org/apache/cassandra/service/StorageService.java ---
    @@ -3863,17 +3863,12 @@ public void forceTerminateAllRepairSessions()
          *
          * @param keyspace keyspace name also known as keyspace
          * @param pos position for which we need to find the endpoint
    -     * @param liveEps the list of endpoints to mutate
          */
    -    public void getLiveNaturalEndpoints(Keyspace keyspace, RingPosition pos, List<InetAddressAndPort> liveEps)
    +    public ReplicaList getLiveNaturalReplicas(Keyspace keyspace, RingPosition pos)
         {
    -        List<InetAddressAndPort> endpoints = keyspace.getReplicationStrategy().getNaturalEndpoints(pos);
    +        ReplicaList replicas = keyspace.getReplicationStrategy().getNaturalReplicas(pos);
    --- End diff --
    
    If getNaturalReplicas took a filter we could filter out the replicas into the copied list that getNaturalReplicas already creates.


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187439941
  
    --- Diff: src/java/org/apache/cassandra/db/ConsistencyLevel.java ---
    @@ -190,50 +197,50 @@ public int countLocalEndpoints(Iterable<InetAddressAndPort> liveEndpoints)
              * the blockFor first ones).
              */
             if (isDCLocal)
    -            liveEndpoints.sort(DatabaseDescriptor.getLocalComparator());
    +            liveReplicas.sort(DatabaseDescriptor.getLocalComparator());
     
    -        return liveEndpoints.subList(0, Math.min(liveEndpoints.size(), blockFor(keyspace)));
    +        return liveReplicas.subList(0, Math.min(liveReplicas.size(), blockFor(keyspace)));
         }
     
    -    private List<InetAddressAndPort> filterForEachQuorum(Keyspace keyspace, List<InetAddressAndPort> liveEndpoints)
    +    private ReplicaList filterForEachQuorum(Keyspace keyspace, ReplicaList liveReplicas)
         {
             NetworkTopologyStrategy strategy = (NetworkTopologyStrategy) keyspace.getReplicationStrategy();
     
    -        Map<String, List<InetAddressAndPort>> dcsEndpoints = new HashMap<>();
    +        Map<String, ReplicaList> dcsReplicas = new HashMap<>();
             for (String dc: strategy.getDatacenters())
    -            dcsEndpoints.put(dc, new ArrayList<>());
    +            dcsReplicas.put(dc, new ReplicaList());
    --- End diff --
    
    Completely overkill and unrelated, but ArrayList will over-allocate here if the size of liveReplicas is smaller than the default allocation of 10.


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by ifesdjeen <gi...@git.apache.org>.
Github user ifesdjeen commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r197133145
  
    --- Diff: src/java/org/apache/cassandra/service/StorageProxy.java ---
    @@ -344,47 +343,43 @@ private static void recordCasContention(int contentions)
                 casWriteMetrics.contention.update(contentions);
         }
     
    -    private static Predicate<InetAddressAndPort> sameDCPredicateFor(final String dc)
    +    private static Predicate<Replica> sameDCPredicateFor(final String dc)
         {
             final IEndpointSnitch snitch = DatabaseDescriptor.getEndpointSnitch();
    -        return new Predicate<InetAddressAndPort>()
    -        {
    -            public boolean apply(InetAddressAndPort host)
    -            {
    -                return dc.equals(snitch.getDatacenter(host));
    -            }
    -        };
    +        return replica -> dc.equals(snitch.getDatacenter(replica));
         }
     
         private static PaxosParticipants getPaxosParticipants(TableMetadata metadata, DecoratedKey key, ConsistencyLevel consistencyForPaxos) throws UnavailableException
         {
             Token tk = key.getToken();
    -        List<InetAddressAndPort> naturalEndpoints = StorageService.instance.getNaturalEndpoints(metadata.keyspace, tk);
    -        Collection<InetAddressAndPort> pendingEndpoints = StorageService.instance.getTokenMetadata().pendingEndpointsFor(tk, metadata.keyspace);
    +        ReplicaList naturalReplicas = StorageService.instance.getNaturalReplicas(metadata.keyspace, tk);
    +        ReplicaList pendingReplicas = new ReplicaList(StorageService.instance.getTokenMetadata().pendingEndpointsFor(tk, metadata.keyspace));
             if (consistencyForPaxos == ConsistencyLevel.LOCAL_SERIAL)
             {
    -            // Restrict naturalEndpoints and pendingEndpoints to node in the local DC only
    +            // Restrict naturalReplicas and pendingReplicas to node in the local DC only
                 String localDc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(FBUtilities.getBroadcastAddressAndPort());
    -            Predicate<InetAddressAndPort> isLocalDc = sameDCPredicateFor(localDc);
    -            naturalEndpoints = ImmutableList.copyOf(Iterables.filter(naturalEndpoints, isLocalDc));
    -            pendingEndpoints = ImmutableList.copyOf(Iterables.filter(pendingEndpoints, isLocalDc));
    +            Predicate<Replica> isLocalDc = sameDCPredicateFor(localDc);
    +            naturalReplicas = ReplicaList.immutableCopyOf(naturalReplicas.filter(isLocalDc));
    --- End diff --
    
    Do we need to make it immutable here, since we concat it later?


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188688173
  
    --- Diff: src/java/org/apache/cassandra/service/StorageService.java ---
    @@ -4307,32 +4300,38 @@ private void calculateToFromStreams(Collection<Token> newTokens, List<String> ke
                                 if (addressList.size() > 1)
                                     throw new IllegalStateException("Multiple strict sources found for " + toFetch);
     
    -                            InetAddressAndPort sourceIp = addressList.iterator().next();
    +                            InetAddressAndPort sourceIp = addressList.iterator().next().getEndpoint();
                                 if (Gossiper.instance.isEnabled() && !Gossiper.instance.getEndpointStateForEndpoint(sourceIp).isAlive())
                                     throw new RuntimeException("A node required to move the data consistently is down ("+sourceIp+").  If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false");
                             }
                         }
     
                         // calculating endpoints to stream current ranges to if needed
                         // in some situations node will handle current ranges as part of the new ranges
    -                    Multimap<InetAddressAndPort, Range<Token>> endpointRanges = HashMultimap.create();
    +                    ReplicaMultimap<InetAddressAndPort, ReplicaSet> endpointRanges = ReplicaMultimap.set();
                         for (Range<Token> toStream : rangesPerKeyspace.left)
                         {
    -                        Set<InetAddressAndPort> currentEndpoints = ImmutableSet.copyOf(strategy.calculateNaturalEndpoints(toStream.right, tokenMetaClone));
    -                        Set<InetAddressAndPort> newEndpoints = ImmutableSet.copyOf(strategy.calculateNaturalEndpoints(toStream.right, tokenMetaCloneAllSettled));
    +                        Set<Replica> currentEndpoints = ImmutableSet.copyOf(strategy.calculateNaturalReplicas(toStream.right, tokenMetaClone));
    +                        Set<Replica> newEndpoints = ImmutableSet.copyOf(strategy.calculateNaturalReplicas(toStream.right, tokenMetaCloneAllSettled));
    +
    +                        Replicas.checkFull(currentEndpoints);
    +                        Replicas.checkFull(newEndpoints);
    +
                             logger.debug("Range: {} Current endpoints: {} New endpoints: {}", toStream, currentEndpoints, newEndpoints);
    -                        for (InetAddressAndPort address : Sets.difference(newEndpoints, currentEndpoints))
    +                        for (Replica replica : Sets.difference(newEndpoints, currentEndpoints))
    --- End diff --
    
    I am not sure this works since it should be by address. I think I found cases that were suspect when it also compared by range. 
    
    I thought we weren't going to do regular collections of Replicas for the most part so it would be explicit?
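
To make the concern concrete, a self-contained sketch. FakeReplica is a hypothetical stand-in whose equality covers both endpoint and range, which is an assumption about how Replica compares; the real class may differ. A generic set difference then keeps a replica whose endpoint is unchanged but whose range differs, while a difference keyed on endpoint does not.

```java
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch: not part of the Cassandra code base.
final class EndpointDifferenceSketch
{
    // Stand-in for Replica; equality is assumed to include the range.
    static final class FakeReplica
    {
        final String endpoint;
        final String range;

        FakeReplica(String endpoint, String range)
        {
            this.endpoint = endpoint;
            this.range = range;
        }

        @Override
        public boolean equals(Object o)
        {
            if (this == o) return true;
            if (!(o instanceof FakeReplica)) return false;
            FakeReplica that = (FakeReplica) o;
            return endpoint.equals(that.endpoint) && range.equals(that.range);
        }

        @Override
        public int hashCode()
        {
            return Objects.hash(endpoint, range);
        }
    }

    // Difference using full equality, as a generic set difference would.
    static Set<FakeReplica> differenceByEquality(Set<FakeReplica> a, Set<FakeReplica> b)
    {
        Set<FakeReplica> result = new LinkedHashSet<>(a);
        result.removeAll(b);
        return result;
    }

    // Difference keyed on endpoint only; ranges are ignored.
    static Set<String> differenceByEndpoint(Set<FakeReplica> a, Set<FakeReplica> b)
    {
        Set<String> result = new LinkedHashSet<>();
        for (FakeReplica r : a)
            result.add(r.endpoint);
        for (FakeReplica r : b)
            result.remove(r.endpoint);
        return result;
    }
}
```

With one replica per side sharing an endpoint but not a range, differenceByEquality returns one element and differenceByEndpoint returns none.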


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189038308
  
    --- Diff: src/java/org/apache/cassandra/db/ColumnFamilyStore.java ---
    @@ -1591,7 +1589,7 @@ public long getExpectedCompactedFileSize(Iterable<SSTableReader> sstables, Opera
     
             // cleanup size estimation only counts bytes for keys local to this node
             long expectedFileSize = 0;
    -        Collection<Range<Token>> ranges = StorageService.instance.getLocalRanges(keyspace.getName());
    +        Collection<Range<Token>> ranges = StorageService.instance.getLocalReplicas(keyspace.getName()).asRangeSet();
    --- End diff --
    
    fixed


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189039639
  
    --- Diff: src/java/org/apache/cassandra/locator/TokenMetadata.java ---
    @@ -856,25 +857,25 @@ private static PendingRangeMaps calculatePendingRanges(AbstractReplicationStrate
         {
             PendingRangeMaps newPendingRanges = new PendingRangeMaps();
     
    -        Multimap<InetAddressAndPort, Range<Token>> addressRanges = strategy.getAddressRanges(metadata);
    +        ReplicaMultimap<InetAddressAndPort, ReplicaSet> addressRanges = strategy.getAddressReplicas(metadata);
     
             // Copy of metadata reflecting the situation after all leave operations are finished.
             TokenMetadata allLeftMetadata = removeEndpoints(metadata.cloneOnlyTokenMap(), leavingEndpoints);
     
             // get all ranges that will be affected by leaving nodes
             Set<Range<Token>> affectedRanges = new HashSet<Range<Token>>();
             for (InetAddressAndPort endpoint : leavingEndpoints)
    -            affectedRanges.addAll(addressRanges.get(endpoint));
    +            affectedRanges.addAll(addressRanges.get(endpoint).asRangeSet());
    --- End diff --
    
    fixed


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188448254
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = new ArrayList<>(size());
    +        for (Replica replica: replicaList)
    +        {
    +            if (predicate.test(replica))
    +            {
    +                newReplicaList.add(replica);
    +            }
    +        }
    +        return new ReplicaList(newReplicaList);
    +    }
    +
    +    public void sort(Comparator<Replica> comparator)
    +    {
    +        replicaList.sort(comparator);
    +    }
    +
    +    public static ReplicaList intersectEndpoints(ReplicaList l1, ReplicaList l2)
    +    {
    +        Replicas.checkFull(l1);
    +        Replicas.checkFull(l2);
    +        // Note: we don't use Guava Sets.intersection() for 3 reasons:
    +        //   1) retainAll would be inefficient if l1 and l2 are large but in practice both are the replicas for a range and
    +        //   so will be very small (< RF). In that case, retainAll is in fact more efficient.
    +        //   2) we do ultimately need a list so converting everything to sets don't make sense
    +        //   3) l1 and l2 are sorted by proximity. The use of retainAll  maintain that sorting in the result, while using sets wouldn't.
    +        Collection<InetAddressAndPort> endpoints = l2.asEndpointList();
    +        return l1.filter(r -> endpoints.contains(r.getEndpoint()));
    +    }
    +
    +    public static ReplicaList of(Replica... replicas)
    --- End diff --
    
    Some good specializations of this would be zero, one, and several arguments, as in ImmutableList.
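
A rough sketch of what such overloads could look like, written against a plain List rather than ReplicaList. The class name is hypothetical, and whether the returned lists should be mutable is left open here. The zero- and one-argument overloads avoid the varargs array entirely and size the backing list exactly.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: not part of the Cassandra code base.
final class OfOverloadsSketch
{
    // Zero arguments: no varargs array, no backing capacity.
    static <T> List<T> of()
    {
        return new ArrayList<>(0);
    }

    // One argument: exact-size backing array, no varargs array.
    static <T> List<T> of(T single)
    {
        List<T> list = new ArrayList<>(1);
        list.add(single);
        return list;
    }

    // Two or more arguments: the general case.
    @SafeVarargs
    static <T> List<T> of(T first, T... rest)
    {
        List<T> list = new ArrayList<>(1 + rest.length);
        list.add(first);
        Collections.addAll(list, rest);
        return list;
    }

    public static void main(String[] args)
    {
        System.out.println(of("a", "b", "c")); // prints [a, b, c]
    }
}
```

A call with a single argument resolves to the fixed-arity overload, because Java prefers non-varargs candidates during overload resolution.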


---


[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189116984
  
    --- Diff: src/java/org/apache/cassandra/locator/Replicas.java ---
    @@ -0,0 +1,313 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Set;
    +
    +import com.google.common.base.Predicate;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.utils.FBUtilities;
    +
    +/**
    + * A collection like class for Replica objects. Since the Replica class contains inetaddress, range, and
    + * transient replication status, basic contains and remove methods can be ambiguous. Replicas forces you
    + * to be explicit about what you're checking the container for, or removing from it.
    + */
    +public abstract class Replicas implements Iterable<Replica>
    +{
    +
    +    public abstract boolean add(Replica replica);
    +    public abstract void addAll(Iterable<Replica> replicas);
    +    public abstract void removeEndpoint(InetAddressAndPort endpoint);
    +    public abstract void removeReplica(Replica replica);
    +    public abstract int size();
    +
    +    public Iterable<InetAddressAndPort> asEndpoints()
    +    {
    +        return Iterables.transform(this, Replica::getEndpoint);
    +    }
    +
    +    public Set<InetAddressAndPort> asEndpointSet()
    +    {
    +        Set<InetAddressAndPort> result = Sets.newHashSetWithExpectedSize(size());
    +        for (Replica replica: this)
    +        {
    +            result.add(replica.getEndpoint());
    +        }
    +        return result;
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> result = new ArrayList<>(size());
    +        for (Replica replica: this)
    +        {
    +            result.add(replica.getEndpoint());
    +        }
    +        return result;
    +    }
    +
    +    public Iterable<Range<Token>> asRanges()
    +    {
    +        return Iterables.transform(this, Replica::getRange);
    +    }
    +
    +    public Set<Range<Token>> asRangeSet()
    +    {
    +        Set<Range<Token>> result = Sets.newHashSetWithExpectedSize(size());
    +        for (Replica replica: this)
    +        {
    +            result.add(replica.getRange());
    +        }
    +        return result;
    +    }
    +
    +    public Iterable<Range<Token>> fullRanges()
    +    {
    +        return Iterables.transform(Iterables.filter(this, Replica::isFull), Replica::getRange);
    +    }
    +
    +    public boolean containsEndpoint(InetAddressAndPort endpoint)
    +    {
    +        return Iterables.any(this, r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    /**
    +     * Remove by endpoint. Ranges are ignored when determining what to remove
    +     */
    +    public void removeEndpoints(Replicas toRemove)
    +    {
    +        if (Iterables.all(this, Replica::isFull) && Iterables.all(toRemove, Replica::isFull))
    +        {
    +            for (Replica remove: toRemove)
    +            {
    +                removeEndpoint(remove.getEndpoint());
    +            }
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public void removeReplicas(Replicas toRemove)
    +    {
    +        if (Iterables.all(this, Replica::isFull) && Iterables.all(toRemove, Replica::isFull))
    +        {
    +            for (Replica remove: toRemove)
    +            {
    +                removeReplica(remove);
    +            }
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public boolean isEmpty()
    +    {
    +        return size() == 0;
    +    }
    +
    +    private static abstract class ImmutableReplicaContainer extends Replicas
    +    {
    +        @Override
    +        public boolean add(Replica replica)
    +        {
    +            throw new UnsupportedOperationException();
    +        }
    +
    +        @Override
    +        public void addAll(Iterable<Replica> replicas)
    +        {
    +            throw new UnsupportedOperationException();
    +        }
    +
    +        @Override
    +        public void removeEndpoint(InetAddressAndPort endpoint)
    +        {
    +            throw new UnsupportedOperationException();
    +        }
    +
    +        @Override
    +        public void removeReplica(Replica replica)
    +        {
    +            throw new UnsupportedOperationException();
    +        }
    +    }
    +
    +    public static Replicas filter(Replicas source, Predicate<Replica> predicate)
    +    {
    +        Iterable<Replica> iterable = Iterables.filter(source, predicate);
    +        return new ImmutableReplicaContainer()
    +        {
    +            public int size()
    +            {
    +                return Iterables.size(iterable);
    +            }
    +
    +            public Iterator<Replica> iterator()
    +            {
    +                return iterable.iterator();
    +            }
    +        };
    +    }
    +
    +    public static Replicas filterOnEndpoints(Replicas source, Predicate<InetAddressAndPort> predicate)
    +    {
    +        Iterable<Replica> iterable = Iterables.filter(source, r -> predicate.apply(r.getEndpoint()));
    +        return new ImmutableReplicaContainer()
    +        {
    +            public int size()
    +            {
    +                return Iterables.size(iterable);
    +            }
    +
    +            public Iterator<Replica> iterator()
    +            {
    +                return iterable.iterator();
    +            }
    +        };
    +    }
    +
    +    public static Replicas filterLocalEndpoint(Replicas replicas)
    +    {
    +        InetAddressAndPort local = FBUtilities.getBroadcastAddressAndPort();
    +        return filterOnEndpoints(replicas, e -> !e.equals(local));
    +    }
    +
    +    public static Replicas concatNaturalAndPending(Replicas natural, Replicas pending)
    +    {
    +        Iterable<Replica> iterable;
    +        if (Iterables.all(natural, Replica::isFull) && Iterables.all(pending, Replica::isFull))
    +        {
    +            iterable = Iterables.concat(natural, pending);
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +
    +        return new ImmutableReplicaContainer()
    +        {
    +            public int size()
    +            {
    +                return natural.size() + pending.size();
    +            }
    +
    +            public Iterator<Replica> iterator()
    +            {
    +                return iterable.iterator();
    +            }
    +        };
    +    }
    +
    +    public static Replicas concat(Iterable<Replicas> replicasIterable)
    +    {
    +        Iterable<Replica> iterable = Iterables.concat(replicasIterable);
    +        return new ImmutableReplicaContainer()
    +        {
    +            public int size()
    +            {
    +                return Iterables.size(iterable);
    +            }
    +
    +            public Iterator<Replica> iterator()
    +            {
    +                return iterable.iterator();
    +            }
    +        };
    +    }
    +
    +    public static Replicas of(Collection<Replica> replicas)
    +    {
    +        return new ImmutableReplicaContainer()
    +        {
    +            public int size()
    +            {
    +                return replicas.size();
    +            }
    +
    +            public Iterator<Replica> iterator()
    +            {
    +                return replicas.iterator();
    +            }
    +        };
    +    }
    +
    +    public static Replicas singleton(Replica replica)
    +    {
    +        return of(Collections.singleton(replica));
    --- End diff --
    
    fixed


---

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r194581559
  
    --- Diff: test/unit/org/apache/cassandra/locator/NetworkTopologyStrategyTest.java ---
    @@ -36,12 +37,17 @@
     
     import org.apache.cassandra.config.DatabaseDescriptor;
     import org.apache.cassandra.dht.Murmur3Partitioner;
    +import org.apache.cassandra.dht.Murmur3Partitioner.LongToken;
     import org.apache.cassandra.dht.OrderPreservingPartitioner.StringToken;
    +import org.apache.cassandra.dht.Range;
     import org.apache.cassandra.dht.Token;
     import org.apache.cassandra.exceptions.ConfigurationException;
     import org.apache.cassandra.locator.TokenMetadata.Topology;
     import org.apache.cassandra.service.StorageService;
     
    +import static org.apache.cassandra.locator.Replica.full;
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187383690
  
    --- Diff: doc/source/architecture/dynamo.rst ---
    @@ -74,6 +74,26 @@ nodes in each rack, the data load on the smallest rack may be much higher.  Simi
     into a new rack, it will be considered a replica for the entire ring.  For this reason, many operators choose to
     configure all nodes on a single "rack".
     
    +.. _transient-replication:
    +
    +Transient Replication
    +~~~~~~~~~~~~~~~~~~~~~
    +
    +Transient replication allows you to configure a subset of replicas to only replicate data that hasn't been incrementally
    +repaired. This allows you to trade data redundancy for storage usage, and increased read and write throughput. For instance,
    +if you have a replication factor of 3, with 1 transient replica, 2 replicas will replicate all data for a given token
    +range, while the 3rd will only keep data that hasn't been incrementally repaired. Since you're reducing the copies kept
    +of data by the number of transient replicas, transient replication is best suited to multiple dc deployments.
    --- End diff --
    
    No, you don't need multiple DCs!
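
    For reference, the arithmetic in the doc paragraph above (RF 3 with 1 transient replica leaves 2 full replicas) can be sketched as follows. This is only an illustration: `TransientRfSketch` and `Rf` are hypothetical names, not classes from this patch, and the validation rule is an assumption.

```java
public class TransientRfSketch
{
    /** Hypothetical value type: total replica count and how many of those are transient. */
    static final class Rf
    {
        final int replicas;
        final int trans;

        Rf(int replicas, int trans)
        {
            // Assumption for this sketch: at least one replica must remain full.
            if (trans < 0 || trans >= replicas)
                throw new IllegalArgumentException("need at least one full replica");
            this.replicas = replicas;
            this.trans = trans;
        }

        /** Replicas that keep all data for the range, repaired or not. */
        int full()
        {
            return replicas - trans;
        }
    }
}
```

    Nothing in this calculation depends on how many datacenters there are; it applies per replication setting.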


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189108331
  
    --- Diff: src/java/org/apache/cassandra/net/MessagingService.java ---
    @@ -591,8 +592,9 @@ public void run()
     
                     if (expiredCallbackInfo.shouldHint())
                     {
    -                    Mutation mutation = ((WriteCallbackInfo) expiredCallbackInfo).mutation();
    -                    return StorageProxy.submitHint(mutation, expiredCallbackInfo.target, null);
    +                    WriteCallbackInfo writeCallbackInfo = ((WriteCallbackInfo) expiredCallbackInfo);
    +                    Mutation mutation = writeCallbackInfo.mutation();
    --- End diff --
    
    we need to get both the mutation and the replica out of the write callback info here


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189092409
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188435314
  
    --- Diff: src/java/org/apache/cassandra/locator/TokenMetadata.java ---
    @@ -856,25 +857,25 @@ private static PendingRangeMaps calculatePendingRanges(AbstractReplicationStrate
         {
             PendingRangeMaps newPendingRanges = new PendingRangeMaps();
     
    -        Multimap<InetAddressAndPort, Range<Token>> addressRanges = strategy.getAddressRanges(metadata);
    +        ReplicaMultimap<InetAddressAndPort, ReplicaSet> addressRanges = strategy.getAddressReplicas(metadata);
     
             // Copy of metadata reflecting the situation after all leave operations are finished.
             TokenMetadata allLeftMetadata = removeEndpoints(metadata.cloneOnlyTokenMap(), leavingEndpoints);
     
             // get all ranges that will be affected by leaving nodes
             Set<Range<Token>> affectedRanges = new HashSet<Range<Token>>();
             for (InetAddressAndPort endpoint : leavingEndpoints)
    -            affectedRanges.addAll(addressRanges.get(endpoint));
    +            affectedRanges.addAll(addressRanges.get(endpoint).asRangeSet());
    --- End diff --
    
    No reason to allocate a new set here; this is a candidate for Collections2.transform.
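
    A rough sketch of the idea, using only java.util so it stands alone: a read-only view that maps elements during iteration, roughly what Guava's Collections2.transform gives you. `TransformedViewSketch`, `transformedView` and `FakeReplica` are hypothetical names for illustration and are not part of this patch.

```java
import java.util.AbstractCollection;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

public class TransformedViewSketch
{
    /** Read-only view over source; elements are mapped during iteration and nothing is copied. */
    static <F, T> Collection<T> transformedView(Collection<F> source, Function<? super F, ? extends T> fn)
    {
        return new AbstractCollection<T>()
        {
            @Override
            public Iterator<T> iterator()
            {
                Iterator<F> it = source.iterator();
                return new Iterator<T>()
                {
                    public boolean hasNext() { return it.hasNext(); }
                    public T next() { return fn.apply(it.next()); }
                };
            }

            @Override
            public int size()
            {
                return source.size();
            }
        };
    }

    /** Hypothetical stand-in for Replica: an endpoint name plus a range label. */
    static final class FakeReplica
    {
        final String endpoint;
        final String range;

        FakeReplica(String endpoint, String range)
        {
            this.endpoint = endpoint;
            this.range = range;
        }
    }

    /** Collects ranges into the destination set without building an intermediate range set. */
    static Set<String> affectedRanges(List<FakeReplica> replicas)
    {
        Set<String> affected = new HashSet<>();
        affected.addAll(transformedView(replicas, r -> r.range));
        return affected;
    }
}
```

    The only allocation per call is the small view object; the ranges go straight into the destination set.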


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188107218
  
    --- Diff: src/java/org/apache/cassandra/locator/Replica.java ---
    @@ -0,0 +1,221 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import com.google.common.base.Preconditions;
    +import com.google.common.collect.Iterables;
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Sets;
    +
    +import org.apache.cassandra.db.ConsistencyLevel;
    +import org.apache.cassandra.dht.Range;
    +import org.apache.cassandra.dht.Token;
    +import org.apache.cassandra.exceptions.UnavailableException;
    +
    +/**
    + * Decorated Endpoint
    + */
    +public class Replica
    +{
    +    private final InetAddressAndPort endpoint;
    +    private final Range<Token> range;
    +    private final boolean full;
    +
    +    public Replica(InetAddressAndPort endpoint, Range<Token> range, boolean full)
    +    {
    +        Preconditions.checkNotNull(endpoint);
    +        this.endpoint = endpoint;
    +        this.range = range;
    +        this.full = full;
    +    }
    +
    +    public Replica(InetAddressAndPort endpoint, Token start, Token end, boolean full)
    +    {
    +        this(endpoint, new Range<>(start, end), full);
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        Replica replica = (Replica) o;
    +        return full == replica.full &&
    +               Objects.equals(endpoint, replica.endpoint) &&
    +               Objects.equals(range, replica.range);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(endpoint, range, full);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        StringBuilder sb = new StringBuilder();
    +        sb.append(full ? "Full" : "Transient");
    +        sb.append('(').append(getEndpoint()).append(',').append(range).append(')');
    +        return sb.toString();
    +    }
    +
    +    public final InetAddressAndPort getEndpoint()
    +    {
    +        return endpoint;
    +    }
    +
    +    public Range<Token> getRange()
    +    {
    +        return range;
    +    }
    +
    +    public boolean isFull()
    +    {
    +        return full;
    +    }
    +
    +    public final boolean isTransient()
    +    {
    +        return !isFull();
    +    }
    +
    +    public ReplicaSet subtract(Replica that)
    +    {
    +        assert isFull() && that.isFull();  // FIXME: this
    +        Set<Range<Token>> ranges = range.subtract(that.range);
    +        ReplicaSet replicatedRanges = new ReplicaSet(ranges.size());
    +        for (Range<Token> range: ranges)
    +        {
    +            replicatedRanges.add(new Replica(getEndpoint(), range, isFull()));
    +        }
    +        return replicatedRanges;
    +    }
    +
    +    /**
    +     * Subtract the ranges of the given replicas from the range of this replica,
    +     * returning a set of replicas with the endpoint and transient information of
    +     * this replica, and the ranges resulting from the subtraction.
    +     */
    +    public ReplicaSet subtractByRange(Replicas toSubtract)
    +    {
    +        if (isFull() && Iterables.all(toSubtract, Replica::isFull))
    +        {
    +            Set<Range<Token>> subtractedRanges = getRange().subtractAll(toSubtract.asRangeSet());
    +            ReplicaSet replicaSet = new ReplicaSet(subtractedRanges.size());
    +            for (Range<Token> range: subtractedRanges)
    +            {
    +                replicaSet.add(new Replica(getEndpoint(), range, isFull()));
    +            }
    +            return replicaSet;
    +        }
    +        else
    +        {
    +            // FIXME: add support for transient replicas
    +            throw new UnsupportedOperationException("transient replicas are currently unsupported");
    +        }
    +    }
    +
    +    public ReplicaList normalizeByRange()
    +    {
    +        List<Range<Token>> normalized = Range.normalize(Collections.singleton(getRange()));
    +        ReplicaList replicas = new ReplicaList(normalized.size());
    +        for (Range<Token> normalizedRange: normalized)
    +        {
    +            replicas.add(new Replica(getEndpoint(), normalizedRange, isFull()));
    +        }
    +        return replicas;
    +    }
    +
    +    public boolean contains(Range<Token> that)
    +    {
    +        return getRange().contains(that);
    +    }
    +
    +    public boolean intersectsOnRange(Replica replica)
    +    {
    +        return getRange().intersects(replica.getRange());
    +    }
    +
    +    public Replica decorateSubrange(Range<Token> subrange)
    +    {
    +        Preconditions.checkArgument(range.contains(subrange));
    +        return new Replica(getEndpoint(), subrange, isFull());
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, true);
    +    }
    +
    +    /**
    +     * We need to assume an endpoint is a full replica with unknown ranges in a
    +     * few cases, so this returns one that throws an exception if you try to get its range
    +     */
    +    public static Replica fullStandin(InetAddressAndPort endpoint)
    +    {
    +        return new Replica(endpoint, null, true) {
    +            @Override
    +            public Range<Token> getRange()
    +            {
    +                throw new UnsupportedOperationException("Can't get range on standin replicas");
    +            }
    +        };
    +    }
    +
    +    public static Replica full(InetAddressAndPort endpoint, Token start, Token end)
    +    {
    +        return full(endpoint, new Range<>(start, end));
    +    }
    +
    +    public static Replica trans(InetAddressAndPort endpoint, Range<Token> range)
    +    {
    +        return new Replica(endpoint, range, false);
    +    }
    +
    +    public static Replica trans(InetAddressAndPort endpoint, Token start, Token end)
    +    {
    +        return trans(endpoint, new Range<>(start, end));
    +    }
    +
    +    public static void assureSufficientFullReplica(Collection<Replica> replicas, ConsistencyLevel cl) throws UnavailableException
    --- End diff --
    
    Unused


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r187434827
  
    --- Diff: src/java/org/apache/cassandra/batchlog/BatchlogManager.java ---
    @@ -490,16 +497,16 @@ private static int gcgs(Collection<Mutation> mutations)
             {
                 private final Set<InetAddressAndPort> undelivered = Collections.newSetFromMap(new ConcurrentHashMap<>());
     
    -            ReplayWriteResponseHandler(Collection<InetAddressAndPort> writeEndpoints, long queryStartNanoTime)
    +            ReplayWriteResponseHandler(Replicas writeReplicas, long queryStartNanoTime)
                 {
    -                super(writeEndpoints, Collections.<InetAddressAndPort>emptySet(), null, null, null, WriteType.UNLOGGED_BATCH, queryStartNanoTime);
    -                undelivered.addAll(writeEndpoints);
    +                super(writeReplicas, ReplicaList.of(), null, null, null, WriteType.UNLOGGED_BATCH, queryStartNanoTime);
    +                Iterables.addAll(undelivered, writeReplicas.asEndpoints());
    --- End diff --
    
    This is a case where it seems like we could call a Collections2.transform version and not have to create a new list. One thing to keep in mind is that Collections2.transform makes contains and remove slow, because they are O(n) no matter what the underlying collection is.
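
    To illustrate the O(n) caveat, here is a small stdlib-only sketch. It builds a mapped view over a HashSet the same general way a transformed collection works (iteration only, contains not overridden) and counts how many elements contains() visits. `ViewContainsCostSketch` and `elementsVisitedByContains` are hypothetical names, and the view is a stand-in for Guava's, not its actual implementation.

```java
import java.util.AbstractCollection;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class ViewContainsCostSketch
{
    /** Returns how many elements a contains() call on the view had to visit. */
    static int elementsVisitedByContains(int n, int needle)
    {
        // The backing set itself offers constant-time contains().
        Set<Integer> backing = new HashSet<>();
        for (int i = 0; i < n; i++)
            backing.add(i);

        AtomicInteger visited = new AtomicInteger();

        // The view only knows how to iterate, so the inherited contains() scans linearly.
        Collection<Integer> view = new AbstractCollection<Integer>()
        {
            @Override
            public Iterator<Integer> iterator()
            {
                Iterator<Integer> it = backing.iterator();
                return new Iterator<Integer>()
                {
                    public boolean hasNext() { return it.hasNext(); }
                    public Integer next() { visited.incrementAndGet(); return it.next(); }
                };
            }

            @Override
            public int size()
            {
                return backing.size();
            }
        };

        view.contains(needle);
        return visited.get();
    }
}
```

    Looking up a value that is absent visits every element, so a view like this suits addAll or plain iteration (as in the constructor above) better than repeated membership checks.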


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189092419
  
    --- Diff: src/java/org/apache/cassandra/locator/ReplicaList.java ---
    @@ -0,0 +1,244 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.locator;
    +
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.Comparator;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.function.Predicate;
    +
    +import com.google.common.collect.ImmutableList;
    +import com.google.common.collect.Iterables;
    +
    +public class ReplicaList extends Replicas
    +{
    +    static final ReplicaList EMPTY = new ReplicaList(ImmutableList.of());
    +
    +    private final List<Replica> replicaList;
    +
    +    public ReplicaList()
    +    {
    +        replicaList = new ArrayList<>();
    +    }
    +
    +    public ReplicaList(int capacity)
    +    {
    +        replicaList = new ArrayList<>(capacity);
    +    }
    +
    +    public ReplicaList(ReplicaList from)
    +    {
    +        replicaList = new ArrayList<>(from.replicaList);
    +    }
    +
    +    public ReplicaList(Replicas from)
    +    {
    +        replicaList = new ArrayList<>(from.size());
    +        addAll(from);
    +    }
    +
    +    public ReplicaList(Collection<Replica> from)
    +    {
    +        replicaList = new ArrayList<>(from);
    +    }
    +
    +    private ReplicaList(List<Replica> replicaList)
    +    {
    +        this.replicaList = replicaList;
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ReplicaList that = (ReplicaList) o;
    +        return Objects.equals(replicaList, that.replicaList);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(replicaList);
    +    }
    +
    +    @Override
    +    public String toString()
    +    {
    +        return replicaList.toString();
    +    }
    +
    +    @Override
    +    public boolean add(Replica replica)
    +    {
    +        return replicaList.add(replica);
    +    }
    +
    +    @Override
    +    public void addAll(Iterable<Replica> replicas)
    +    {
    +        Iterables.addAll(replicaList, replicas);
    +    }
    +
    +    @Override
    +    public int size()
    +    {
    +        return replicaList.size();
    +    }
    +
    +    @Override
    +    public Iterator<Replica> iterator()
    +    {
    +        return replicaList.iterator();
    +    }
    +
    +    public Replica get(int idx)
    +    {
    +        return replicaList.get(idx);
    +    }
    +
    +    public List<InetAddressAndPort> asEndpointList()
    +    {
    +        List<InetAddressAndPort> endpoints = new ArrayList<>(replicaList.size());
    +        for (Replica replica: replicaList)
    +        {
    +            endpoints.add(replica.getEndpoint());
    +        }
    +        return endpoints;
    +    }
    +
    +    @Override
    +    public void removeEndpoint(InetAddressAndPort endpoint)
    +    {
    +        replicaList.removeIf(r -> r.getEndpoint().equals(endpoint));
    +    }
    +
    +    @Override
    +    public void removeReplica(Replica replica)
    +    {
    +        replicaList.remove(replica);
    +    }
    +
    +    public ReplicaList filter(Predicate<Replica> predicate)
    +    {
    +        ArrayList<Replica> newReplicaList = new ArrayList<>(size());
    +        for (Replica replica: replicaList)
    --- End diff --
    
    fixed


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by aweisberg <gi...@git.apache.org>.
Github user aweisberg commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r188782956
  
    --- Diff: src/java/org/apache/cassandra/cql3/statements/AlterKeyspaceStatement.java ---
    @@ -96,7 +98,35 @@ private void warnIfIncreasingRF(KeyspaceMetadata ksm, KeyspaceParams params)
                                                                                                             StorageService.instance.getTokenMetadata(),
                                                                                                             DatabaseDescriptor.getEndpointSnitch(),
                                                                                                             params.replication.options);
    -        if (newStrategy.getReplicationFactor() > oldStrategy.getReplicationFactor())
    +
    +        validateTransientReplication(oldStrategy, newStrategy);
    +        warnIfIncreasingRF(oldStrategy, newStrategy);
    +    }
    +
    +    private void validateTransientReplication(AbstractReplicationStrategy oldStrategy, AbstractReplicationStrategy newStrategy)
    +    {
    +        if (oldStrategy.getReplicationFactor().trans == 0 && newStrategy.getReplicationFactor().trans > 0)
    +        {
    +            Keyspace ks = Keyspace.open(keyspace());
    +            for (ColumnFamilyStore cfs: ks.getColumnFamilyStores())
    +            {
    +                if (cfs.viewManager.hasViews())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using materialized views");
    +                }
    +
    +                if (cfs.indexManager.hasIndexes())
    +                {
    +                    throw new ConfigurationException("Cannot use transient replication on keyspaces using secondary indexes");
    +                }
    +            }
    +
    +        }
    +    }
    --- End diff --
    
    I've never actually thought about the procedure for changing RF. I think I get the principle behind one at a time changes.
    
    What happens when you do this and write at CL.ALL and read at CL.ONE? What happens for ranges where the read goes to a node that is now a full replica but doesn't yet have the full data?


---



[GitHub] cassandra pull request #224: 14405 replicas

Posted by bdeggleston <gi...@git.apache.org>.
Github user bdeggleston commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/224#discussion_r189038759
  
    --- Diff: src/java/org/apache/cassandra/db/compaction/CompactionManager.java ---
    @@ -533,7 +530,7 @@ public AllSSTableOpStatus relocateSSTables(final ColumnFamilyStore cfs, int jobs
                 logger.info("Partitioner does not support splitting");
                 return AllSSTableOpStatus.ABORTED;
             }
    -        final Collection<Range<Token>> r = StorageService.instance.getLocalRanges(cfs.keyspace.getName());
    +        final Collection<Range<Token>> r = StorageService.instance.getLocalReplicas(cfs.keyspace.getName()).asRangeSet();
    --- End diff --
    
    fixed


---
