You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@lucene.apache.org by sh...@apache.org on 2019/02/18 22:26:34 UTC

[lucene-solr] branch branch_8x updated (a1efe97 -> d697b4a)

This is an automated email from the ASF dual-hosted git repository.

shalin pushed a change to branch branch_8x
in repository https://gitbox.apache.org/repos/asf/lucene-solr.git.


    from a1efe97  SOLR-11876: In-place updates fail during resolution if required fields are present
     new 7ede4e2  SOLR-13248: Autoscaling based replica placement is broken out of the box.
     new d697b4a  SOLR-13248: Adding upgrade notes which explain the problem and the mitigation as well as steps to revert to the old behavior

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 solr/CHANGES.txt                                   | 24 ++++++++++++++++++++++
 .../apache/solr/cloud/api/collections/Assign.java  |  4 ++--
 .../solr/cloud/api/collections/AssignTest.java     | 21 +++++++++++++++----
 .../security/JWTAuthPluginIntegrationTest.java     |  1 +
 .../hadoop/TestSolrCloudWithHadoopAuthPlugin.java  |  1 +
 .../solr/common/cloud/ClusterProperties.java       |  4 ++--
 .../apache/solr/common/cloud/ZkStateReader.java    |  4 +++-
 .../client/solrj/cloud/autoscaling/TestPolicy.java |  6 ++++++
 .../org/apache/solr/cloud/SolrCloudTestCase.java   | 19 +++++++++++++++--
 9 files changed, 73 insertions(+), 11 deletions(-)


[lucene-solr] 01/02: SOLR-13248: Autoscaling based replica placement is broken out of the box.

Posted by sh...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

shalin pushed a commit to branch branch_8x
in repository https://gitbox.apache.org/repos/asf/lucene-solr.git

commit 7ede4e2b430c7e0c36dedf21fcab35fe48ed783d
Author: shalinmangar <sh...@apache.org>
AuthorDate: Mon Feb 18 13:46:05 2019 -0800

    SOLR-13248: Autoscaling based replica placement is broken out of the box.
    
    Solr 7.5 enabled autoscaling based replica placement by default but in the absence of default cluster policies, autoscaling can place more than 1 replica of the  same shard on the same node. Also, the maxShardsPerNode and createNodeSet was not respected. Due to these reasons,  this issue reverts the default replica placement policy to the 'legacy' assignment policy that was the default until Solr 7.4.
    
    (cherry picked from commit 7e2d40197cb096fe0519652c2ebbbf38a70d0d65)
---
 solr/CHANGES.txt                                    |  6 ++++++
 .../apache/solr/cloud/api/collections/Assign.java   |  4 ++--
 .../solr/cloud/api/collections/AssignTest.java      | 21 +++++++++++++++++----
 .../solr/security/JWTAuthPluginIntegrationTest.java |  1 +
 .../hadoop/TestSolrCloudWithHadoopAuthPlugin.java   |  1 +
 .../apache/solr/common/cloud/ClusterProperties.java |  4 ++--
 .../org/apache/solr/common/cloud/ZkStateReader.java |  4 +++-
 .../client/solrj/cloud/autoscaling/TestPolicy.java  |  6 ++++++
 .../org/apache/solr/cloud/SolrCloudTestCase.java    | 19 +++++++++++++++++--
 9 files changed, 55 insertions(+), 11 deletions(-)

diff --git a/solr/CHANGES.txt b/solr/CHANGES.txt
index a6b8979..3dfaa0c 100644
--- a/solr/CHANGES.txt
+++ b/solr/CHANGES.txt
@@ -204,6 +204,12 @@ Bug Fixes
 * SOLR-13126: Query boosts were not being combined correctly for documents where not all boost queries
   matched (Alan Woodward, Mikhail Khludnev)
 
+* SOLR-13248: Autoscaling based replica placement is broken out of the box. Solr 7.5 enabled autoscaling based replica
+  placement by default but in the absence of default cluster policies, autoscaling can place more than 1 replica of the
+  same shard on the same node. Also, the maxShardsPerNode and createNodeSet was not respected. Due to these reasons,
+  this issue reverts the default replica placement policy to the 'legacy' assignment policy that was the default until
+  Solr 7.4. (Gus Heck, Andrzej Bialecki, Bram Van Dam, shalin)
+
 Improvements
 ----------------------
 
diff --git a/solr/core/src/java/org/apache/solr/cloud/api/collections/Assign.java b/solr/core/src/java/org/apache/solr/cloud/api/collections/Assign.java
index 318cdf7..cf47ab4 100644
--- a/solr/core/src/java/org/apache/solr/cloud/api/collections/Assign.java
+++ b/solr/core/src/java/org/apache/solr/cloud/api/collections/Assign.java
@@ -278,12 +278,12 @@ public class Assign {
   }
 
   private static boolean usePolicyFramework(Optional<DocCollection> collection, SolrCloudManager cloudManager) throws IOException, InterruptedException {
-    boolean useLegacyAssignment = false;
+    boolean useLegacyAssignment = true;
     Map<String, Object> clusterProperties = cloudManager.getClusterStateProvider().getClusterProperties();
     if (clusterProperties.containsKey(CollectionAdminParams.DEFAULTS))  {
       Map<String, Object> defaults = (Map<String, Object>) clusterProperties.get(CollectionAdminParams.DEFAULTS);
       Map<String, Object> collectionDefaults = (Map<String, Object>) defaults.getOrDefault(CollectionAdminParams.CLUSTER, Collections.emptyMap());
-      useLegacyAssignment = (boolean) collectionDefaults.getOrDefault(CollectionAdminParams.USE_LEGACY_REPLICA_ASSIGNMENT, false);
+      useLegacyAssignment = Boolean.parseBoolean(collectionDefaults.getOrDefault(CollectionAdminParams.USE_LEGACY_REPLICA_ASSIGNMENT, "true").toString());
     }
 
     if (!useLegacyAssignment) {
diff --git a/solr/core/src/test/org/apache/solr/cloud/api/collections/AssignTest.java b/solr/core/src/test/org/apache/solr/cloud/api/collections/AssignTest.java
index 638496a..512a397 100644
--- a/solr/core/src/test/org/apache/solr/cloud/api/collections/AssignTest.java
+++ b/solr/core/src/test/org/apache/solr/cloud/api/collections/AssignTest.java
@@ -158,22 +158,35 @@ public class AssignTest extends SolrTestCaseJ4 {
   }
 
   @Test
-  public void testUsePolicyByDefault() throws Exception {
+  public void testUseLegacyByDefault() throws Exception {
     assumeWorkingMockito();
 
     SolrCloudManager solrCloudManager = mock(SolrCloudManager.class);
     ClusterStateProvider clusterStateProvider = mock(ClusterStateProvider.class);
     when(solrCloudManager.getClusterStateProvider()).thenReturn(clusterStateProvider);
+    DistribStateManager distribStateManager = mock(DistribStateManager.class);
+    when(solrCloudManager.getDistribStateManager()).thenReturn(distribStateManager);
+    when(distribStateManager.getAutoScalingConfig()).thenReturn(new AutoScalingConfig(Collections.emptyMap()));
+
+    // first we don't set any cluster property and assert that legacy assignment is used
+    when(clusterStateProvider.getClusterProperties()).thenReturn(Collections.emptyMap());
+    // verify
+    boolean usePolicyFramework = Assign.usePolicyFramework(solrCloudManager);
+    assertFalse(usePolicyFramework);
+    // another sanity check
+    when(clusterStateProvider.getClusterProperties()).thenReturn(Utils.makeMap("defaults", Collections.emptyMap()));
+    // verify
+    usePolicyFramework = Assign.usePolicyFramework(solrCloudManager);
+    assertFalse(usePolicyFramework);
+
     // first we set useLegacyReplicaAssignment=false, so autoscaling should always be used
     when(clusterStateProvider.getClusterProperties()).thenReturn(Utils.makeMap("defaults", Utils.makeMap("cluster", Utils.makeMap("useLegacyReplicaAssignment", false))));
     // verify
-    boolean usePolicyFramework = Assign.usePolicyFramework(solrCloudManager);
+    usePolicyFramework = Assign.usePolicyFramework(solrCloudManager);
     assertTrue(usePolicyFramework);
 
     // now we set useLegacyReplicaAssignment=true, so autoscaling can only be used if an explicit policy or preference exists
     when(clusterStateProvider.getClusterProperties()).thenReturn(Utils.makeMap("defaults", Utils.makeMap("cluster", Utils.makeMap("useLegacyReplicaAssignment", true))));
-    DistribStateManager distribStateManager = mock(DistribStateManager.class);
-    when(solrCloudManager.getDistribStateManager()).thenReturn(distribStateManager);
     when(distribStateManager.getAutoScalingConfig()).thenReturn(new AutoScalingConfig(Collections.emptyMap()));
     assertFalse(Assign.usePolicyFramework(solrCloudManager));
 
diff --git a/solr/core/src/test/org/apache/solr/security/JWTAuthPluginIntegrationTest.java b/solr/core/src/test/org/apache/solr/security/JWTAuthPluginIntegrationTest.java
index 237ec0d..b11300b 100644
--- a/solr/core/src/test/org/apache/solr/security/JWTAuthPluginIntegrationTest.java
+++ b/solr/core/src/test/org/apache/solr/security/JWTAuthPluginIntegrationTest.java
@@ -78,6 +78,7 @@ public class JWTAuthPluginIntegrationTest extends SolrCloudAuthTestCase {
     configureCluster(NUM_SERVERS)// nodes
         .withSecurityJson(TEST_PATH().resolve("security").resolve("jwt_plugin_jwk_security.json"))
         .addConfig("conf1", TEST_PATH().resolve("configsets").resolve("cloud-minimal").resolve("conf"))
+        .withDefaultClusterProperty("useLegacyReplicaAssignment", "false")
         .configure();
     baseUrl = cluster.getRandomJetty(random()).getBaseUrl().toString();
 
diff --git a/solr/core/src/test/org/apache/solr/security/hadoop/TestSolrCloudWithHadoopAuthPlugin.java b/solr/core/src/test/org/apache/solr/security/hadoop/TestSolrCloudWithHadoopAuthPlugin.java
index ff643da..6f9a25d 100644
--- a/solr/core/src/test/org/apache/solr/security/hadoop/TestSolrCloudWithHadoopAuthPlugin.java
+++ b/solr/core/src/test/org/apache/solr/security/hadoop/TestSolrCloudWithHadoopAuthPlugin.java
@@ -48,6 +48,7 @@ public class TestSolrCloudWithHadoopAuthPlugin extends SolrCloudAuthTestCase {
     configureCluster(NUM_SERVERS)// nodes
         .withSecurityJson(TEST_PATH().resolve("security").resolve("hadoop_kerberos_config.json"))
         .addConfig("conf1", TEST_PATH().resolve("configsets").resolve("cloud-minimal").resolve("conf"))
+        .withDefaultClusterProperty("useLegacyReplicaAssignment", "false")
         .configure();
   }
 
diff --git a/solr/solrj/src/java/org/apache/solr/common/cloud/ClusterProperties.java b/solr/solrj/src/java/org/apache/solr/common/cloud/ClusterProperties.java
index 21d1298..96e5371 100644
--- a/solr/solrj/src/java/org/apache/solr/common/cloud/ClusterProperties.java
+++ b/solr/solrj/src/java/org/apache/solr/common/cloud/ClusterProperties.java
@@ -149,11 +149,11 @@ public class ClusterProperties {
    * This method sets a cluster property.
    *
    * @param propertyName  The property name to be set.
-   * @param propertyValue The value of the property.
+   * @param propertyValue The value of the property, could also be a nested structure.
    * @throws IOException if there is an error writing data to the cluster
    */
   @SuppressWarnings("unchecked")
-  public void setClusterProperty(String propertyName, String propertyValue) throws IOException {
+  public void setClusterProperty(String propertyName, Object propertyValue) throws IOException {
 
     validatePropertyName(propertyName);
 
diff --git a/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java b/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java
index 1a26451..7a3f7d2 100644
--- a/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java
+++ b/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java
@@ -50,6 +50,7 @@ import org.apache.solr.common.Callable;
 import org.apache.solr.common.SolrException;
 import org.apache.solr.common.SolrException.ErrorCode;
 import org.apache.solr.common.params.AutoScalingParams;
+import org.apache.solr.common.params.CollectionAdminParams;
 import org.apache.solr.common.params.CoreAdminParams;
 import org.apache.solr.common.util.ExecutorUtil;
 import org.apache.solr.common.util.ObjectReleaseTracker;
@@ -239,7 +240,8 @@ public class ZkStateReader implements Closeable {
       URL_SCHEME,
       AUTO_ADD_REPLICAS,
       CoreAdminParams.BACKUP_LOCATION,
-      MAX_CORES_PER_NODE)));
+      MAX_CORES_PER_NODE,
+      CollectionAdminParams.DEFAULTS)));
 
   /**
    * Returns config set name for collection.
diff --git a/solr/solrj/src/test/org/apache/solr/client/solrj/cloud/autoscaling/TestPolicy.java b/solr/solrj/src/test/org/apache/solr/client/solrj/cloud/autoscaling/TestPolicy.java
index fc64b11..c46eadf 100644
--- a/solr/solrj/src/test/org/apache/solr/client/solrj/cloud/autoscaling/TestPolicy.java
+++ b/solr/solrj/src/test/org/apache/solr/client/solrj/cloud/autoscaling/TestPolicy.java
@@ -58,6 +58,7 @@ import org.apache.solr.common.cloud.Replica;
 import org.apache.solr.common.cloud.ReplicaPosition;
 import org.apache.solr.common.cloud.ZkStateReader;
 import org.apache.solr.common.cloud.rule.ImplicitSnitch;
+import org.apache.solr.common.params.CollectionAdminParams;
 import org.apache.solr.common.params.CollectionParams;
 import org.apache.solr.common.params.CollectionParams.CollectionAction;
 import org.apache.solr.common.params.SolrParams;
@@ -2609,6 +2610,11 @@ public class TestPolicy extends SolrTestCaseJ4 {
           public ClusterState getClusterState() throws IOException {
             return ClusterState.load(0, clusterState, getLiveNodes(), ZkStateReader.getCollectionPath("c1"));
           }
+
+          @Override
+          public Map<String, Object> getClusterProperties() {
+            return Collections.singletonMap("defaults", Collections.singletonMap("cluster", Collections.singletonMap(CollectionAdminParams.USE_LEGACY_REPLICA_ASSIGNMENT, false)));
+          }
         };
       }
 
diff --git a/solr/test-framework/src/java/org/apache/solr/cloud/SolrCloudTestCase.java b/solr/test-framework/src/java/org/apache/solr/cloud/SolrCloudTestCase.java
index 6e2f780..acb09be 100644
--- a/solr/test-framework/src/java/org/apache/solr/cloud/SolrCloudTestCase.java
+++ b/solr/test-framework/src/java/org/apache/solr/cloud/SolrCloudTestCase.java
@@ -52,6 +52,7 @@ import org.apache.solr.common.cloud.Replica;
 import org.apache.solr.common.cloud.Slice;
 import org.apache.solr.common.cloud.SolrZkClient;
 import org.apache.solr.common.cloud.ZkStateReader;
+import org.apache.solr.common.params.CollectionAdminParams;
 import org.apache.solr.common.util.NamedList;
 import org.junit.AfterClass;
 import org.junit.Before;
@@ -104,7 +105,7 @@ public class SolrCloudTestCase extends SolrTestCaseJ4 {
     private Optional<String> securityJson = Optional.empty();
 
     private List<Config> configs = new ArrayList<>();
-    private Map<String, String> clusterProperties = new HashMap<>();
+    private Map<String, Object> clusterProperties = new HashMap<>();
 
     /**
      * Create a builder
@@ -211,13 +212,27 @@ public class SolrCloudTestCase extends SolrTestCaseJ4 {
 
       if (clusterProperties.size() > 0) {
         ClusterProperties props = new ClusterProperties(cluster.getSolrClient().getZkStateReader().getZkClient());
-        for (Map.Entry<String, String> entry : clusterProperties.entrySet()) {
+        for (Map.Entry<String, Object> entry : clusterProperties.entrySet()) {
           props.setClusterProperty(entry.getKey(), entry.getValue());
         }
       }
       return cluster;
     }
 
+    public Builder withDefaultClusterProperty(String key, String value) {
+      HashMap<String,Object> defaults = (HashMap<String, Object>) this.clusterProperties.get(CollectionAdminParams.DEFAULTS);
+      if (defaults == null) {
+        defaults = new HashMap<>();
+        this.clusterProperties.put(CollectionAdminParams.DEFAULTS, defaults);
+      }
+      HashMap<String,Object> cluster = (HashMap<String, Object>) defaults.get(CollectionAdminParams.CLUSTER);
+      if (cluster == null) {
+        cluster = new HashMap<>();
+        defaults.put(CollectionAdminParams.CLUSTER, cluster);
+      }
+      cluster.put(key, value);
+      return this;
+    }
   }
 
   /** The cluster */


[lucene-solr] 02/02: SOLR-13248: Adding upgrade notes which explain the problem and the mitigation as well as steps to revert to the old behavior

Posted by sh...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

shalin pushed a commit to branch branch_8x
in repository https://gitbox.apache.org/repos/asf/lucene-solr.git

commit d697b4a6ad3ac977a2890f57225293c17a44779f
Author: shalinmangar <sh...@apache.org>
AuthorDate: Mon Feb 18 13:56:24 2019 -0800

    SOLR-13248: Adding upgrade notes which explain the problem and the mitigation as well as steps to revert to the old behavior
    
    (cherry picked from commit 97875af3f93477f48e4ead1979b2f36797106e06)
---
 solr/CHANGES.txt | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/solr/CHANGES.txt b/solr/CHANGES.txt
index 3dfaa0c..9147ca1 100644
--- a/solr/CHANGES.txt
+++ b/solr/CHANGES.txt
@@ -152,6 +152,24 @@ Upgrade Notes
   are problematic.  In-place-updates are safe though.  If you want to delete certain child documents and if you know
   they don't themselves have nested children then you must do so with a delete-by-query technique.
 
+* SOLR-13248: The default replica placement strategy used in Solr has been reverted to the 'legacy' policy used by Solr
+  7.4 and previous versions. This is due to multiple bugs in the autoscaling based replica placement strategy that was
+  made default in Solr 7.5 which causes multiple replicas of the same shard to be placed on the same node in addition
+  to the maxShardsPerNode and createNodeSet parameters being ignored. Although the default has changed, autoscaling
+  will continue to be used if a cluster policy or preference is specified or a collection level policy is in use.
+  The default replica placement strategy can be changed to use autoscaling again by setting a cluster property:
+  curl -X POST -H 'Content-type:application/json' --data-binary '
+  {
+    "set-obj-property": {
+      "defaults" : {
+        "cluster": {
+          "useLegacyReplicaAssignment":false
+        }
+      }
+    }
+  }' http://$SOLR_HOST:$SOLR_PORT/api/cluster
+
+
 New Features
 ----------------------