You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@solr.apache.org by "colvinco (via GitHub)" <gi...@apache.org> on 2023/01/31 12:30:18 UTC

[GitHub] [solr] colvinco opened a new pull request, #1321: SOLR-13396 - Add property to disable deletion of unknown cores

colvinco opened a new pull request, #1321:
URL: https://github.com/apache/solr/pull/1321

   https://issues.apache.org/jira/browse/SOLR-13396
   
   <!--
   _(If you are a project committer then you may remove some/all of the following template.)_
   
   Before creating a pull request, please file an issue in the ASF Jira system for Solr:
   
   * https://issues.apache.org/jira/projects/SOLR
   
   For something minor (i.e. that wouldn't be worth putting in release notes), you can skip JIRA. 
   To create a Jira issue, you will need to create an account there first.
   
   The title of the PR should reference the Jira issue number in the form:
   
   * SOLR-####: <short description of problem or changes>
   
   SOLR must be fully capitalized. A short description helps people scanning pull requests for items they can work on.
   
   Properly referencing the issue in the title ensures that Jira is correctly updated with code review comments and commits. -->
   
   
   # Description
   
   This is a fix for SOLR-13396 to prevent the automatic deletion of cores that aren't referenced in the ZK Clusterstate.
   It's a Draft at the moment, as it still needs a test and some other changes, but I'm hoping we can get a general agreement on fixing this.
   
   # Solution
   
   I've added a system property and associated ENV flags to determine whether unreferenced cores should be deleted. This is a simple fix that should work for most people who have had an issue with SOLR-13396 (simply disabling the automatic deletion). A "better" fix may exist in the form of adding additional safety checks to try and detect misconfiguration, but this still feels like a safe option to have in addition to that in any case.
   
   Currently I've not changed the default behavior, so cores will still be deleted automatically OOTB. I've therefore named the property `preserveUnknownCores` and the env `SOLR_PRESERVE_UNKNOWN_CORES`.
   I would prefer to flip the default behavior to no longer automatically delete (and I know others would too), and instead have a `deleteUnknownCores` property. However I appreciate that that would be a change to the behavior introduced in SOLR-12066, so I'm not proposing that currently.
   
   # Tests
   
   TODO: Still need to add tests, but I don't know what the best approach is. I've found the test that was added for SOLR-12066 (https://github.com/apache/solr/blob/main/solr/core/src/test/org/apache/solr/cloud/DeleteInactiveReplicaTest.java) but I don't know if I should add a new case to cover this or create a separate test. Also I think it might be best to test this by connecting to a different zookeeper / wrong chroot as part of the test, but I don't know how best to do that with the test scaffolding in use.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [x ] I have created a Jira issue and added the issue ID to my pull request title.
   - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [x] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] colvinco merged pull request #1321: SOLR-13396 - Disable deletion of unknown cores by default

Posted by "colvinco (via GitHub)" <gi...@apache.org>.
colvinco merged PR #1321:
URL: https://github.com/apache/solr/pull/1321


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] janhoy commented on pull request #1321: SOLR-13396 - Add property to disable deletion of unknown cores

Posted by "janhoy (via GitHub)" <gi...@apache.org>.
janhoy commented on PR #1321:
URL: https://github.com/apache/solr/pull/1321#issuecomment-1493981843

   Yea, good question.
   
   If we decide to change the default to not delete (which I perhaps think we should), you'd for sure also add an entry to Major Changes in 9.3.
   
   The feature is also self documenting in solr.in.sh of course, but a short mention in taking-solr-to-production would not hurt.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] janhoy commented on a diff in pull request #1321: SOLR-13396 - Add property to disable deletion of unknown cores

Posted by "janhoy (via GitHub)" <gi...@apache.org>.
janhoy commented on code in PR #1321:
URL: https://github.com/apache/solr/pull/1321#discussion_r1093285558


##########
solr/bin/solr.in.sh:
##########
@@ -83,6 +83,12 @@
 # By default Solr will try to connect to Zookeeper with 30 seconds in timeout; override the timeout if needed
 #SOLR_WAIT_FOR_ZK="30"
 
+# By default Solr will delete cores that are not registered in Zookeeper at startup;
+# set to "true" to disable that behavior and preserve unknown cores so that you can
+# delete them manually. This may protect against misconfiguration (e.g. connecting
+# to the wrong Zookeeper instance or chroot).
+#SOLR_PRESERVE_UNKNOWN_CORES="true"

Review Comment:
   We normally use the default value as the commented-out value in solr.in.sh, so this would be `false`.



##########
solr/core/src/java/org/apache/solr/core/CoreContainer.java:
##########
@@ -1655,8 +1655,22 @@ private SolrCore createFromDescriptor(
     } catch (Exception e) {
       coreInitFailures.put(dcore.getName(), new CoreLoadFailure(dcore, e));
       if (e instanceof ZkController.NotInClusterStateException && !newCollection) {
-        // this mostly happen when the core is deleted when this node is down
-        unload(dcore.getName(), true, true, true);
+        // this mostly happens when the core is deleted when this node is down
+        // but it can also happen if connecting to the wrong zookeeper
+        final boolean preserveUnknownCores = Boolean.getBoolean("preserveUnknownCores");

Review Comment:
   Can we perhaps prefix all sysprops with `solr.` since many other libraries may also read system properties, and sooner or later there will be a name clash.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] colvinco commented on pull request #1321: SOLR-13396 - Add property to disable deletion of unknown cores

Posted by "colvinco (via GitHub)" <gi...@apache.org>.
colvinco commented on PR #1321:
URL: https://github.com/apache/solr/pull/1321#issuecomment-1609909574

   Based on what we've discussed previously and what @elyograg said on Slack, it seems like disabling it by default is reasonable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] colvinco commented on pull request #1321: SOLR-13396 - Add property to disable deletion of unknown cores

Posted by "colvinco (via GitHub)" <gi...@apache.org>.
colvinco commented on PR #1321:
URL: https://github.com/apache/solr/pull/1321#issuecomment-1493929189

   Where do you think this would fit best into the Reference Guide? I've not seen an obvious section to add it to, it doesn't feel big enough to get its own page, and I don't think there's a page that just lists out supported system properties / solr.in.sh options.
   
   It could maybe slot in somewhere around here https://solr.apache.org/guide/solr/9_2/deployment-guide/taking-solr-to-production.html#zookeeper-chroot), or in https://solr.apache.org/guide/solr/9_2/deployment-guide/solrcloud-shards-indexing.html / https://solr.apache.org/guide/solr/9_2/deployment-guide/solrcloud-recoveries-and-write-tolerance.html
   
   Is there somewhere better you have in mind?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] colvinco commented on pull request #1321: SOLR-13396 - Add property to disable deletion of unknown cores

Posted by "colvinco (via GitHub)" <gi...@apache.org>.
colvinco commented on PR #1321:
URL: https://github.com/apache/solr/pull/1321#issuecomment-1412401969

   > I'm ok with either _preserve_ or _delete_. If you rather want _delete_ then choose it, with a java-level default of "true".
   > 
   > Do we need to treat this behaviour as a backward compatibility promise? If Solr 9.2 starts preserving cores, and it is documented in the upgrade notes, no client ode needs to be changed anywhere. 
   
   Thanks for reviewing Jan :)
   Just to confirm what you're saying, you're okay with the idea of changing the existing behavior so that it no longer automatically deletes unknown cores (and documenting accordingly)? Just checking because when you said `If you rather want _delete_ then choose it, with a java-level default of "true".` that would keep the existing behavior of deleting cores by default, wouldn't it?
   
   > Only users who perhaps explicitly modifies cluster-state directly in ZK and have high core churn, i.e. rely on current behavior to reclaim disk space, would suffer. Or do I understand the domain right?
   
   My understanding is that the original motivation for the automatic deletion was to clean up after autoscaling operations or explicit deletes occurring when one or more nodes hosting replicas of the collection are offline SOLR-12066. Some of the comments on SOLR-13396 from 2019 suggest that manual cleanup on a large cluster would be painful. I don't know what the majority wants for this as a default. Not deleting is definitely the safest IMO.
   
   I see that some of the comments on the issue suggest some better end-to-end solutions which might put the administrator in the loop so that they can action/authorize the deletion, which sounds like a good long term solution.
   
   We're currently using Solr 8 for our product with several modifications that we maintain, my (selfish :)) motivation is to get onto Solr 9 without any modification to the distribution, in the not too distant future. Configuration options are good enough for that stepping stone.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] colvinco commented on pull request #1321: SOLR-13396 - Add property to disable deletion of unknown cores

Posted by "colvinco (via GitHub)" <gi...@apache.org>.
colvinco commented on PR #1321:
URL: https://github.com/apache/solr/pull/1321#issuecomment-1576980769

   Sorry it's taken a long time, I've finally got back to this. See what you think?
   Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] janhoy commented on pull request #1321: SOLR-13396 - Add property to disable deletion of unknown cores

Posted by "janhoy (via GitHub)" <gi...@apache.org>.
janhoy commented on PR #1321:
URL: https://github.com/apache/solr/pull/1321#issuecomment-1412477182

   My point is the property name could be deleteā€¦ with a default of true.
   
   Users would need to set it to false to override. In 10.0 the default could change, unless we found a better / smarter solution to the entire problem.
   
   Or we could default it to false if we are all ok with a change in behavior. The conservative approach is of course the safest.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org