Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/02/25 20:24:41 UTC

[GitHub] [lucene-solr-operator] HoustonPutman opened a new pull request #226: Add terminationGracePeriod option, and use it when killing Solr.

HoustonPutman opened a new pull request #226:
URL: https://github.com/apache/lucene-solr-operator/pull/226


   This will use the `terminationGracePeriod`, minus a few seconds, and give that to Solr as the `SOLR_STOP_WAIT` value.
   
   That way Solr and Kubernetes are on the same page as to how much time to wait before forcefully stopping Solr.
   
   `terminationGracePeriod` used to be statically set to 10 seconds. Now it defaults to 30 seconds, but is configurable.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr-operator] HoustonPutman commented on a change in pull request #226: Add terminationGracePeriod option, and use it when killing Solr.

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #226:
URL: https://github.com/apache/lucene-solr-operator/pull/226#discussion_r585087532



##########
File path: docs/solr-cloud/solr-cloud-crd.md
##########
@@ -810,3 +812,22 @@ Also, changing the password for this user in the K8s secret will not update Solr
 
 If you enable basic auth for your SolrCloud cluster, then you need to point the Prometheus exporter at the basic auth secret; 
 refer to [Prometheus Exporter with Basic Auth](../solr-prometheus-exporter/README.md#prometheus-exporter-with-basic-auth) for more details.
+
+## Various Runtime Parameters
+
+There are various runtime parameters that allow you to customize the running of your Solr Cloud via the Solr Operator.
+
+### Time to wait for Solr to be killed gracefully
+
+The Solr Operator manages the Solr StatefulSet in a way that when a Solr pod needs to be stopped, or deleted, Kubernetes and Solr are on the same page for how long to wait for the process to die gracefully.
+
+The default time given is 60 seconds, before Solr or Kubernetes tries to forcefully stop the Solr process.
+You can override this default with the field:
+
+```yaml
+spec:
+  ...
+  customSolrKubeOptions:
+    podOptions:
+      terminationGracePeriodSeconds: 120

Review comment:
       This is a setting that users can set on pods; we merely forward this information to the pod. There is no k8s source of truth other than the value that each pod has. The default, set by the Solr Operator, is 60 seconds.
   
   Also, I think the default value (for the pod) differs depending on the Kubernetes distribution you use, but we shouldn't rely on a default that has nothing to do with Solr.
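
To illustrate the forwarding described in this comment, here is a minimal Go sketch of how an optional grace period might be resolved against the 60-second operator default. The `podOptions` struct and the `resolveGracePeriod` helper are hypothetical stand-ins for illustration only, not the operator's actual types or functions.

```go
package main

import "fmt"

// defaultTerminationGracePeriod mirrors the 60-second default mentioned in the comment above.
const defaultTerminationGracePeriod int64 = 60

// podOptions is a hypothetical stand-in for the podOptions section of the SolrCloud spec.
type podOptions struct {
	// TerminationGracePeriodSeconds is nil when the user did not set the field.
	TerminationGracePeriodSeconds *int64
}

// resolveGracePeriod returns the user-supplied value if present,
// otherwise the default. The result would then be forwarded to the pod.
func resolveGracePeriod(opts *podOptions) int64 {
	if opts != nil && opts.TerminationGracePeriodSeconds != nil {
		return *opts.TerminationGracePeriodSeconds
	}
	return defaultTerminationGracePeriod
}

func main() {
	custom := int64(120)
	fmt.Println(resolveGracePeriod(nil))
	fmt.Println(resolveGracePeriod(&podOptions{TerminationGracePeriodSeconds: &custom}))
}
```

Using a pointer field lets the sketch distinguish "not set" from an explicit value, so the default never depends on whatever the Kubernetes distribution would choose.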






[GitHub] [lucene-solr-operator] HoustonPutman commented on a change in pull request #226: Add terminationGracePeriod option, and use it when killing Solr.

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #226:
URL: https://github.com/apache/lucene-solr-operator/pull/226#discussion_r585699930



##########
File path: docs/solr-cloud/solr-cloud-crd.md
##########
@@ -810,3 +812,22 @@ Also, changing the password for this user in the K8s secret will not update Solr
 
 If you enable basic auth for your SolrCloud cluster, then you need to point the Prometheus exporter at the basic auth secret; 
 refer to [Prometheus Exporter with Basic Auth](../solr-prometheus-exporter/README.md#prometheus-exporter-with-basic-auth) for more details.
+
+## Various Runtime Parameters
+
+There are various runtime parameters that allow you to customize the running of your Solr Cloud via the Solr Operator.
+
+### Time to wait for Solr to be killed gracefully
+
+The Solr Operator manages the Solr StatefulSet in a way that when a Solr pod needs to be stopped, or deleted, Kubernetes and Solr are on the same page for how long to wait for the process to die gracefully.
+
+The default time given is 60 seconds, before Solr or Kubernetes tries to forcefully stop the Solr process.
+You can override this default with the field:
+
+```yaml
+spec:
+  ...
+  customSolrKubeOptions:
+    podOptions:
+      terminationGracePeriodSeconds: 120

Review comment:
       That command is kind of similar to kill -9. I'm not sure there is a way to protect against that. Is there a way to figure out what grace period is given when your pod is trying to be deleted?
   
   Most of the time, Pods are going to be deleted by Kubernetes (the statefulset controller) or the solrcloud controller, during managed upgrades. Neither of those uses a different grace period, so the usage here is safe.






[GitHub] [lucene-solr-operator] HoustonPutman commented on a change in pull request #226: Add terminationGracePeriod option, and use it when killing Solr.

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #226:
URL: https://github.com/apache/lucene-solr-operator/pull/226#discussion_r585699930



##########
File path: docs/solr-cloud/solr-cloud-crd.md
##########
@@ -810,3 +812,22 @@ Also, changing the password for this user in the K8s secret will not update Solr
 
 If you enable basic auth for your SolrCloud cluster, then you need to point the Prometheus exporter at the basic auth secret; 
 refer to [Prometheus Exporter with Basic Auth](../solr-prometheus-exporter/README.md#prometheus-exporter-with-basic-auth) for more details.
+
+## Various Runtime Parameters
+
+There are various runtime parameters that allow you to customize the running of your Solr Cloud via the Solr Operator.
+
+### Time to wait for Solr to be killed gracefully
+
+The Solr Operator manages the Solr StatefulSet in a way that when a Solr pod needs to be stopped, or deleted, Kubernetes and Solr are on the same page for how long to wait for the process to die gracefully.
+
+The default time given is 60 seconds, before Solr or Kubernetes tries to forcefully stop the Solr process.
+You can override this default with the field:
+
+```yaml
+spec:
+  ...
+  customSolrKubeOptions:
+    podOptions:
+      terminationGracePeriodSeconds: 120

Review comment:
       That command is kind of similar to kill -9. I'm not sure there is a way to protect against that. Is there a way to figure out what grace period is given when your pod is trying to be deleted?  `Pod.spec` is static, so the `Pod.spec.terminationGracePeriodSeconds`, provided when the pod is created, will never change. If a user runs `k delete --grace-period=5s`, it will stay at `60` and not change to `5`.
   
   Most of the time, Pods are going to be deleted by Kubernetes (the statefulset controller) or the solrcloud controller, during managed upgrades. Neither of those uses a different grace period, so the usage here is safe.






[GitHub] [lucene-solr-operator] madrob commented on a change in pull request #226: Add terminationGracePeriod option, and use it when killing Solr.

Posted by GitBox <gi...@apache.org>.
madrob commented on a change in pull request #226:
URL: https://github.com/apache/lucene-solr-operator/pull/226#discussion_r585074599



##########
File path: docs/solr-cloud/solr-cloud-crd.md
##########
@@ -810,3 +812,22 @@ Also, changing the password for this user in the K8s secret will not update Solr
 
 If you enable basic auth for your SolrCloud cluster, then you need to point the Prometheus exporter at the basic auth secret; 
 refer to [Prometheus Exporter with Basic Auth](../solr-prometheus-exporter/README.md#prometheus-exporter-with-basic-auth) for more details.
+
+## Various Runtime Parameters
+
+There are various runtime parameters that allow you to customize the running of your Solr Cloud via the Solr Operator.
+
+### Time to wait for Solr to be killed gracefully
+
+The Solr Operator manages the Solr StatefulSet in a way that when a Solr pod needs to be stopped, or deleted, Kubernetes and Solr are on the same page for how long to wait for the process to die gracefully.
+
+The default time given is 60 seconds, before Solr or Kubernetes tries to forcefully stop the Solr process.
+You can override this default with the field:
+
+```yaml
+spec:
+  ...
+  customSolrKubeOptions:
+    podOptions:
+      terminationGracePeriodSeconds: 120

Review comment:
       So this still means that administrators need to set this manually, rather than automatically picking up what is derived from the k8s source of truth?






[GitHub] [lucene-solr-operator] HoustonPutman commented on a change in pull request #226: Add terminationGracePeriod option, and use it when killing Solr.

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #226:
URL: https://github.com/apache/lucene-solr-operator/pull/226#discussion_r583248581



##########
File path: controllers/util/solr_util.go
##########
@@ -266,6 +270,9 @@ func GenerateStatefulSet(solrCloud *solr.SolrCloud, solrCloudStatus *solr.SolrCl
 	solrHostName := solrCloud.AdvertisedNodeHost("$(POD_HOSTNAME)")
 	solrAdressingPort := solrCloud.NodePort()
 
+	// Solr can take longer than SOLR_STOP_WAIT to run solr stop, give it a few extra seconds before forcefully killing the pod.
+	solrStopWait := terminationGracePeriod - 5

Review comment:
       Added the minimum as well as a check for a negative value, in case we remove the minimum in the future.
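
As a rough illustration of the two safeguards mentioned here, the following Go sketch applies a minimum grace period and then guards against a negative result. The 5-second buffer comes from the diff above; the 10-second minimum is only the value floated in review, and the function name and signature are assumptions rather than the merged code.

```go
package main

import "fmt"

// stopBufferSeconds is the head start Solr gets before the pod is
// forcefully killed (the "- 5" in the diff above).
const stopBufferSeconds int64 = 5

// solrStopWait derives a SOLR_STOP_WAIT value from the termination grace
// period. minGracePeriod is a hypothetical parameter: passing 0 simulates
// the case where the minimum is removed in the future.
func solrStopWait(gracePeriod, minGracePeriod int64) int64 {
	// Enforce the minimum grace period first.
	if gracePeriod < minGracePeriod {
		gracePeriod = minGracePeriod
	}
	wait := gracePeriod - stopBufferSeconds
	// Guard against a negative wait in case no minimum applies.
	if wait < 0 {
		wait = 0
	}
	return wait
}

func main() {
	for _, grace := range []int64{120, 60, 10, 3} {
		fmt.Printf("grace=%d min=10 -> SOLR_STOP_WAIT=%d\n", grace, solrStopWait(grace, 10))
	}
	fmt.Printf("grace=3 min=0 -> SOLR_STOP_WAIT=%d\n", solrStopWait(3, 0))
}
```

With a 10-second minimum the negative check never triggers, which is why it only matters if the minimum is dropped later.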






[GitHub] [lucene-solr-operator] HoustonPutman commented on a change in pull request #226: Add terminationGracePeriod option, and use it when killing Solr.

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #226:
URL: https://github.com/apache/lucene-solr-operator/pull/226#discussion_r583151756



##########
File path: controllers/util/solr_util.go
##########
@@ -266,6 +270,9 @@ func GenerateStatefulSet(solrCloud *solr.SolrCloud, solrCloudStatus *solr.SolrCl
 	solrHostName := solrCloud.AdvertisedNodeHost("$(POD_HOSTNAME)")
 	solrAdressingPort := solrCloud.NodePort()
 
+	// Solr can take longer than SOLR_STOP_WAIT to run solr stop, give it a few extra seconds before forcefully killing the pod.
+	solrStopWait := terminationGracePeriod - 5

Review comment:
       Maybe we set a minimum possible value for terminationGracePeriod? Like 10 seconds?






[GitHub] [lucene-solr-operator] madrob commented on pull request #226: Add terminationGracePeriod option, and use it when killing Solr.

Posted by GitBox <gi...@apache.org>.
madrob commented on pull request #226:
URL: https://github.com/apache/lucene-solr-operator/pull/226#issuecomment-789355147


   Please add a `since 0.3.0` somewhere in the docs for this.




[GitHub] [lucene-solr-operator] HoustonPutman merged pull request #226: Add terminationGracePeriod option, and use it when killing Solr.

Posted by GitBox <gi...@apache.org>.
HoustonPutman merged pull request #226:
URL: https://github.com/apache/lucene-solr-operator/pull/226


   




[GitHub] [lucene-solr-operator] madrob commented on a change in pull request #226: Add terminationGracePeriod option, and use it when killing Solr.

Posted by GitBox <gi...@apache.org>.
madrob commented on a change in pull request #226:
URL: https://github.com/apache/lucene-solr-operator/pull/226#discussion_r585116510



##########
File path: docs/solr-cloud/solr-cloud-crd.md
##########
@@ -810,3 +812,22 @@ Also, changing the password for this user in the K8s secret will not update Solr
 
 If you enable basic auth for your SolrCloud cluster, then you need to point the Prometheus exporter at the basic auth secret; 
 refer to [Prometheus Exporter with Basic Auth](../solr-prometheus-exporter/README.md#prometheus-exporter-with-basic-auth) for more details.
+
+## Various Runtime Parameters
+
+There are various runtime parameters that allow you to customize the running of your Solr Cloud via the Solr Operator.
+
+### Time to wait for Solr to be killed gracefully
+
+The Solr Operator manages the Solr StatefulSet in a way that when a Solr pod needs to be stopped, or deleted, Kubernetes and Solr are on the same page for how long to wait for the process to die gracefully.
+
+The default time given is 60 seconds, before Solr or Kubernetes tries to forcefully stop the Solr process.
+You can override this default with the field:
+
+```yaml
+spec:
+  ...
+  customSolrKubeOptions:
+    podOptions:
+      terminationGracePeriodSeconds: 120

Review comment:
       If a user issues a command `k delete --grace-period=5s`, then that will override our configured setting. We should figure out how to do our best to react to the new directives.






[GitHub] [lucene-solr-operator] madrob commented on a change in pull request #226: Add terminationGracePeriod option, and use it when killing Solr.

Posted by GitBox <gi...@apache.org>.
madrob commented on a change in pull request #226:
URL: https://github.com/apache/lucene-solr-operator/pull/226#discussion_r583150510



##########
File path: controllers/util/solr_util.go
##########
@@ -266,6 +270,9 @@ func GenerateStatefulSet(solrCloud *solr.SolrCloud, solrCloudStatus *solr.SolrCl
 	solrHostName := solrCloud.AdvertisedNodeHost("$(POD_HOSTNAME)")
 	solrAdressingPort := solrCloud.NodePort()
 
+	// Solr can take longer than SOLR_STOP_WAIT to run solr stop, give it a few extra seconds before forcefully killing the pod.
+	solrStopWait := terminationGracePeriod - 5

Review comment:
       Check for positive value here, somebody might have set grace period on their cluster to 5s or less.



