You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/07/06 19:15:16 UTC

[GitHub] [spark] agrawaldevesh opened a new pull request #29015: [WIP] Expose a (protected) /workers/kill endpoint on the MasterWebUI

agrawaldevesh opened a new pull request #29015:
URL: https://github.com/apache/spark/pull/29015


   ### What changes were proposed in this pull request?
   
   This PR allows an external agent to inform the Master that certain nodes
   (or host-ports) are being decommissioned.
   
   ### Why are the changes needed?
   
   The current decommissioning is triggered by the Worker getting getting a SIGPWR
   (out of band possibly by some cleanup hook), which then informs the Master
   about it. This approach may not be feasible in some environments that cannot
   trigger a clean up hook on the Worker. In addition, when a large number of
   worker nodes are being decommissioned then the master will get a flood of
   messages.
   
   So we add a new post endpoint `/workers/kill` on the MasterWebUI that allows an
   external agent to inform the master about all the nodes being decommissioned in
   bulk. The workers are identified by either their `host:port` or just the host
   -- in which case all workers on the host would be decommissioned.
   
   This API is merely a new entry point into the existing decommissioning
   logic. It does not change how the decommissioning request is handled in
   its core.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, a new endpoint `/workers/kill` is added to the MasterWebUI. By default this endpoint is disabled.
   
   ### How was this patch tested?
   
   Added unit tests
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r453843780



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/Master.scala
##########
@@ -863,7 +872,36 @@ private[deploy] class Master(
     true
   }
 
-  private def decommissionWorker(worker: WorkerInfo): Unit = {
+  private def handleDecommissionWorkers(hostPorts: Seq[String]): Integer = {
+    val hostsWithoutPorts = new HashSet[String]
+    val addressToRemove = new HashSet[RpcAddress]
+    hostPorts.foreach(hp => {

Review comment:
       The challenge here is that Its not a "clean filter" due to the presence of hosts without ports. The main use case is actually to decommission a given set of hosts (because the external system may not have visibility into the worker ports). Perhaps I can split this into a union of sorts ... let me see.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659139663






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659845800






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r454492097



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
##########
@@ -49,6 +56,23 @@ class MasterWebUI(
       "/app/kill", "/", masterPage.handleAppKillRequest, httpMethods = Set("POST")))
     attachHandler(createRedirectHandler(
       "/driver/kill", "/", masterPage.handleDriverKillRequest, httpMethods = Set("POST")))
+    attachHandler(createServletHandler("/workers/kill", new HttpServlet {
+      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
+        val hostPorts: Seq[String] = Option(req.getParameterValues("host"))
+          .getOrElse(Array[String]()).map(_.toLowerCase(Locale.ROOT)).toSeq
+        logInfo(s"Received request to decommission workers $hostPorts from ${req.getRemoteAddr}")
+        if (!isDecommissioningRequestAllowed(req)) {
+          resp.sendError(HttpServletResponse.SC_METHOD_NOT_ALLOWED)
+        } else {
+          val removedWorkers = masterEndpointRef.askSync[Integer](DecommissionHostPorts(hostPorts))

Review comment:
       Yeah: We would wait for each one to be processed iteratively by the master's message handling thread. Having said that, the decommissioning does not block for actually sending/acking the messages to the executors. Its merely updating some (potentially persistent) state in the Master so shouldn't be that slow. 
   
   Having said that, would this be a problem ? I am assuming that the JettyHandler that the MasterWebUI is built atop can indeed handle multiple requests in flight, where some of them are blocking.
   
   The use case for making this handler synchronous is so that the external agent doing the decommissioning of the hosts can know whether the cleanup succeeded or not. While this information is scrapeable from the MasterPage (that returns the status of the Workers), it would require some brittle scraping on the external end point. So I figured it would be better for this call to return the number of workers it was actually able to decommission.
   
   I am happy to switch this logic to Async if you see any red flags.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [WIP] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-654418776


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [WIP] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-654419342


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659162409


   **[Test build #125935 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125935/testReport)** for PR 29015 at commit [`3ee87f3`](https://github.com/apache/spark/commit/3ee87f376fe499df8aa0000710863f5bf6d9648f).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r454483988



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/Master.scala
##########
@@ -863,7 +872,29 @@ private[deploy] class Master(
     true
   }
 
-  private def decommissionWorker(worker: WorkerInfo): Unit = {
+  private def decommissionHostPorts(hostPorts: Seq[String]): Integer = {
+    val hostPortsParsed = hostPorts.map(Utils.parseHostPort)
+    val (hostPortsWithoutPorts, hostPortsWithPorts) = hostPortsParsed.partition(_._2 == 0)

Review comment:
       I would like to please let it remain it as it is since I find two line statements confusing to read. Its also just 90 columns and within the 100 limit. There is also precedence for one line partition statements elsewhere.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r455997070



##########
File path: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala
##########
@@ -108,6 +108,9 @@ private[deploy] object DeployMessages {
 
   case class Heartbeat(workerId: String, worker: RpcEndpointRef) extends DeployMessage
 
+  // Out of band commands to Master
+  case class DecommissionHosts(hostnames: Seq[String])

Review comment:
       Makes sense !. I like DecommissionWorkersByHost. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r454479729



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
##########
@@ -49,6 +56,23 @@ class MasterWebUI(
       "/app/kill", "/", masterPage.handleAppKillRequest, httpMethods = Set("POST")))
     attachHandler(createRedirectHandler(
       "/driver/kill", "/", masterPage.handleDriverKillRequest, httpMethods = Set("POST")))
+    attachHandler(createServletHandler("/workers/kill", new HttpServlet {
+      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
+        val hostPorts: Seq[String] = Option(req.getParameterValues("host"))
+          .getOrElse(Array[String]()).map(_.toLowerCase(Locale.ROOT)).toSeq
+        logInfo(s"Received request to decommission workers $hostPorts from ${req.getRemoteAddr}")
+        if (!isDecommissioningRequestAllowed(req)) {
+          resp.sendError(HttpServletResponse.SC_METHOD_NOT_ALLOWED)
+        } else {
+          val removedWorkers = masterEndpointRef.askSync[Integer](DecommissionHostPorts(hostPorts))
+          if (removedWorkers <= 0) {

Review comment:
       Never :-P ... I will handle that as an error separately.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659202075






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jiangxb1987 commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
jiangxb1987 commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r455252197



##########
File path: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala
##########
@@ -108,6 +108,9 @@ private[deploy] object DeployMessages {
 
   case class Heartbeat(workerId: String, worker: RpcEndpointRef) extends DeployMessage
 
+  // Out of band commands to Master
+  case class DecommissionHostPorts(hostPorts: Seq[String])

Review comment:
       It would be confusing to say `DecommissionWorkers` but passes in sequence of hostPorts...




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659706029






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659094325






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-658918376






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r455392886



##########
File path: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala
##########
@@ -108,6 +108,9 @@ private[deploy] object DeployMessages {
 
   case class Heartbeat(workerId: String, worker: RpcEndpointRef) extends DeployMessage
 
+  // Out of band commands to Master
+  case class DecommissionHostPorts(hostPorts: Seq[String])

Review comment:
       I am narrowing the scope to simply decommission a bunch of hostnames, and all workers within a host will go away. This is the only production use case I have in mind and there is no need to design for the flexibility of wanting to decommission an individual worker on the node.
   
   As such, I have renamed the API to `DecommissionHosts` and it takes a list of host names.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659705182


   **[Test build #125998 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125998/testReport)** for PR 29015 at commit [`c6a6a90`](https://github.com/apache/spark/commit/c6a6a90893279dd6deb02af007ba4ec5553a8cd4).
    * This patch **fails PySpark unit tests**.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `  case class DecommissionWorkersOnHosts(hostnames: Seq[String])`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659042917






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #29015:
URL: https://github.com/apache/spark/pull/29015


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r455246869



##########
File path: core/src/main/scala/org/apache/spark/internal/config/UI.scala
##########
@@ -191,4 +191,14 @@ private[spark] object UI {
     .version("3.0.0")
     .stringConf
     .createOptional
+
+  val MASTER_UI_DECOMMISSION_ALLOW_MODE = ConfigBuilder("spark.master.ui.decommission.allow.mode")
+    .doc("Specifies the behavior of the Master Web UI's /workers/kill endpoint. Possible choices" +
+      " are: `local` means allow this endpoint from IP's that are local to the machine running" +
+      " the Master, `deny` means to completely disable this endpoint, `allow` means to allow" +
+      " calling this endpoint from any IP.")
+    .internal()
+    .version("3.1.0")
+    .stringConf
+    .createWithDefault("deny")

Review comment:
       shall we use `local` as default? looks safe enough.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659093134


   **[Test build #125922 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125922/testReport)** for PR 29015 at commit [`9ea178b`](https://github.com/apache/spark/commit/9ea178bb5091651f590017cb2b86225ec6f648c0).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659065467






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659621891






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r454493980



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
##########
@@ -49,6 +56,23 @@ class MasterWebUI(
       "/app/kill", "/", masterPage.handleAppKillRequest, httpMethods = Set("POST")))
     attachHandler(createRedirectHandler(
       "/driver/kill", "/", masterPage.handleDriverKillRequest, httpMethods = Set("POST")))
+    attachHandler(createServletHandler("/workers/kill", new HttpServlet {
+      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
+        val hostPorts: Seq[String] = Option(req.getParameterValues("host"))
+          .getOrElse(Array[String]()).map(_.toLowerCase(Locale.ROOT)).toSeq
+        logInfo(s"Received request to decommission workers $hostPorts from ${req.getRemoteAddr}")
+        if (!isDecommissioningRequestAllowed(req)) {
+          resp.sendError(HttpServletResponse.SC_METHOD_NOT_ALLOWED)
+        } else {
+          val removedWorkers = masterEndpointRef.askSync[Integer](DecommissionHostPorts(hostPorts))

Review comment:
       Actually screw it. On thinking more about this, it seems like I can make this be async just fine and thereby avoid any such pitfalls. I am going to do that.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659706029


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r455244251



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/Master.scala
##########
@@ -863,6 +872,35 @@ private[deploy] class Master(
     true
   }
 
+  private def decommissionHostPorts(hostPorts: Seq[String]): Integer = {

Review comment:
       ditto `decommissionWorkers`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659156968


   Retest this please.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659203698


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125935/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jiangxb1987 commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
jiangxb1987 commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r455267326



##########
File path: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala
##########
@@ -108,6 +108,9 @@ private[deploy] object DeployMessages {
 
   case class Heartbeat(workerId: String, worker: RpcEndpointRef) extends DeployMessage
 
+  // Out of band commands to Master
+  case class DecommissionHostPorts(hostPorts: Seq[String])

Review comment:
       let's change to `DecommissionWorkers` then, since `WorkerStateResponse` also passes in `host` and `port`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-654419342


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659139670


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125922/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659027181


   **[Test build #125911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125911/testReport)** for PR 29015 at commit [`d8e241f`](https://github.com/apache/spark/commit/d8e241fc492a6a626d6cd00ef1f666fa62ffd178).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-658918376






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659773878


   **[Test build #126012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126012/testReport)** for PR 29015 at commit [`31b231e`](https://github.com/apache/spark/commit/31b231e1b0a984ebdfc408beedaadeec6881ddff).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659103651






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r455262363



##########
File path: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala
##########
@@ -108,6 +108,9 @@ private[deploy] object DeployMessages {
 
   case class Heartbeat(workerId: String, worker: RpcEndpointRef) extends DeployMessage
 
+  // Out of band commands to Master
+  case class DecommissionHostPorts(hostPorts: Seq[String])

Review comment:
       but `DecommissionHostPorts` is more confusing as you don't even know what it does unless you look at the comment.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [WIP] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-654418776


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659094325






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659202085


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125927/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659093682


   **[Test build #125911 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125911/testReport)** for PR 29015 at commit [`d8e241f`](https://github.com/apache/spark/commit/d8e241fc492a6a626d6cd00ef1f666fa62ffd178).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `  case class DecommissionHostPorts(hostPorts: Seq[String])`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659621152


   **[Test build #125998 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125998/testReport)** for PR 29015 at commit [`c6a6a90`](https://github.com/apache/spark/commit/c6a6a90893279dd6deb02af007ba4ec5553a8cd4).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659139325


   **[Test build #125922 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125922/testReport)** for PR 29015 at commit [`9ea178b`](https://github.com/apache/spark/commit/9ea178bb5091651f590017cb2b86225ec6f648c0).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659119735


   **[Test build #125927 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125927/testReport)** for PR 29015 at commit [`08a4c9b`](https://github.com/apache/spark/commit/08a4c9bf04f5818c0563637a63f0a5793a112723).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659093134


   **[Test build #125922 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125922/testReport)** for PR 29015 at commit [`9ea178b`](https://github.com/apache/spark/commit/9ea178bb5091651f590017cb2b86225ec6f648c0).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659845800






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Ngone51 commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
Ngone51 commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r453521642



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
##########
@@ -49,6 +56,23 @@ class MasterWebUI(
       "/app/kill", "/", masterPage.handleAppKillRequest, httpMethods = Set("POST")))
     attachHandler(createRedirectHandler(
       "/driver/kill", "/", masterPage.handleDriverKillRequest, httpMethods = Set("POST")))
+    attachHandler(createServletHandler("/workers/kill", new HttpServlet {
+      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
+        val hostPorts: Seq[String] = Option(req.getParameterValues("host"))
+          .getOrElse(Array[String]()).map(_.toLowerCase(Locale.ENGLISH)).toSeq
+        logInfo(s"Received request to decommission hostPorts $hostPorts from ${req.getRemoteAddr}")

Review comment:
       nit: `decommission hostPorts` -> `decommission workers`?

##########
File path: core/src/main/scala/org/apache/spark/deploy/master/Master.scala
##########
@@ -863,7 +872,36 @@ private[deploy] class Master(
     true
   }
 
-  private def decommissionWorker(worker: WorkerInfo): Unit = {
+  private def handleDecommissionWorkers(hostPorts: Seq[String]): Integer = {
+    val hostsWithoutPorts = new HashSet[String]
+    val addressToRemove = new HashSet[RpcAddress]
+    hostPorts.foreach(hp => {

Review comment:
       we can first filter the valid workers and then apply `decommissionWorker` for each worker, e.g.
   
   `hostPorsts.filter(...).foreach(...)`

##########
File path: core/src/main/scala/org/apache/spark/deploy/master/Master.scala
##########
@@ -863,7 +872,36 @@ private[deploy] class Master(
     true
   }
 
-  private def decommissionWorker(worker: WorkerInfo): Unit = {
+  private def handleDecommissionWorkers(hostPorts: Seq[String]): Integer = {
+    val hostsWithoutPorts = new HashSet[String]
+    val addressToRemove = new HashSet[RpcAddress]
+    hostPorts.foreach(hp => {
+      val (host, port) = Utils.parseHostPort(hp)
+      if (port == 0) {
+        hostsWithoutPorts.add(host)
+      } else {
+        val rpcAddress = new RpcAddress(host, port)
+        if (addressToWorker.contains(rpcAddress)) {
+          addressToRemove.add(rpcAddress)
+        }
+      }
+    })
+
+    if (hostsWithoutPorts.nonEmpty) {
+      addressToRemove ++= addressToWorker.keys.filter(addr => hostsWithoutPorts.contains(addr.host))
+    }

Review comment:
       It seems that comparing the `host` only should be enough?

##########
File path: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
##########
@@ -49,6 +56,23 @@ class MasterWebUI(
       "/app/kill", "/", masterPage.handleAppKillRequest, httpMethods = Set("POST")))
     attachHandler(createRedirectHandler(
       "/driver/kill", "/", masterPage.handleDriverKillRequest, httpMethods = Set("POST")))
+    attachHandler(createServletHandler("/workers/kill", new HttpServlet {
+      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
+        val hostPorts: Seq[String] = Option(req.getParameterValues("host"))
+          .getOrElse(Array[String]()).map(_.toLowerCase(Locale.ENGLISH)).toSeq

Review comment:
       Why use `Locale.ENGLISH`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659027181


   **[Test build #125911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125911/testReport)** for PR 29015 at commit [`d8e241f`](https://github.com/apache/spark/commit/d8e241fc492a6a626d6cd00ef1f666fa62ffd178).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r454493980



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
##########
@@ -49,6 +56,23 @@ class MasterWebUI(
       "/app/kill", "/", masterPage.handleAppKillRequest, httpMethods = Set("POST")))
     attachHandler(createRedirectHandler(
       "/driver/kill", "/", masterPage.handleDriverKillRequest, httpMethods = Set("POST")))
+    attachHandler(createServletHandler("/workers/kill", new HttpServlet {
+      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
+        val hostPorts: Seq[String] = Option(req.getParameterValues("host"))
+          .getOrElse(Array[String]()).map(_.toLowerCase(Locale.ROOT)).toSeq
+        logInfo(s"Received request to decommission workers $hostPorts from ${req.getRemoteAddr}")
+        if (!isDecommissioningRequestAllowed(req)) {
+          resp.sendError(HttpServletResponse.SC_METHOD_NOT_ALLOWED)
+        } else {
+          val removedWorkers = masterEndpointRef.askSync[Integer](DecommissionHostPorts(hostPorts))

Review comment:
       Actually screw it. On thinking more about this, it seems like I can make this be async just fine and thereby avoid any such pitfalls. I am going to do that.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659103651






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659879374


   thanks, merging to master!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659706043


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125998/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659162409


   **[Test build #125935 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125935/testReport)** for PR 29015 at commit [`3ee87f3`](https://github.com/apache/spark/commit/3ee87f376fe499df8aa0000710863f5bf6d9648f).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659201315


   **[Test build #125927 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125927/testReport)** for PR 29015 at commit [`08a4c9b`](https://github.com/apache/spark/commit/08a4c9bf04f5818c0563637a63f0a5793a112723).
    * This patch **fails PySpark pip packaging tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-656474929


   @holdenk, @jiangxb1987 @cloud-fan @Ngone51 -- This PR is ready for your review please. Thanks !


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jiangxb1987 commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
jiangxb1987 commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r454137251



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/Master.scala
##########
@@ -863,7 +872,29 @@ private[deploy] class Master(
     true
   }
 
-  private def decommissionWorker(worker: WorkerInfo): Unit = {
+  private def decommissionHostPorts(hostPorts: Seq[String]): Integer = {
+    val hostPortsParsed = hostPorts.map(Utils.parseHostPort)
+    val (hostPortsWithoutPorts, hostPortsWithPorts) = hostPortsParsed.partition(_._2 == 0)

Review comment:
       nit: val (hostPortsWithoutPorts, hostPortsWithPorts) = hostPorts.map(Utils.parseHostPort).partition(_._2 == 0)

##########
File path: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
##########
@@ -49,6 +56,23 @@ class MasterWebUI(
       "/app/kill", "/", masterPage.handleAppKillRequest, httpMethods = Set("POST")))
     attachHandler(createRedirectHandler(
       "/driver/kill", "/", masterPage.handleDriverKillRequest, httpMethods = Set("POST")))
+    attachHandler(createServletHandler("/workers/kill", new HttpServlet {
+      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
+        val hostPorts: Seq[String] = Option(req.getParameterValues("host"))
+          .getOrElse(Array[String]()).map(_.toLowerCase(Locale.ROOT)).toSeq
+        logInfo(s"Received request to decommission workers $hostPorts from ${req.getRemoteAddr}")
+        if (!isDecommissioningRequestAllowed(req)) {
+          resp.sendError(HttpServletResponse.SC_METHOD_NOT_ALLOWED)
+        } else {
+          val removedWorkers = masterEndpointRef.askSync[Integer](DecommissionHostPorts(hostPorts))
+          if (removedWorkers <= 0) {

Review comment:
       when would the `removedWorkers < 0` happen?

##########
File path: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
##########
@@ -49,6 +56,23 @@ class MasterWebUI(
       "/app/kill", "/", masterPage.handleAppKillRequest, httpMethods = Set("POST")))
     attachHandler(createRedirectHandler(
       "/driver/kill", "/", masterPage.handleDriverKillRequest, httpMethods = Set("POST")))
+    attachHandler(createServletHandler("/workers/kill", new HttpServlet {
+      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
+        val hostPorts: Seq[String] = Option(req.getParameterValues("host"))
+          .getOrElse(Array[String]()).map(_.toLowerCase(Locale.ROOT)).toSeq
+        logInfo(s"Received request to decommission workers $hostPorts from ${req.getRemoteAddr}")
+        if (!isDecommissioningRequestAllowed(req)) {
+          resp.sendError(HttpServletResponse.SC_METHOD_NOT_ALLOWED)
+        } else {
+          val removedWorkers = masterEndpointRef.askSync[Integer](DecommissionHostPorts(hostPorts))

Review comment:
       will multiple requests block each other, since we use `askSync` here?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r455247627



##########
File path: core/src/main/scala/org/apache/spark/internal/config/UI.scala
##########
@@ -191,4 +191,14 @@ private[spark] object UI {
     .version("3.0.0")
     .stringConf
     .createOptional
+
+  val MASTER_UI_DECOMMISSION_ALLOW_MODE = ConfigBuilder("spark.master.ui.decommission.allow.mode")
+    .doc("Specifies the behavior of the Master Web UI's /workers/kill endpoint. Possible choices" +
+      " are: `local` means allow this endpoint from IP's that are local to the machine running" +
+      " the Master, `deny` means to completely disable this endpoint, `allow` means to allow" +
+      " calling this endpoint from any IP.")
+    .internal()
+    .version("3.1.0")
+    .stringConf

Review comment:
       it's common to always upper case the config value, as it should be case insensitive. e.g.
   ```
   ...
     .stringConf
     .transform(_.toUpperCase(Locale.ROOT))
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659139663


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r455688453



##########
File path: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala
##########
@@ -108,6 +108,9 @@ private[deploy] object DeployMessages {
 
   case class Heartbeat(workerId: String, worker: RpcEndpointRef) extends DeployMessage
 
+  // Out of band commands to Master
+  case class DecommissionHosts(hostnames: Seq[String])

Review comment:
       If we want to be accurate, this should be `DecommissionWorkersByHost`.
   
   Looking at other classes in this file, they are all about master/worker, e.g. `RegisterWorker`, `WorkerDecommission`, etc. That's why I suggest `DecommissionWorkers`, and add comments to explain the workers are specified by hosts.
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r454545680



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
##########
@@ -49,6 +56,23 @@ class MasterWebUI(
       "/app/kill", "/", masterPage.handleAppKillRequest, httpMethods = Set("POST")))
     attachHandler(createRedirectHandler(
       "/driver/kill", "/", masterPage.handleDriverKillRequest, httpMethods = Set("POST")))
+    attachHandler(createServletHandler("/workers/kill", new HttpServlet {
+      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
+        val hostPorts: Seq[String] = Option(req.getParameterValues("host"))
+          .getOrElse(Array[String]()).map(_.toLowerCase(Locale.ROOT)).toSeq
+        logInfo(s"Received request to decommission workers $hostPorts from ${req.getRemoteAddr}")
+        if (!isDecommissioningRequestAllowed(req)) {
+          resp.sendError(HttpServletResponse.SC_METHOD_NOT_ALLOWED)
+        } else {
+          val removedWorkers = masterEndpointRef.askSync[Integer](DecommissionHostPorts(hostPorts))

Review comment:
       I changed it such that the actual decommissioning is done async in the master: So now the DecommissionHostPorts call should be very quick and it is okay to be synchronous. ie, DecommissionHostPorts will simply enqueue multiple WorkerDecommission to decommission a worker at a time.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659203688






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jiangxb1987 commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
jiangxb1987 commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r455517786



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
##########
@@ -49,6 +55,26 @@ class MasterWebUI(
       "/app/kill", "/", masterPage.handleAppKillRequest, httpMethods = Set("POST")))
     attachHandler(createRedirectHandler(
       "/driver/kill", "/", masterPage.handleDriverKillRequest, httpMethods = Set("POST")))
+    attachHandler(createServletHandler("/workers/kill", new HttpServlet {
+      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
+        val hostnames: Seq[String] = Option(req.getParameterValues("host"))
+          .getOrElse(Array[String]()).toSeq
+        if (!isDecommissioningRequestAllowed(req)) {
+          resp.sendError(HttpServletResponse.SC_METHOD_NOT_ALLOWED)
+        } else {
+          val removedWorkers = masterEndpointRef.askSync[Integer](DecommissionHosts(hostnames))
+          logInfo(s"Decommissioning of hosts $hostnames decommissioned ${removedWorkers} workers")

Review comment:
       nit: `${removedWorkers}` -> `$removedWorkers`

##########
File path: core/src/test/scala/org/apache/spark/deploy/master/MasterSuite.scala
##########
@@ -726,6 +726,61 @@ class MasterSuite extends SparkFunSuite
     }
   }
 
+  def testWorkerDecommissioning(
+      numWorkers: Int,
+      numWorkersExpectedToDecom: Int,
+      hostnames: Seq[String]): Unit = {
+    val conf = new SparkConf()
+    val master = makeAliveMaster(conf)
+    val workerRegs = (1 to numWorkers).map{idx =>
+      val worker = new MockWorker(master.self, conf)
+      worker.rpcEnv.setupEndpoint("worker", worker)
+      val workerReg = RegisterWorker(
+        worker.id,
+        "localhost",
+        worker.self.address.port,
+        worker.self,
+        10,
+        1024,
+        "http://localhost:8080",
+        RpcAddress("localhost", 10000))
+      master.self.send(workerReg)
+      workerReg
+    }
+
+    eventually(timeout(10.seconds)) {
+      val masterState = master.self.askSync[MasterStateResponse](RequestMasterState)
+      assert(masterState.workers.length === numWorkers)
+      assert(masterState.workers.forall(_.state == WorkerState.ALIVE))
+      assert(masterState.workers.map(_.id).toSet == workerRegs.map(_.id).toSet)
+      masterState.workers

Review comment:
       nit: this is not needed

##########
File path: core/src/test/scala/org/apache/spark/deploy/master/MasterSuite.scala
##########
@@ -726,6 +726,61 @@ class MasterSuite extends SparkFunSuite
     }
   }
 
+  def testWorkerDecommissioning(
+      numWorkers: Int,
+      numWorkersExpectedToDecom: Int,
+      hostnames: Seq[String]): Unit = {
+    val conf = new SparkConf()
+    val master = makeAliveMaster(conf)
+    val workerRegs = (1 to numWorkers).map{idx =>
+      val worker = new MockWorker(master.self, conf)
+      worker.rpcEnv.setupEndpoint("worker", worker)
+      val workerReg = RegisterWorker(
+        worker.id,
+        "localhost",
+        worker.self.address.port,
+        worker.self,
+        10,
+        1024,
+        "http://localhost:8080",
+        RpcAddress("localhost", 10000))
+      master.self.send(workerReg)
+      workerReg
+    }
+
+    eventually(timeout(10.seconds)) {
+      val masterState = master.self.askSync[MasterStateResponse](RequestMasterState)
+      assert(masterState.workers.length === numWorkers)
+      assert(masterState.workers.forall(_.state == WorkerState.ALIVE))
+      assert(masterState.workers.map(_.id).toSet == workerRegs.map(_.id).toSet)
+      masterState.workers
+    }
+
+    val decomWorkersCount = master.self.askSync[Integer](DecommissionHosts(hostnames))
+    assert(decomWorkersCount === numWorkersExpectedToDecom)
+
+    // Decommissioning is actually async ... wait for the workers to actually be decommissioned by
+    // polling the master's state.
+    eventually(timeout(10.seconds)) {

Review comment:
       nit: we may want to give a longer timeout to avoid flakyness.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r455240018



##########
File path: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala
##########
@@ -108,6 +108,9 @@ private[deploy] object DeployMessages {
 
   case class Heartbeat(workerId: String, worker: RpcEndpointRef) extends DeployMessage
 
+  // Out of band commands to Master
+  case class DecommissionHostPorts(hostPorts: Seq[String])

Review comment:
       nit: maybe `DecommissionWorkers`? In the comment, we can say that the worker is identified by host and port(optional)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659202075


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r454483988



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/Master.scala
##########
@@ -863,7 +872,29 @@ private[deploy] class Master(
     true
   }
 
-  private def decommissionWorker(worker: WorkerInfo): Unit = {
+  private def decommissionHostPorts(hostPorts: Seq[String]): Integer = {
+    val hostPortsParsed = hostPorts.map(Utils.parseHostPort)
+    val (hostPortsWithoutPorts, hostPortsWithPorts) = hostPortsParsed.partition(_._2 == 0)

Review comment:
       I would like to remain it as it is since I find two line statements confusing to read. Its also just 90 columns and within the 100 limit. There is also precedence for one line partition statements elsewhere.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-658915498


   ok to test


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659774343






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659774343






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659203688


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659844348


   **[Test build #126012 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126012/testReport)** for PR 29015 at commit [`31b231e`](https://github.com/apache/spark/commit/31b231e1b0a984ebdfc408beedaadeec6881ddff).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `  case class DecommissionWorkersOnHosts(hostnames: Seq[String])`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659155613


   jenkins retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] agrawaldevesh commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
agrawaldevesh commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r453845348



##########
File path: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
##########
@@ -49,6 +56,23 @@ class MasterWebUI(
       "/app/kill", "/", masterPage.handleAppKillRequest, httpMethods = Set("POST")))
     attachHandler(createRedirectHandler(
       "/driver/kill", "/", masterPage.handleDriverKillRequest, httpMethods = Set("POST")))
+    attachHandler(createServletHandler("/workers/kill", new HttpServlet {
+      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
+        val hostPorts: Seq[String] = Option(req.getParameterValues("host"))
+          .getOrElse(Array[String]()).map(_.toLowerCase(Locale.ENGLISH)).toSeq

Review comment:
       Bad habbit :-) 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659042917






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659119735


   **[Test build #125927 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125927/testReport)** for PR 29015 at commit [`08a4c9b`](https://github.com/apache/spark/commit/08a4c9bf04f5818c0563637a63f0a5793a112723).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659621891






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659065467






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #29015:
URL: https://github.com/apache/spark/pull/29015#discussion_r455248954



##########
File path: core/src/test/scala/org/apache/spark/deploy/master/ui/MasterWebUISuite.scala
##########
@@ -21,22 +21,24 @@ import java.io.DataOutputStream
 import java.net.{HttpURLConnection, URL}
 import java.nio.charset.StandardCharsets
 import java.util.Date
+import javax.servlet.http.HttpServletResponse
 
 import scala.collection.mutable.HashMap
 
 import org.mockito.Mockito.{mock, times, verify, when}
 import org.scalatest.BeforeAndAfterAll
 
 import org.apache.spark.{SecurityManager, SparkConf, SparkFunSuite}
-import org.apache.spark.deploy.DeployMessages.{KillDriverResponse, RequestKillDriver}
+import org.apache.spark.deploy.DeployMessages.{DecommissionHostPorts, KillDriverResponse, RequestKillDriver}
 import org.apache.spark.deploy.DeployTestUtils._
 import org.apache.spark.deploy.master._
+import org.apache.spark.internal.config.UI
 import org.apache.spark.rpc.{RpcEndpointRef, RpcEnv}
 
 
 class MasterWebUISuite extends SparkFunSuite with BeforeAndAfterAll {
 
-  val conf = new SparkConf
+  val conf = new SparkConf().set(UI.MASTER_UI_DECOMMISSION_ALLOW_MODE, "local")

Review comment:
       we don't need to change it if "local" is the default




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659773878


   **[Test build #126012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126012/testReport)** for PR 29015 at commit [`31b231e`](https://github.com/apache/spark/commit/31b231e1b0a984ebdfc408beedaadeec6881ddff).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659203148


   **[Test build #125935 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125935/testReport)** for PR 29015 at commit [`3ee87f3`](https://github.com/apache/spark/commit/3ee87f376fe499df8aa0000710863f5bf6d9648f).
    * This patch **fails due to an unknown error code, -9**.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `  case class DecommissionHosts(hostnames: Seq[String])`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29015:
URL: https://github.com/apache/spark/pull/29015#issuecomment-659621152


   **[Test build #125998 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125998/testReport)** for PR 29015 at commit [`c6a6a90`](https://github.com/apache/spark/commit/c6a6a90893279dd6deb02af007ba4ec5553a8cd4).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org