Posted to user@spark.apache.org by zjzzjz <ji...@gmail.com> on 2019/03/10 01:14:29 UTC
How to know if a machine in a Spark cluster participates in a job
I want to know when it is safe to remove a machine from a
cluster.
My assumption is that it could be safe to remove a machine if the machine
does not have any containers, and it does not store any useful data.
Using the ResourceManager REST API documented at
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html,
we can issue
GET http://<rm http address:port>/ws/v1/cluster/nodes
to get information about each node, for example:
<node>
  <rack>/default-rack</rack>
  <state>RUNNING</state>
  <id>host1.domain.com:54158</id>
  <nodeHostName>host1.domain.com</nodeHostName>
  <nodeHTTPAddress>host1.domain.com:8042</nodeHTTPAddress>
  <lastHealthUpdate>1476995346399</lastHealthUpdate>
  <version>3.0.0-SNAPSHOT</version>
  <healthReport></healthReport>
  <numContainers>0</numContainers>
  <usedMemoryMB>0</usedMemoryMB>
  <availMemoryMB>8192</availMemoryMB>
  <usedVirtualCores>0</usedVirtualCores>
  <availableVirtualCores>8</availableVirtualCores>
  <resourceUtilization>
    <nodePhysicalMemoryMB>1027</nodePhysicalMemoryMB>
    <nodeVirtualMemoryMB>1027</nodeVirtualMemoryMB>
    <nodeCPUUsage>0.006664445623755455</nodeCPUUsage>
    <aggregatedContainersPhysicalMemoryMB>0</aggregatedContainersPhysicalMemoryMB>
    <aggregatedContainersVirtualMemoryMB>0</aggregatedContainersVirtualMemoryMB>
    <containersCPUUsage>0.0</containersCPUUsage>
  </resourceUtilization>
</node>
If numContainers is 0, I assume the node is not running any containers. However,
can it still hold data on disk that downstream tasks may need to read?
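The numContainers check itself can be sketched roughly as below. This is only a minimal sketch: it assumes the ResourceManager returns XML when the request carries an "Accept: application/xml" header and that each <node> element looks like the sample above. The function names fetch_nodes_xml and nodes_without_containers are made up for illustration. It says nothing about data left on disk, which is the open part of my question.

```python
import urllib.request
import xml.etree.ElementTree as ET


def fetch_nodes_xml(rm_address):
    """Fetch the cluster node list as XML from the ResourceManager.

    rm_address is the "<rm http address:port>" part of the GET request.
    Assumption: the endpoint honours an Accept: application/xml header.
    """
    request = urllib.request.Request(
        f"http://{rm_address}/ws/v1/cluster/nodes",
        headers={"Accept": "application/xml"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def nodes_without_containers(nodes_xml):
    """Return the nodeHostName of every <node> whose numContainers is 0."""
    root = ET.fromstring(nodes_xml)
    idle = []
    # iter("node") also matches when the root itself is a single <node>.
    for node in root.iter("node"):
        # A missing numContainers is treated as "unknown", not as idle.
        if int(node.findtext("numContainers", default="-1")) == 0:
            idle.append(node.findtext("nodeHostName"))
    return idle
```

For example, nodes_without_containers(fetch_nodes_xml("rm-host:8088")) would list candidate hosts, where "rm-host:8088" is a placeholder address. A host in that list has no running containers, but that alone does not show it holds no data a running job still needs.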
I could not find out whether Spark exposes this. I assume that if a machine still
stores data useful to the running job, it may maintain a heartbeat
with the Spark driver or some central controller. Can we check this by scanning
TCP or UDP connections?
Is there any other way to check whether a machine in a Spark cluster participates
in a job?