You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by GitBox <gi...@apache.org> on 2020/06/09 15:51:49 UTC

[GitHub] [storm] RuiLi8080 opened a new pull request #3283: [STORM-3650] Add topology config to attempt to schedule system component evenly throughout workers

RuiLi8080 opened a new pull request #3283:
URL: https://github.com/apache/storm/pull/3283


   ## What is the purpose of the change
   
   Add topology config for RAS scheduler to attempt to schedule system components/executors across all workers evenly instead of scheduling them all together after having finished all topology components.
   
   ## How was the change tested
   
   1. test with even distribution option:
   `storm jar /home/y/lib64/jars/storm-starter.jar org.apache.storm.starter.WordCountTopology wc1 -c topology..system.components.even.distribution=true -c topology.acker.executors=8 -c topology.component.resources.onheap.memory.mb=100 -c topology.acker.resources.onheap.memory.mb=150 -c topology.workers=4 -c experimental.topology.ras.order.executors.by.proximity.needs=true -c topology.worker.max.heap.size.mb=1000`
   
   ![image](https://user-images.githubusercontent.com/15622046/84088095-5941a000-a9b1-11ea-87e0-26d3b75fc455.png)
   
   2. test without even distribution option `storm jar /home/y/lib64/jars/storm-starter.jar org.apache.storm.starter.WordCountTopology wc-no-even-distribution -c topology.acker.executors=8 -c topology.component.resources.onheap.memory.mb=100 -c topology.acker.resources.onheap.memory.mb=150 -c topology.workers=4 -c experimental.topology.ras.order.executors.by.proximity.needs=true -c topology.worker.max.heap.size.mb=1000`
   ![image](https://user-images.githubusercontent.com/15622046/84088124-6bbbd980-a9b1-11ea-9839-b65b46329353.png)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [storm] bipinprasad commented on a change in pull request #3283: [STORM-3650] Add topology config to attempt to schedule system component evenly throughout workers

Posted by GitBox <gi...@apache.org>.
bipinprasad commented on a change in pull request #3283:
URL: https://github.com/apache/storm/pull/3283#discussion_r437712750



##########
File path: storm-server/src/main/java/org/apache/storm/scheduler/resource/strategies/scheduling/BaseResourceAwareStrategy.java
##########
@@ -539,30 +541,34 @@ protected String nodeToRack(RasNode node) {
      * @return a list of executors in sorted order
      */
     private List<ExecutorDetails> orderExecutorsDefault(

Review comment:
       It will be convenient if the even-distribution executor re-sorting was a single method that could be applied to executor list.
   This way the upcoming refactoring of the scheduling strategies would be easier.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [storm] bipinprasad commented on a change in pull request #3283: [STORM-3650] Add topology config to attempt to schedule system component evenly throughout workers

Posted by GitBox <gi...@apache.org>.
bipinprasad commented on a change in pull request #3283:
URL: https://github.com/apache/storm/pull/3283#discussion_r437718577



##########
File path: storm-server/src/main/java/org/apache/storm/scheduler/resource/strategies/scheduling/IStrategy.java
##########
@@ -40,5 +42,5 @@
      * @return returns a SchedulingResult object containing SchedulingStatus object to indicate whether scheduling is
      *     successful.
      */
-    SchedulingResult schedule(Cluster schedulingState, TopologyDetails td);
+    SchedulingResult schedule(Cluster schedulingState, TopologyDetails td) throws InvalidTopologyException;

Review comment:
       Why is InvalidTopologyException being thrown in schedule.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [storm] bipinprasad commented on a change in pull request #3283: [STORM-3650] Add topology config to attempt to schedule system component evenly throughout workers

Posted by GitBox <gi...@apache.org>.
bipinprasad commented on a change in pull request #3283:
URL: https://github.com/apache/storm/pull/3283#discussion_r437700141



##########
File path: storm-server/src/main/java/org/apache/storm/scheduler/TopologyDetails.java
##########
@@ -84,6 +89,8 @@ public TopologyDetails(String topologyId, Map<String, Object> topologyConf, Stor
         initConfigs();
         this.launchTime = launchTime;
         this.topoName = (String) topologyConf.get(Config.TOPOLOGY_NAME);
+        this.distributeSysCompEvenly

Review comment:
       This property is more useful in Scheduler code where is can be used to sort executors in a manner conducive to even distribution.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [storm] bipinprasad commented on a change in pull request #3283: [STORM-3650] Add topology config to attempt to schedule system component evenly throughout workers

Posted by GitBox <gi...@apache.org>.
bipinprasad commented on a change in pull request #3283:
URL: https://github.com/apache/storm/pull/3283#discussion_r437703384



##########
File path: storm-server/src/main/java/org/apache/storm/scheduler/TopologyDetails.java
##########
@@ -193,22 +200,44 @@ private void initResourceList() {
     }
 
     /**
-     * Returns a representation of the non-system components of the topology graph Each Component object in the returning map is populated
+     * Returns a representation of the non-system components of the topology graph. Each Component object in the returning map is populated
      * with the list of its parents, children and execs assigned to that component.
      *
      * @return a map of components
      */
-    public Map<String, Component> getComponents() {
+    public Map<String, Component> getTopoComponents() {

Review comment:
       Can this method retain the old name? And the getAllComponents be an overloaded getComponents()?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [storm] RuiLi8080 commented on pull request #3283: [STORM-3650] Add topology config to attempt to schedule system component evenly throughout workers

Posted by GitBox <gi...@apache.org>.
RuiLi8080 commented on pull request #3283:
URL: https://github.com/apache/storm/pull/3283#issuecomment-641361584


   Travis has a ExecutorTransferMultiThreadingTest failed. Re-run the test locally and it passed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [storm] bipinprasad commented on a change in pull request #3283: [STORM-3650] Add topology config to attempt to schedule system component evenly throughout workers

Posted by GitBox <gi...@apache.org>.
bipinprasad commented on a change in pull request #3283:
URL: https://github.com/apache/storm/pull/3283#discussion_r437684384



##########
File path: storm-server/src/main/java/org/apache/storm/daemon/nimbus/Nimbus.java
##########
@@ -1695,10 +1695,10 @@ public HeartbeatCache getHeartbeatsCache() {
 
     private Set<List<Integer>> getOrUpdateExecutors(String topoId, StormBase base, Map<String, Object> topoConf,
                                                     StormTopology topology)
-        throws IOException, AuthorizationException, InvalidTopologyException, KeyNotFoundException {
+        throws InvalidTopologyException {

Review comment:
       Can the Nimbus cleanup of parameter (topoId) and throws clause be part of a separate Jira?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [storm] bipinprasad commented on a change in pull request #3283: [STORM-3650] Add topology config to attempt to schedule system component evenly throughout workers

Posted by GitBox <gi...@apache.org>.
bipinprasad commented on a change in pull request #3283:
URL: https://github.com/apache/storm/pull/3283#discussion_r437702240



##########
File path: storm-server/src/main/java/org/apache/storm/scheduler/TopologyDetails.java
##########
@@ -193,22 +200,44 @@ private void initResourceList() {
     }
 
     /**
-     * Returns a representation of the non-system components of the topology graph Each Component object in the returning map is populated
+     * Returns a representation of the non-system components of the topology graph. Each Component object in the returning map is populated
      * with the list of its parents, children and execs assigned to that component.
      *
      * @return a map of components
      */
-    public Map<String, Component> getComponents() {
+    public Map<String, Component> getTopoComponents() {
+        if (topologyComponentsMap == null) {

Review comment:
       TopologyDetails gets instantiated a large number of times. It was my observation that adding members or such cacheing in TopologyDetails was counterproductive. topologyComponentsMap and its sister cache allComponentsMap should not be added unless we actually gain from cacheing.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [storm] RuiLi8080 closed pull request #3283: [STORM-3650] Add topology config to attempt to schedule system component evenly throughout workers

Posted by GitBox <gi...@apache.org>.
RuiLi8080 closed pull request #3283:
URL: https://github.com/apache/storm/pull/3283


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org