You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by kishorvpatil <gi...@git.apache.org> on 2018/04/11 00:49:41 UTC

[GitHub] storm pull request #2631: [STORM-3025] Optimize Cluster methods with Caching...

GitHub user kishorvpatil opened a pull request:

    https://github.com/apache/storm/pull/2631

    [STORM-3025] Optimize Cluster methods with Caching to avoid loopy loops 

    Optimizing the `Cluster` methods that are called from with high frequency calls to speed-up scheduling time on new topologies on large clusters.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kishorvpatil/incubator-storm storm3025

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/2631.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2631
    
----
commit 46486d87fc8a26262d75abc05c1398afceee687c
Author: Kishor Patil <kp...@...>
Date:   2018-04-10T21:42:03Z

    Remove loopy loops in scheduler cluster state

commit caf7f885d6c10eaed440b02ec345e863af700762
Author: Kishor Patil <kp...@...>
Date:   2018-04-10T22:10:10Z

    Clean up caching vars from Cluster

commit b57654c8a90705029ba3555ccd75d7056051c31c
Author: Kishor Patil <kp...@...>
Date:   2018-04-10T22:42:54Z

    Cache supervisor to Used Ports in Cluster state

----


---

[GitHub] storm pull request #2631: [STORM-3025] Optimize Cluster methods with Caching...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/2631


---

[GitHub] storm pull request #2631: [STORM-3025] Optimize Cluster methods with Caching...

Posted by agresch <gi...@git.apache.org>.
Github user agresch commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2631#discussion_r180794169
  
    --- Diff: storm-server/src/main/java/org/apache/storm/scheduler/resource/normalization/NormalizedResourceRequest.java ---
    @@ -187,4 +187,29 @@ public double getTotalCpu() {
         public NormalizedResources getNormalizedResources() {
             return this.normalizedResources;
         }
    +
    +    public void removeOffHeap(final double offHeap) {
    +        this.offHeap += offHeap;
    +    }
    +
    +    public void remove(WorkerResources value) {
    --- End diff --
    
    were these used?


---

[GitHub] storm pull request #2631: [STORM-3025] Optimize Cluster methods with Caching...

Posted by kishorvpatil <gi...@git.apache.org>.
Github user kishorvpatil commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2631#discussion_r180800090
  
    --- Diff: storm-server/src/main/java/org/apache/storm/scheduler/resource/normalization/NormalizedResourceRequest.java ---
    @@ -187,4 +187,29 @@ public double getTotalCpu() {
         public NormalizedResources getNormalizedResources() {
             return this.normalizedResources;
         }
    +
    +    public void removeOffHeap(final double offHeap) {
    --- End diff --
    
    This is unused method. I was trying some thing else. Let me delete this method.


---

[GitHub] storm pull request #2631: [STORM-3025] Optimize Cluster methods with Caching...

Posted by agresch <gi...@git.apache.org>.
Github user agresch commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2631#discussion_r180767148
  
    --- Diff: storm-server/src/main/java/org/apache/storm/scheduler/Cluster.java ---
    @@ -301,17 +304,7 @@ public boolean needsSchedulingRas(TopologyDetails topology) {
     
         @Override
         public Set<Integer> getUsedPorts(SupervisorDetails supervisor) {
    -        Set<Integer> usedPorts = new HashSet<>();
    -
    -        for (SchedulerAssignment assignment : assignments.values()) {
    -            for (WorkerSlot slot : assignment.getExecutorToSlot().values()) {
    -                if (slot.getNodeId().equals(supervisor.getId())) {
    -                    usedPorts.add(slot.getPort());
    -                }
    -            }
    -        }
    -
    -        return usedPorts;
    +        return nodeToUsedPortsCache.computeIfAbsent(supervisor.getId(), (x) -> new HashSet<>()).stream().map(WorkerSlot::getPort).collect(Collectors.toSet());
    --- End diff --
    
    It looks like updateScheduledResourcesCache() - called from assign() - actually adds the worker slots to nodeToUsedPortsCache.  This may be fine, but might be nice to add a Java doc note about this since this is a public method.  



---

[GitHub] storm pull request #2631: [STORM-3025] Optimize Cluster methods with Caching...

Posted by agresch <gi...@git.apache.org>.
Github user agresch commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2631#discussion_r180755909
  
    --- Diff: storm-server/src/main/java/org/apache/storm/scheduler/Cluster.java ---
    @@ -79,8 +81,8 @@
         private Set<String> blackListedHosts = new HashSet<>();
         private INimbus inimbus;
         private final Topologies topologies;
    -    private final Map<String, Double> scheduledCpuCache = new HashMap<>();
    -    private final Map<String, Double> scheduledMemoryCache = new HashMap<>();
    +    private final Map<String, Map<WorkerSlot, NormalizedResourceRequest>> nodeToScheduledResourcesCache;
    +    private final HashMap<String, Set<WorkerSlot>> nodeToUsedPortsCache;
    --- End diff --
    
    use Map instead of HashMap?


---

[GitHub] storm issue #2631: [STORM-3025] Optimize Cluster methods with Caching to avo...

Posted by kishorvpatil <gi...@git.apache.org>.
Github user kishorvpatil commented on the issue:

    https://github.com/apache/storm/pull/2631
  
    @agresch I have addressed the code review comments.


---

[GitHub] storm pull request #2631: [STORM-3025] Optimize Cluster methods with Caching...

Posted by agresch <gi...@git.apache.org>.
Github user agresch commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2631#discussion_r180792829
  
    --- Diff: storm-server/src/main/java/org/apache/storm/scheduler/resource/normalization/NormalizedResourceRequest.java ---
    @@ -187,4 +187,29 @@ public double getTotalCpu() {
         public NormalizedResources getNormalizedResources() {
             return this.normalizedResources;
         }
    +
    +    public void removeOffHeap(final double offHeap) {
    --- End diff --
    
    where is this used?  Why increment?


---

[GitHub] storm pull request #2631: [STORM-3025] Optimize Cluster methods with Caching...

Posted by kishorvpatil <gi...@git.apache.org>.
Github user kishorvpatil commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2631#discussion_r180811674
  
    --- Diff: storm-server/src/main/java/org/apache/storm/scheduler/resource/normalization/NormalizedResourceRequest.java ---
    @@ -187,4 +187,29 @@ public double getTotalCpu() {
         public NormalizedResources getNormalizedResources() {
             return this.normalizedResources;
         }
    +
    +    public void removeOffHeap(final double offHeap) {
    +        this.offHeap += offHeap;
    +    }
    +
    +    public void remove(WorkerResources value) {
    --- End diff --
    
    Not used



---