You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@samza.apache.org by GitBox <gi...@apache.org> on 2020/02/28 17:39:03 UTC

[GitHub] [samza] Sanil15 opened a new pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Sanil15 opened a new pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297
 
 
   **Design**: [SEP-22: Container Placements in Samza](https://cwiki.apache.org/confluence/display/SAMZA/SEP-22:+Container+Placements+in+Samza)
   
   **Changes:**: Today if the config "cluster-manager.container.fail.job.after.retries" is set to false that means after exhausting all the retry on the containers, Job Coordinator does not fail the job but keeps running the job without the failed container. In such scenarios, this container can be started back on the host it was last running on or any other host using container placements APIs 
   
   **API Changes:** None
   
   **Upgrade Instructions:** None
   
   **Usage Instructions:** Instantiate ContainerPlacementMetadataStore to write Container placement messages to Metastore, query it by the UUID generated

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r389048888
 
 

 ##########
 File path: samza-core/src/test/java/org/apache/samza/clustermanager/TestContainerPlacementActions.java
 ##########
 @@ -528,6 +528,124 @@ public Void answer(InvocationOnMock invocation) {
     assertFalse(containerPlacementMetadataStore.readContainerPlacementRequestMessage(requestMessage.getUuid()).isPresent());
   }
 
+
+  @Test(timeout = 20000)
 
 Review comment:
   This test is littered with thread.sleep and timeouts -- a typical recipe for flaky jarvis/test, and increasing build times.
   Would it be possible to reduce/consolidate them?
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r389032381
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/ContainerManager.java
 ##########
 @@ -429,6 +429,25 @@ private void writeContainerPlacementResponseMessage(ContainerPlacementRequestMes
             responseMessage, System.currentTimeMillis()));
   }
 
+  /**
+   * Gets the hostname on which container is either currently running or was last seen on if it is not running
+   */
+  private String getSourceHostForContainer(ContainerPlacementRequestMessage requestMessage) {
+    String sourceHost = null;
+    String processorId = requestMessage.getProcessorId();
+    if (samzaApplicationState.runningProcessors.containsKey(processorId)) {
+      SamzaResource currentResource = samzaApplicationState.runningProcessors.get(processorId);
+      LOG.info("Processor ID: {} matched a running container with containerId ID: {} is running on host: {} for ContainerPlacement action: {}",
+          processorId, currentResource.getContainerId(), currentResource.getHost(), requestMessage);
+      sourceHost = currentResource.getHost();
+    } else {
+      sourceHost = samzaApplicationState.jobModelManager.jobModel().getContainerToHostValue(processorId, SetContainerHostMapping.HOST_KEY);
+      LOG.info("Processor ID: {} is not running and was last seen on host: {} for ContainerPlacement action: {}",
+          processorId, sourceHost, requestMessage);
 
 Review comment:
   maybe link jira for that here?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r389880976
 
 

 ##########
 File path: samza-core/src/test/java/org/apache/samza/clustermanager/TestContainerPlacementActions.java
 ##########
 @@ -528,6 +528,124 @@ public Void answer(InvocationOnMock invocation) {
     assertFalse(containerPlacementMetadataStore.readContainerPlacementRequestMessage(requestMessage.getUuid()).isPresent());
   }
 
+
+  @Test(timeout = 20000)
 
 Review comment:
   These are integ tests and it takes 35 seconds to run all of them serially, we can also make them run in parallel to further reduce build times

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r389877478
 
 

 ##########
 File path: samza-core/src/test/java/org/apache/samza/clustermanager/MockClusterResourceManager.java
 ##########
 @@ -102,12 +102,17 @@ public void launchStreamProcessor(SamzaResource resource, CommandBuilder builder
 
   @Override
   public void stopStreamProcessor(SamzaResource resource) {
-    SamzaResourceStatus status = new SamzaResourceStatus(resource.getContainerId(), "diagnostics", SamzaResourceStatus.PREEMPTED);
+    stopStreamProcessor(resource, SamzaResourceStatus.PREEMPTED);
+  }
+
+  void stopStreamProcessor(SamzaResource resource, int exitCode) {
 
 Review comment:
   this class is a Mock itself but sure Ill add it

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r388601777
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/ContainerManager.java
 ##########
 @@ -429,6 +429,25 @@ private void writeContainerPlacementResponseMessage(ContainerPlacementRequestMes
             responseMessage, System.currentTimeMillis()));
   }
 
+  /**
+   * Gets the hostname on which container is either currently running or was last seen on if it is not running
+   */
+  private String getSourceHostForContainer(ContainerPlacementRequestMessage requestMessage) {
+    String sourceHost = null;
+    String processorId = requestMessage.getProcessorId();
+    if (samzaApplicationState.runningProcessors.containsKey(processorId)) {
+      SamzaResource currentResource = samzaApplicationState.runningProcessors.get(processorId);
+      LOG.info("Processor ID: {} matched a running container with containerId ID: {} is running on host: {} for ContainerPlacement action: {}",
+          processorId, currentResource.getContainerId(), currentResource.getHost(), requestMessage);
+      sourceHost = currentResource.getHost();
+    } else {
+      sourceHost = samzaApplicationState.jobModelManager.jobModel().getContainerToHostValue(processorId, SetContainerHostMapping.HOST_KEY);
+      LOG.info("Processor ID: {} is not running and was last seen on host: {} for ContainerPlacement action: {}",
+          processorId, sourceHost, requestMessage);
 
 Review comment:
   Yes it is possible, one last part left in CPM refactor is moving onResourcesCompleted logic to container manager, let's do this as a part of that logic

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r388599758
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/ContainerManager.java
 ##########
 @@ -443,6 +462,11 @@ private boolean deQueueAction(ContainerPlacementRequestMessage requestMessage) {
     if (checkIfActiveOrStandbyContainerHasActivePlacementAction(requestMessage)) {
       return false;
     }
+
+    if (samzaApplicationState.failedProcessors.containsKey(requestMessage.getProcessorId())) {
+      return true;
 
 Review comment:
   I can add one but the de-queuing is already logged at the caller

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] Sanil15 removed a comment on issue #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
Sanil15 removed a comment on issue #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#issuecomment-592644119
 
 
   Note: this is stacked PR on (#1281 ), please review changes from this commit: https://github.com/apache/samza/pull/1297/commits/9d8df53c2315b73cb21d099eb95f2bd309d6334d

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r389876973
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/SamzaApplicationState.java
 ##########
 @@ -110,7 +110,17 @@
    */
   public final ConcurrentMap<String, SamzaResource> runningProcessors = new ConcurrentHashMap<>(0);
 
-   /**
+  /**
+   *  Map of the failed Samza processor ID to resource status of the last attempted of the container.
+   *  This map is only used when {@link org.apache.samza.config.ClusterManagerConfig#CLUSTER_MANAGER_CONTAINER_FAIL_JOB_AFTER_RETRIES}
+   *  is set to false, this map tracks the containers which have exhausted all retires for restart and JobCoordinator is
+   *  no longer attempting to restart this container
+   *
+   *  Modified by both the AMRMCallbackThread and the ContainerAllocator thread
+   */
+  public final ConcurrentHashMap<String, SamzaResourceStatus> failedProcessors = new ConcurrentHashMap<>(0);
 
 Review comment:
   thats YarnAppState not SamzaApplicationState

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r389876229
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/ContainerManager.java
 ##########
 @@ -429,6 +429,25 @@ private void writeContainerPlacementResponseMessage(ContainerPlacementRequestMes
             responseMessage, System.currentTimeMillis()));
   }
 
+  /**
+   * Gets the hostname on which container is either currently running or was last seen on if it is not running
+   */
+  private String getSourceHostForContainer(ContainerPlacementRequestMessage requestMessage) {
+    String sourceHost = null;
+    String processorId = requestMessage.getProcessorId();
+    if (samzaApplicationState.runningProcessors.containsKey(processorId)) {
+      SamzaResource currentResource = samzaApplicationState.runningProcessors.get(processorId);
+      LOG.info("Processor ID: {} matched a running container with containerId ID: {} is running on host: {} for ContainerPlacement action: {}",
+          processorId, currentResource.getContainerId(), currentResource.getHost(), requestMessage);
+      sourceHost = currentResource.getHost();
+    } else {
+      sourceHost = samzaApplicationState.jobModelManager.jobModel().getContainerToHostValue(processorId, SetContainerHostMapping.HOST_KEY);
+      LOG.info("Processor ID: {} is not running and was last seen on host: {} for ContainerPlacement action: {}",
+          processorId, sourceHost, requestMessage);
 
 Review comment:
   sure

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] mynameborat merged pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
mynameborat merged pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r387923398
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/ContainerManager.java
 ##########
 @@ -429,6 +429,25 @@ private void writeContainerPlacementResponseMessage(ContainerPlacementRequestMes
             responseMessage, System.currentTimeMillis()));
   }
 
+  /**
+   * Gets the hostname on which container is either currently running or was last seen on if it is not running
+   */
+  private String getSourceHostForContainer(ContainerPlacementRequestMessage requestMessage) {
+    String sourceHost = null;
+    String processorId = requestMessage.getProcessorId();
+    if (samzaApplicationState.runningProcessors.containsKey(processorId)) {
+      SamzaResource currentResource = samzaApplicationState.runningProcessors.get(processorId);
+      LOG.info("Processor ID: {} matched a running container with containerId ID: {} is running on host: {} for ContainerPlacement action: {}",
+          processorId, currentResource.getContainerId(), currentResource.getHost(), requestMessage);
+      sourceHost = currentResource.getHost();
+    } else {
+      sourceHost = samzaApplicationState.jobModelManager.jobModel().getContainerToHostValue(processorId, SetContainerHostMapping.HOST_KEY);
+      LOG.info("Processor ID: {} is not running and was last seen on host: {} for ContainerPlacement action: {}",
+          processorId, sourceHost, requestMessage);
 
 Review comment:
   CPM does a similar thing at 
   ```
   String lastSeenOn = state.jobModelManager.jobModel().getContainerToHostValue(processorId, SetContainerHostMapping.HOST_KEY);
   ```
   Is it possible to consolidate these multiple job-model-manager accesses?
   
   maybe call it just getHostForContainer?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r388595627
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/ContainerManager.java
 ##########
 @@ -464,7 +488,8 @@ private boolean deQueueAction(ContainerPlacementRequestMessage requestMessage) {
     Boolean invalidAction = false;
     String errorMessage = null;
     if (!samzaApplicationState.runningProcessors.containsKey(requestMessage.getProcessorId()) &&
-        !samzaApplicationState.pendingProcessors.containsKey(requestMessage.getProcessorId())
+        !samzaApplicationState.pendingProcessors.containsKey(requestMessage.getProcessorId()) &&
+        !samzaApplicationState.failedProcessors.containsKey(requestMessage.getProcessorId())
 
 Review comment:
   sure

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] Sanil15 commented on issue #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
Sanil15 commented on issue #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#issuecomment-592644119
 
 
   Note: this is stacked PR on (#1281 ), please review changes from this commit: https://github.com/apache/samza/pull/1297/commits/9d8df53c2315b73cb21d099eb95f2bd309d6334d

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r389040864
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/SamzaApplicationState.java
 ##########
 @@ -110,7 +110,17 @@
    */
   public final ConcurrentMap<String, SamzaResource> runningProcessors = new ConcurrentHashMap<>(0);
 
-   /**
+  /**
+   *  Map of the failed Samza processor ID to resource status of the last attempted of the container.
+   *  This map is only used when {@link org.apache.samza.config.ClusterManagerConfig#CLUSTER_MANAGER_CONTAINER_FAIL_JOB_AFTER_RETRIES}
+   *  is set to false, this map tracks the containers which have exhausted all retires for restart and JobCoordinator is
+   *  no longer attempting to restart this container
+   *
+   *  Modified by both the AMRMCallbackThread and the ContainerAllocator thread
+   */
+  public final ConcurrentHashMap<String, SamzaResourceStatus> failedProcessors = new ConcurrentHashMap<>(0);
 
 Review comment:
   As long as we use one of the two. 
   Its also used by YarnClusterResourceManager for updating "completed" containers, 
   which are containers that exit after finishing gracefully.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r388595567
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/SamzaApplicationState.java
 ##########
 @@ -110,7 +110,17 @@
    */
   public final ConcurrentMap<String, SamzaResource> runningProcessors = new ConcurrentHashMap<>(0);
 
-   /**
+  /**
+   *  Map of the failed Samza processor ID to resource status of the last attempted of the container.
+   *  This map is only used when {@link org.apache.samza.config.ClusterManagerConfig#CLUSTER_MANAGER_CONTAINER_FAIL_JOB_AFTER_RETRIES}
+   *  is set to false, this map tracks the containers which have exhausted all retires for restart and JobCoordinator is
+   *  no longer attempting to restart this container
+   *
+   *  Modified by both the AMRMCallbackThread and the ContainerAllocator thread
+   */
+  public final ConcurrentHashMap<String, SamzaResourceStatus> failedProcessors = new ConcurrentHashMap<>(0);
 
 Review comment:
   good catch, it was there but was only populated by onResourceCompletedWithUnknownStatus and not used anywhere else, we do not need it I will remove that one

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r387925995
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/SamzaApplicationState.java
 ##########
 @@ -110,7 +110,17 @@
    */
   public final ConcurrentMap<String, SamzaResource> runningProcessors = new ConcurrentHashMap<>(0);
 
-   /**
+  /**
+   *  Map of the failed Samza processor ID to resource status of the last attempted of the container.
+   *  This map is only used when {@link org.apache.samza.config.ClusterManagerConfig#CLUSTER_MANAGER_CONTAINER_FAIL_JOB_AFTER_RETRIES}
+   *  is set to false, this map tracks the containers which have exhausted all retires for restart and JobCoordinator is
+   *  no longer attempting to restart this container
+   *
+   *  Modified by both the AMRMCallbackThread and the ContainerAllocator thread
+   */
+  public final ConcurrentHashMap<String, SamzaResourceStatus> failedProcessors = new ConcurrentHashMap<>(0);
 
 Review comment:
   This class already has a map called 
   ```
   /**
      * ContainerStatuses of failed containers.	   * ContainerStatuses of failed containers.
      */	 
     public final ConcurrentMap<String, SamzaResourceStatus> failedContainersStatus = new ConcurrentHashMap<>();	 
   ```
   
   Could we please reuse that?
   
   Also the variable type can just be a Map<,> doesnt need to be ConcurrentHashMap

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r388596882
 
 

 ##########
 File path: samza-core/src/test/java/org/apache/samza/clustermanager/MockClusterResourceManager.java
 ##########
 @@ -102,12 +102,17 @@ public void launchStreamProcessor(SamzaResource resource, CommandBuilder builder
 
   @Override
   public void stopStreamProcessor(SamzaResource resource) {
-    SamzaResourceStatus status = new SamzaResourceStatus(resource.getContainerId(), "diagnostics", SamzaResourceStatus.PREEMPTED);
+    stopStreamProcessor(resource, SamzaResourceStatus.PREEMPTED);
+  }
+
+  void stopStreamProcessor(SamzaResource resource, int exitCode) {
 
 Review comment:
   yes 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r389032919
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/ContainerManager.java
 ##########
 @@ -486,9 +512,12 @@ private boolean deQueueAction(ContainerPlacementRequestMessage requestMessage) {
     String errorMessagePrefix = ContainerPlacementMessage.StatusCode.BAD_REQUEST + " reason: %s";
     Boolean invalidAction = false;
     String errorMessage = null;
-    if (!samzaApplicationState.runningProcessors.containsKey(requestMessage.getProcessorId()) &&
-        !samzaApplicationState.pendingProcessors.containsKey(requestMessage.getProcessorId())
-    ) {
+
+    boolean isRunning = samzaApplicationState.runningProcessors.containsKey(requestMessage.getProcessorId());
+    boolean isPending = samzaApplicationState.pendingProcessors.containsKey(requestMessage.getProcessorId());
+    boolean isFailed = samzaApplicationState.failedProcessors.containsKey(requestMessage.getProcessorId());
+
+    if (!isRunning && !isPending && !isFailed) {
       errorMessage = String.format(errorMessagePrefix, "invalid processor id neither in running or pending processors");
 
 Review comment:
   "neither in running, pending or failed state"

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r387917740
 
 

 ##########
 File path: samza-core/src/test/java/org/apache/samza/clustermanager/MockClusterResourceManager.java
 ##########
 @@ -102,12 +102,17 @@ public void launchStreamProcessor(SamzaResource resource, CommandBuilder builder
 
   @Override
   public void stopStreamProcessor(SamzaResource resource) {
-    SamzaResourceStatus status = new SamzaResourceStatus(resource.getContainerId(), "diagnostics", SamzaResourceStatus.PREEMPTED);
+    stopStreamProcessor(resource, SamzaResourceStatus.PREEMPTED);
+  }
+
+  void stopStreamProcessor(SamzaResource resource, int exitCode) {
 
 Review comment:
   Is this required only for testing?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r389041424
 
 

 ##########
 File path: samza-core/src/test/java/org/apache/samza/clustermanager/MockClusterResourceManager.java
 ##########
 @@ -102,12 +102,17 @@ public void launchStreamProcessor(SamzaResource resource, CommandBuilder builder
 
   @Override
   public void stopStreamProcessor(SamzaResource resource) {
-    SamzaResourceStatus status = new SamzaResourceStatus(resource.getContainerId(), "diagnostics", SamzaResourceStatus.PREEMPTED);
+    stopStreamProcessor(resource, SamzaResourceStatus.PREEMPTED);
+  }
+
+  void stopStreamProcessor(SamzaResource resource, int exitCode) {
 
 Review comment:
   Maybe @visible for testing or 
   // package private for testing?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] mynameborat commented on issue #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
mynameborat commented on issue #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#issuecomment-596807198
 
 
   Committing the patch based on - https://travis-ci.org/Sanil15/samza/builds/660300111

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
Sanil15 commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r389879023
 
 

 ##########
 File path: samza-core/src/test/java/org/apache/samza/clustermanager/TestContainerPlacementActions.java
 ##########
 @@ -528,6 +528,124 @@ public Void answer(InvocationOnMock invocation) {
     assertFalse(containerPlacementMetadataStore.readContainerPlacementRequestMessage(requestMessage.getUuid()).isPresent());
   }
 
+
+  @Test(timeout = 20000)
 
 Review comment:
   I have used semaphores where it was easy to inject (awaiting container launch), this timeout configured on tests is generous enough, each callback spawns a different thread to mimic AMRMClientAsync behavior, so if I were to do all of these with using semaphores instead of thread.sleep for state change, I would need some significant refactors

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r387924270
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/ContainerManager.java
 ##########
 @@ -443,6 +462,11 @@ private boolean deQueueAction(ContainerPlacementRequestMessage requestMessage) {
     if (checkIfActiveOrStandbyContainerHasActivePlacementAction(requestMessage)) {
       return false;
     }
+
+    if (samzaApplicationState.failedProcessors.containsKey(requestMessage.getProcessorId())) {
+      return true;
 
 Review comment:
   Log statements needed?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state

Posted by GitBox <gi...@apache.org>.
rmatharu commented on a change in pull request #1297: SAMZA-2379: Support Container Placements for job running in degraded state 
URL: https://github.com/apache/samza/pull/1297#discussion_r387925005
 
 

 ##########
 File path: samza-core/src/main/java/org/apache/samza/clustermanager/ContainerManager.java
 ##########
 @@ -464,7 +488,8 @@ private boolean deQueueAction(ContainerPlacementRequestMessage requestMessage) {
     Boolean invalidAction = false;
     String errorMessage = null;
     if (!samzaApplicationState.runningProcessors.containsKey(requestMessage.getProcessorId()) &&
-        !samzaApplicationState.pendingProcessors.containsKey(requestMessage.getProcessorId())
+        !samzaApplicationState.pendingProcessors.containsKey(requestMessage.getProcessorId()) &&
+        !samzaApplicationState.failedProcessors.containsKey(requestMessage.getProcessorId())
 
 Review comment:
   To make this readable, maybe define variables for each of the lookups and 
   define the if to be 
   if (inRunning || inPending || inFailed) {//do blah}
   else {
   this
   }

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services