You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@samza.apache.org by GitBox <gi...@apache.org> on 2020/08/25 17:36:35 UTC

[GitHub] [samza] Sanil15 commented on a change in pull request #1421: SAMZA-2439: Remove LocalityManager and container location information from JobModel

Sanil15 commented on a change in pull request #1421:
URL: https://github.com/apache/samza/pull/1421#discussion_r476597594



##########
File path: samza-api/src/main/java/org/apache/samza/job/model/ContainerLocality.java
##########
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.samza.job.model;
+
+import java.util.Objects;
+
+/**
+ * A data model to represent the container locality information. The locality information refers to the whereabouts
+ * of the physical execution of container.
+ * Fields such as <i>jmxUrl</i> and <i>jmxTunnelingUrl</i> exist for backward compatibility reasons as they were
+ * historically stored under the same name space as locality and surfaced within the framework through the locality
+ * manager.
+ */
+public class ContainerLocality {
+  /* Container identifier */
+  private String id;

Review comment:
       Is this the physical id of the container? for example in yarn it will be container_0000_99... if it is yes lets explicitly call it containerId and add a doc suggesting phyical **containerId** of a container
   
   Also don't you need the logical id (called processorId here...)

##########
File path: samza-core/src/main/java/org/apache/samza/clustermanager/ClusterBasedJobCoordinator.java
##########
@@ -174,6 +176,7 @@
   private final MetadataStore metadataStore;
 
   private final SystemAdmins systemAdmins;
+  private final LocalityManager localityManager;

Review comment:
       Here is another thought:
   LocalityManager and the ContainerPlacementMetadataStore both are present in this class ClusterBasedJobCoordinator because they both need a hold of metadata store
   
   Will it be cleaner to move this to ContainerProcessManager and then eventually move it to ContainerManager (when all the refactors we have in process are in)
   
   We can just pass metadata store from here and then instantiate ContainerPlacementMetadatStore and LocalityManager in ContainerManager 
   

##########
File path: samza-api/src/main/java/org/apache/samza/job/model/LocalityModel.java
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.samza.job.model;
+
+import java.util.Objects;
+import java.util.Map;
+
+/**
+ * A model to represent the locality information of an application. The locality information refers to the
+ * whereabouts of the physical execution of a samza container. With this information, samza achieves (best effort) affinity
+ * i.e. place the container on the host in which it was running before. By doing this, stateful applications can minimize
+ * the bootstrap time of their state by leveraging the local copy.
+ *
+ * It is suffice to have only {@link ContainerLocality} model and use it within locality manager. However, this abstraction
+ * enables us extend locality beyond container. e.g. It is useful to track task locality to enable heterogeneous containers
+ * or fine grained execution model.
+ */
+public class LocalityModel {
+  private Map<String, ContainerLocality> containerLocalities;

Review comment:
       Lets add docs here for key of this hashmap, i think its the logical processor id for ex 1,2,3 

##########
File path: samza-core/src/main/java/org/apache/samza/clustermanager/ContainerManager.java
##########
@@ -80,19 +82,23 @@
 
   private final Optional<StandbyContainerManager> standbyContainerManager;
 
+  private final LocalityManager localityManager;

Review comment:
       Now we are passing this LocalityManager everywhere ContainerManager, CPM, StandbyContainerManager just to get the host that container was last seen....
   
   I think we can have it cleaner by either maintaining that in SamzaApplicationState or having static util in LocalityManager to fetch the last seen host

##########
File path: samza-api/src/main/java/org/apache/samza/job/model/LocalityModel.java
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.samza.job.model;
+
+import java.util.Objects;
+import java.util.Map;
+
+/**
+ * A model to represent the locality information of an application. The locality information refers to the
+ * whereabouts of the physical execution of a samza container. With this information, samza achieves (best effort) affinity
+ * i.e. place the container on the host in which it was running before. By doing this, stateful applications can minimize
+ * the bootstrap time of their state by leveraging the local copy.
+ *
+ * It is suffice to have only {@link ContainerLocality} model and use it within locality manager. However, this abstraction
+ * enables us extend locality beyond container. e.g. It is useful to track task locality to enable heterogeneous containers
+ * or fine grained execution model.
+ */
+public class LocalityModel {
+  private Map<String, ContainerLocality> containerLocalities;
+
+  /**
+   * Construct locality model for the job from the input map of container localities.
+   * @param containerLocalities host locality information for the job keyed by container id
+   */
+  public LocalityModel(Map<String, ContainerLocality> containerLocalities) {
+    this.containerLocalities = containerLocalities;
+  }
+
+  /*
+   * Returns a {@link Map} of {@link ContainerLocality} keyed by container id.
+   */
+  public Map<String, ContainerLocality> getContainerLocalities() {
+    return containerLocalities;
+  }
+
+  /*
+   * Returns the {@link ContainerLocality} for the given container id.
+   */
+  public ContainerLocality getContainerLocality(String id) {

Review comment:
       s/id/processorId
   
   Lets actually explicitly call it processorId

##########
File path: samza-core/src/main/java/org/apache/samza/serializers/model/JsonContainerLocalityMixIn.java
##########
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.serializers.model;
+
+import org.codehaus.jackson.annotate.JsonCreator;
+import org.codehaus.jackson.annotate.JsonIgnoreProperties;
+import org.codehaus.jackson.annotate.JsonProperty;
+
+
+/**
+ * A mix-in Jackson class to convert {@link org.apache.samza.job.model.ContainerLocality} to/from JSON
+ */
+@JsonIgnoreProperties(ignoreUnknown = true)
+public abstract class JsonContainerLocalityMixIn {
+  @JsonCreator
+  public JsonContainerLocalityMixIn(@JsonProperty("id") String id, @JsonProperty("host") String host,
+      @JsonProperty("jmx-url") String jmxUrl, @JsonProperty("jmx-tunneling-url") String jmxTunnelingUrl) {
+  }
+
+  @JsonProperty("id")

Review comment:
       Same lets be specific here about the fact that is it processorId or containerId




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org