You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@samza.apache.org by GitBox <gi...@apache.org> on 2022/05/05 00:43:33 UTC

[GitHub] [samza] cameronlee314 opened a new pull request, #1604: SAMZA-2740: Provide a way to specify a static file name for container metadata file

cameronlee314 opened a new pull request, #1604:
URL: https://github.com/apache/samza/pull/1604

   Issue: For environments in which apps run in isolated containers (e.g. Kubernetes), there is no need to have unique file names for container metadata files. It is much easier to find these files when they have static names instead of depending on the exec-env-container-id. A concrete Kubernetes use case for this is to be able to set the `terminationMessagePath` to this container metadata file in order to pass some information through to controllers.
   
   Changes: Add a new environment variable `CONTAINER_METADATA_FILENAME` which specifies the name of the container metadata file.
   
   API/usage changes: (backwards compatible) If `CONTAINER_METADATA_FILENAME` is specified, then use that for generating the container metadata file name. Otherwise, fall back to using the exec-env-container-id to generate the file name.
   
   Tests: Deployed in minikube and verified that the container metadata file matched the value set in the `CONTAINER_METADATA_FILENAME` environment variable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@samza.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [samza] rmatharu commented on a diff in pull request #1604: SAMZA-2740: Provide a way to specify a static file name for container metadata file

Posted by GitBox <gi...@apache.org>.
rmatharu commented on code in PR #1604:
URL: https://github.com/apache/samza/pull/1604#discussion_r879942331


##########
samza-core/src/main/java/org/apache/samza/config/JobConfig.java:
##########
@@ -431,18 +433,30 @@ public boolean getStandbyTasksEnabled() {
   }
 
   /**
-   * The metadata file is written in a {@code exec-env-container-id}.metadata file in the log-dir of the container.
-   * Here the {@code exec-env-container-id} refers to the ID assigned by the cluster manager (e.g., YARN) to the container,
-   * which uniquely identifies a container's lifecycle.
+   * Get the {@link File} in the "samza.log.dir" for writing container metadata.
+   * If {@link EnvironmentVariables#ENV_CONTAINER_METADATA_FILENAME} is specified, then use that as the file name.
+   * Otherwise, the file name is {@code exec-env-container-id}.metadata. Here the {@code exec-env-container-id} refers
+   * to the ID assigned by the cluster manager (e.g., YARN) to the container, which uniquely identifies a container's
+   * lifecycle.
    */
   public static Optional<File> getMetadataFile(String execEnvContainerId) {
     String dir = System.getProperty(CONTAINER_METADATA_DIRECTORY_SYS_PROPERTY);
-    if (dir == null || execEnvContainerId == null) {
-      return Optional.empty();
-    } else {
-      return Optional.of(
-          new File(dir, String.format(CONTAINER_METADATA_FILENAME_FORMAT, execEnvContainerId)));
+    if (dir != null) {
+      String fileName = System.getenv(EnvironmentVariables.ENV_CONTAINER_METADATA_FILENAME);
+      if (StringUtils.isNotBlank(fileName)) {
+        if (fileName.contains(File.separator)) {
+          throw new IllegalStateException(String.format("%s should not include directories, but it is %s",
+              EnvironmentVariables.ENV_CONTAINER_METADATA_FILENAME, fileName));
+        } else {
+          return Optional.of(new File(dir, fileName));
+        }
+      } else {
+        if (execEnvContainerId != null) {

Review Comment:
   Would it be useful to add a 
   LOG.info("Choosing metadata filename: %s %s", dir, filename)
   
   ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@samza.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [samza] cameronlee314 commented on a diff in pull request #1604: SAMZA-2740: Provide a way to specify a static file name for container metadata file

Posted by GitBox <gi...@apache.org>.
cameronlee314 commented on code in PR #1604:
URL: https://github.com/apache/samza/pull/1604#discussion_r879948593


##########
samza-core/src/main/java/org/apache/samza/config/JobConfig.java:
##########
@@ -431,18 +433,30 @@ public boolean getStandbyTasksEnabled() {
   }
 
   /**
-   * The metadata file is written in a {@code exec-env-container-id}.metadata file in the log-dir of the container.
-   * Here the {@code exec-env-container-id} refers to the ID assigned by the cluster manager (e.g., YARN) to the container,
-   * which uniquely identifies a container's lifecycle.
+   * Get the {@link File} in the "samza.log.dir" for writing container metadata.
+   * If {@link EnvironmentVariables#ENV_CONTAINER_METADATA_FILENAME} is specified, then use that as the file name.
+   * Otherwise, the file name is {@code exec-env-container-id}.metadata. Here the {@code exec-env-container-id} refers
+   * to the ID assigned by the cluster manager (e.g., YARN) to the container, which uniquely identifies a container's
+   * lifecycle.
    */
   public static Optional<File> getMetadataFile(String execEnvContainerId) {
     String dir = System.getProperty(CONTAINER_METADATA_DIRECTORY_SYS_PROPERTY);
-    if (dir == null || execEnvContainerId == null) {
-      return Optional.empty();
-    } else {
-      return Optional.of(
-          new File(dir, String.format(CONTAINER_METADATA_FILENAME_FORMAT, execEnvContainerId)));
+    if (dir != null) {
+      String fileName = System.getenv(EnvironmentVariables.ENV_CONTAINER_METADATA_FILENAME);
+      if (StringUtils.isNotBlank(fileName)) {
+        if (fileName.contains(File.separator)) {
+          throw new IllegalStateException(String.format("%s should not include directories, but it is %s",
+              EnvironmentVariables.ENV_CONTAINER_METADATA_FILENAME, fileName));
+        } else {
+          return Optional.of(new File(dir, fileName));
+        }
+      } else {
+        if (execEnvContainerId != null) {

Review Comment:
   Makes sense to add a log. Instead of adding a log here, what if I add a log in `DiagnosticsUtil.writeMetadataFile`, where `JobConfig.getMetadataFile` gets called?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@samza.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [samza] rmatharu commented on pull request #1604: SAMZA-2740: Provide a way to specify a static file name for container metadata file

Posted by GitBox <gi...@apache.org>.
rmatharu commented on PR #1604:
URL: https://github.com/apache/samza/pull/1604#issuecomment-1135245321

   minor comment, otherwise lgtm. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@samza.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [samza] cameronlee314 merged pull request #1604: SAMZA-2740: Provide a way to specify a static file name for container metadata file

Posted by GitBox <gi...@apache.org>.
cameronlee314 merged PR #1604:
URL: https://github.com/apache/samza/pull/1604


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@samza.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org