You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@samza.apache.org by Jagadish Venkatraman <ja...@gmail.com> on 2016/07/11 06:57:17 UTC

Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49877/
-----------------------------------------------------------

Review request for samza, Boris Shkolnik, Chris Pettitt, Jake Maes, Yi Pan (Data Infrastructure), and Navina Ramesh.


Repository: samza


Description
-------

This feature introduces physical memory monitoring in SamzaContainer.

Context:
Often memory used by the SamzaContainer process includes 
A. JVM Heap memory: This is where all JVM variables live.
B. Native memory: This memory lives out of the JVM heap and is not visible to the JVM. Examples include used by RocksDb, native libraries that user code depends on etc.

User jobs could be killed by Yarn if their total memory (A+B) exceeds the configured maximum of yarn.container.memory.mb.

Currently, while our existing metrics provide visibility into [A] via JMX, we don't have visibility into [B]. (as it's totally external to the JVM). 

This feature uses Linux ProcFS to provide a complete view of the memory (both A & B) to help Samza users understand memory better. (Schedulers like Apache Yarn that require a holistic view of memory (A+B) also use ProcFS. For the curious, here's the Yarn implementation - http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-yarn-common/0.23.1/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java that inspired this idea)

Scope: The scope of this RB is only to Linux distributions. (Mac based implementation may be a separate change list.)


Diffs
-----

  build.gradle ba4a9d14fe24e1ff170873920cd5eeef656955af 
  checkstyle/import-control.xml 325c38131047836dc8aedaea4187598ef3ba7666 
  samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java PRE-CREATION 
  samza-core/src/main/java/org/apache/samza/container/host/SystemStatistics.java PRE-CREATION 
  samza-core/src/main/java/org/apache/samza/container/host/SystemStatisticsMonitor.java PRE-CREATION 
  samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 18c09224bbae959342daf9b2b7a7d971cc224f48 
  samza-core/src/main/scala/org/apache/samza/container/SamzaContainerMetrics.scala 2044ce01ffded8434e762d99355d5df43642c66b 
  samza-core/src/test/java/org/apache/samza/container/host/TestProcfsBasedStatisticsMonitor.java PRE-CREATION 

Diff: https://reviews.apache.org/r/49877/diff/


Testing
-------

1. Unit tests with mock PROC-FS snapshots of processes
2. Deployed actual jobs on my dev box. 
   2.1 Obtained the operating system's view of the container memory using 'ps' and other tools.
   2.2 Verified that the total memory reported by the monitor is the same as the OS's view of memory[2.1]
3. Tested on various Linux distributions I could find internally:
    - RHEL release 6.4, 6.5, 6.6 (Santiago)


Thanks,

Jagadish Venkatraman


Re: Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

Posted by Chris Pettitt <cp...@linkedin.com>.

> On July 11, 2016, 6:47 p.m., Chris Pettitt wrote:
> > Very high level question: I assume you looked at `ps -o rss` and disqualified it for some reason. Could you elaborate as to why? `ps` itself is certainly more portable than procfs (though `-o rss` is not part of the POSIX standard) - it works on RHEL and OSX. It also gives you the actual memory usage vs the number of pages.
> 
> Jagadish Venkatraman wrote:
>     Thanks for the suggestion and the input! I could not find a portable platform-independent way to get the PID of the current JVM process (without going through JNI/ some other flaky solutions). Commands like `ps` require the PID of the process. Using the procfs has a nice property that we can directly parse `/proc/self/stat` since `self` automatically refers to the current process. Also, if we decide to capture other metrics specific to the host (for example: page faults, network statistics any other information that ProcFs exposes), this maybe be easier to extend and build on top.

You could do something like this, which would get you close to POSIX compatibility (again RSS not being part of the POSIX standard):

```
Runtime.getRuntime().exec(new String[] {"sh", "-c", "ps -o rss -p $PPID"});
```

This at least gets you better compatibility with OSX (not that you'd run real work on OSX :P) and probably with other *nix operating systems that don't auto-mount procfs. It also avoids the page size assumption which may not be correct.

---

Review for current RB coming shortly.


- Chris


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49877/#review141745
-----------------------------------------------------------


On July 12, 2016, 12:17 a.m., Jagadish Venkatraman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49877/
> -----------------------------------------------------------
> 
> (Updated July 12, 2016, 12:17 a.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Chris Pettitt, Fred Ji, Jake Maes, Yi Pan (Data Infrastructure), and Navina Ramesh.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> This feature introduces physical memory monitoring in SamzaContainer.
> 
> Context:
> Often memory used by the SamzaContainer process includes 
> A. JVM Heap memory: This is where all JVM variables live.
> B. Native memory: This memory lives out of the JVM heap and is not visible to the JVM. Examples include used by RocksDb, native libraries that user code depends on etc.
> 
> User jobs could be killed by Yarn if their total memory (A+B) exceeds the configured maximum of yarn.container.memory.mb.
> 
> Currently, while our existing metrics provide visibility into [A] via JMX, we don't have visibility into [B]. (as it's totally external to the JVM). 
> 
> This feature uses Linux ProcFS to provide a complete view of the memory (both A & B) to help Samza users understand memory better. (Schedulers like Apache Yarn that require a holistic view of memory (A+B) also use ProcFS. For the curious, here's the Yarn implementation - http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-yarn-common/0.23.1/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java that inspired this idea)
> 
> Scope: The scope of this RB is only to Linux distributions. (Mac based implementation may be a separate change list.)
> 
> 
> Diffs
> -----
> 
>   build.gradle ba4a9d14fe24e1ff170873920cd5eeef656955af 
>   checkstyle/import-control.xml 325c38131047836dc8aedaea4187598ef3ba7666 
>   samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/container/host/SystemStatistics.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/container/host/SystemStatisticsMonitor.java PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 18c09224bbae959342daf9b2b7a7d971cc224f48 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainerMetrics.scala 2044ce01ffded8434e762d99355d5df43642c66b 
>   samza-core/src/test/java/org/apache/samza/container/host/TestProcfsBasedStatisticsMonitor.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/49877/diff/
> 
> 
> Testing
> -------
> 
> 1. Unit tests with mock PROC-FS snapshots of processes
> 2. Deployed actual jobs on my dev box. 
>    2.1 Obtained the operating system's view of the container memory using 'ps' and other tools.
>    2.2 Verified that the total memory reported by the monitor is the same as the OS's view of memory[2.1]
> 3. Tested on various Linux distributions I could find internally:
>     - RHEL release 6.4, 6.5, 6.6 (Santiago)
> 
> 
> Thanks,
> 
> Jagadish Venkatraman
> 
>


Re: Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

Posted by Jagadish Venkatraman <ja...@gmail.com>.

> On July 11, 2016, 6:47 p.m., Chris Pettitt wrote:
> > Very high level question: I assume you looked at `ps -o rss` and disqualified it for some reason. Could you elaborate as to why? `ps` itself is certainly more portable than procfs (though `-o rss` is not part of the POSIX standard) - it works on RHEL and OSX. It also gives you the actual memory usage vs the number of pages.

Thanks for the suggestion and the input! I could not find a portable platform-independent way to get the PID of the current JVM process (without going through JNI/ some other flaky solutions). Commands like `ps` require the PID of the process. Using the procfs has a nice property that we can directly parse `/proc/self/stat` since `self` automatically refers to the current process. Also, if we decide to capture other metrics specific to the host (for example: page faults, network statistics any other information that ProcFs exposes), this maybe be easier to extend and build on top.


- Jagadish


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49877/#review141745
-----------------------------------------------------------


On July 11, 2016, 6:57 a.m., Jagadish Venkatraman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49877/
> -----------------------------------------------------------
> 
> (Updated July 11, 2016, 6:57 a.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Chris Pettitt, Jake Maes, Yi Pan (Data Infrastructure), and Navina Ramesh.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> This feature introduces physical memory monitoring in SamzaContainer.
> 
> Context:
> Often memory used by the SamzaContainer process includes 
> A. JVM Heap memory: This is where all JVM variables live.
> B. Native memory: This memory lives out of the JVM heap and is not visible to the JVM. Examples include used by RocksDb, native libraries that user code depends on etc.
> 
> User jobs could be killed by Yarn if their total memory (A+B) exceeds the configured maximum of yarn.container.memory.mb.
> 
> Currently, while our existing metrics provide visibility into [A] via JMX, we don't have visibility into [B]. (as it's totally external to the JVM). 
> 
> This feature uses Linux ProcFS to provide a complete view of the memory (both A & B) to help Samza users understand memory better. (Schedulers like Apache Yarn that require a holistic view of memory (A+B) also use ProcFS. For the curious, here's the Yarn implementation - http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-yarn-common/0.23.1/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java that inspired this idea)
> 
> Scope: The scope of this RB is only to Linux distributions. (Mac based implementation may be a separate change list.)
> 
> 
> Diffs
> -----
> 
>   build.gradle ba4a9d14fe24e1ff170873920cd5eeef656955af 
>   checkstyle/import-control.xml 325c38131047836dc8aedaea4187598ef3ba7666 
>   samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/container/host/SystemStatistics.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/container/host/SystemStatisticsMonitor.java PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 18c09224bbae959342daf9b2b7a7d971cc224f48 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainerMetrics.scala 2044ce01ffded8434e762d99355d5df43642c66b 
>   samza-core/src/test/java/org/apache/samza/container/host/TestProcfsBasedStatisticsMonitor.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/49877/diff/
> 
> 
> Testing
> -------
> 
> 1. Unit tests with mock PROC-FS snapshots of processes
> 2. Deployed actual jobs on my dev box. 
>    2.1 Obtained the operating system's view of the container memory using 'ps' and other tools.
>    2.2 Verified that the total memory reported by the monitor is the same as the OS's view of memory[2.1]
> 3. Tested on various Linux distributions I could find internally:
>     - RHEL release 6.4, 6.5, 6.6 (Santiago)
> 
> 
> Thanks,
> 
> Jagadish Venkatraman
> 
>


Re: Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

Posted by Chris Pettitt <cp...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49877/#review141745
-----------------------------------------------------------



Very high level question: I assume you looked at `ps -o rss` and disqualified it for some reason. Could you elaborate as to why? `ps` itself is certainly more portable than procfs (though `-o rss` is not part of the POSIX standard) - it works on RHEL and OSX. It also gives you the actual memory usage vs the number of pages.

- Chris Pettitt


On July 11, 2016, 6:57 a.m., Jagadish Venkatraman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49877/
> -----------------------------------------------------------
> 
> (Updated July 11, 2016, 6:57 a.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Chris Pettitt, Jake Maes, Yi Pan (Data Infrastructure), and Navina Ramesh.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> This feature introduces physical memory monitoring in SamzaContainer.
> 
> Context:
> Often memory used by the SamzaContainer process includes 
> A. JVM Heap memory: This is where all JVM variables live.
> B. Native memory: This memory lives out of the JVM heap and is not visible to the JVM. Examples include used by RocksDb, native libraries that user code depends on etc.
> 
> User jobs could be killed by Yarn if their total memory (A+B) exceeds the configured maximum of yarn.container.memory.mb.
> 
> Currently, while our existing metrics provide visibility into [A] via JMX, we don't have visibility into [B]. (as it's totally external to the JVM). 
> 
> This feature uses Linux ProcFS to provide a complete view of the memory (both A & B) to help Samza users understand memory better. (Schedulers like Apache Yarn that require a holistic view of memory (A+B) also use ProcFS. For the curious, here's the Yarn implementation - http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-yarn-common/0.23.1/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java that inspired this idea)
> 
> Scope: The scope of this RB is only to Linux distributions. (Mac based implementation may be a separate change list.)
> 
> 
> Diffs
> -----
> 
>   build.gradle ba4a9d14fe24e1ff170873920cd5eeef656955af 
>   checkstyle/import-control.xml 325c38131047836dc8aedaea4187598ef3ba7666 
>   samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/container/host/SystemStatistics.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/container/host/SystemStatisticsMonitor.java PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 18c09224bbae959342daf9b2b7a7d971cc224f48 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainerMetrics.scala 2044ce01ffded8434e762d99355d5df43642c66b 
>   samza-core/src/test/java/org/apache/samza/container/host/TestProcfsBasedStatisticsMonitor.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/49877/diff/
> 
> 
> Testing
> -------
> 
> 1. Unit tests with mock PROC-FS snapshots of processes
> 2. Deployed actual jobs on my dev box. 
>    2.1 Obtained the operating system's view of the container memory using 'ps' and other tools.
>    2.2 Verified that the total memory reported by the monitor is the same as the OS's view of memory[2.1]
> 3. Tested on various Linux distributions I could find internally:
>     - RHEL release 6.4, 6.5, 6.6 (Santiago)
> 
> 
> Thanks,
> 
> Jagadish Venkatraman
> 
>


Re: Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

Posted by Jake Maes <ja...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49877/#review141747
-----------------------------------------------------------


Ship it!





samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala (line 515)
<https://reviews.apache.org/r/49877/#comment207112>

    This is the first place where I could tell the units of the memory sample. I think it would be clearer if the fields in SystemStatistics and ProcfsBasedStatisticsMonitor used a similar naming convention. Also, maybe a short note in the java doc indicating that this measurement represents the entire process memory, not just the non-heap native memory.


- Jake Maes


On July 11, 2016, 6:57 a.m., Jagadish Venkatraman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49877/
> -----------------------------------------------------------
> 
> (Updated July 11, 2016, 6:57 a.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Chris Pettitt, Jake Maes, Yi Pan (Data Infrastructure), and Navina Ramesh.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> This feature introduces physical memory monitoring in SamzaContainer.
> 
> Context:
> Often memory used by the SamzaContainer process includes 
> A. JVM Heap memory: This is where all JVM variables live.
> B. Native memory: This memory lives out of the JVM heap and is not visible to the JVM. Examples include used by RocksDb, native libraries that user code depends on etc.
> 
> User jobs could be killed by Yarn if their total memory (A+B) exceeds the configured maximum of yarn.container.memory.mb.
> 
> Currently, while our existing metrics provide visibility into [A] via JMX, we don't have visibility into [B]. (as it's totally external to the JVM). 
> 
> This feature uses Linux ProcFS to provide a complete view of the memory (both A & B) to help Samza users understand memory better. (Schedulers like Apache Yarn that require a holistic view of memory (A+B) also use ProcFS. For the curious, here's the Yarn implementation - http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-yarn-common/0.23.1/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java that inspired this idea)
> 
> Scope: The scope of this RB is only to Linux distributions. (Mac based implementation may be a separate change list.)
> 
> 
> Diffs
> -----
> 
>   build.gradle ba4a9d14fe24e1ff170873920cd5eeef656955af 
>   checkstyle/import-control.xml 325c38131047836dc8aedaea4187598ef3ba7666 
>   samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/container/host/SystemStatistics.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/container/host/SystemStatisticsMonitor.java PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 18c09224bbae959342daf9b2b7a7d971cc224f48 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainerMetrics.scala 2044ce01ffded8434e762d99355d5df43642c66b 
>   samza-core/src/test/java/org/apache/samza/container/host/TestProcfsBasedStatisticsMonitor.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/49877/diff/
> 
> 
> Testing
> -------
> 
> 1. Unit tests with mock PROC-FS snapshots of processes
> 2. Deployed actual jobs on my dev box. 
>    2.1 Obtained the operating system's view of the container memory using 'ps' and other tools.
>    2.2 Verified that the total memory reported by the monitor is the same as the OS's view of memory[2.1]
> 3. Tested on various Linux distributions I could find internally:
>     - RHEL release 6.4, 6.5, 6.6 (Santiago)
> 
> 
> Thanks,
> 
> Jagadish Venkatraman
> 
>


Re: Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

Posted by Xinyu Liu <xi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49877/#review141760
-----------------------------------------------------------


Fix it, then Ship it!





samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java (line 177)
<https://reviews.apache.org/r/49877/#comment207134>

    Do you think we should catch all the throwable here to prevent the thread being killed?


- Xinyu Liu


On July 11, 2016, 6:57 a.m., Jagadish Venkatraman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49877/
> -----------------------------------------------------------
> 
> (Updated July 11, 2016, 6:57 a.m.)
> 
> 
> Review request for samza, Boris Shkolnik, Chris Pettitt, Jake Maes, Yi Pan (Data Infrastructure), and Navina Ramesh.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> This feature introduces physical memory monitoring in SamzaContainer.
> 
> Context:
> Often memory used by the SamzaContainer process includes 
> A. JVM Heap memory: This is where all JVM variables live.
> B. Native memory: This memory lives out of the JVM heap and is not visible to the JVM. Examples include used by RocksDb, native libraries that user code depends on etc.
> 
> User jobs could be killed by Yarn if their total memory (A+B) exceeds the configured maximum of yarn.container.memory.mb.
> 
> Currently, while our existing metrics provide visibility into [A] via JMX, we don't have visibility into [B]. (as it's totally external to the JVM). 
> 
> This feature uses Linux ProcFS to provide a complete view of the memory (both A & B) to help Samza users understand memory better. (Schedulers like Apache Yarn that require a holistic view of memory (A+B) also use ProcFS. For the curious, here's the Yarn implementation - http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-yarn-common/0.23.1/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java that inspired this idea)
> 
> Scope: The scope of this RB is only to Linux distributions. (Mac based implementation may be a separate change list.)
> 
> 
> Diffs
> -----
> 
>   build.gradle ba4a9d14fe24e1ff170873920cd5eeef656955af 
>   checkstyle/import-control.xml 325c38131047836dc8aedaea4187598ef3ba7666 
>   samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/container/host/SystemStatistics.java PRE-CREATION 
>   samza-core/src/main/java/org/apache/samza/container/host/SystemStatisticsMonitor.java PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 18c09224bbae959342daf9b2b7a7d971cc224f48 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainerMetrics.scala 2044ce01ffded8434e762d99355d5df43642c66b 
>   samza-core/src/test/java/org/apache/samza/container/host/TestProcfsBasedStatisticsMonitor.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/49877/diff/
> 
> 
> Testing
> -------
> 
> 1. Unit tests with mock PROC-FS snapshots of processes
> 2. Deployed actual jobs on my dev box. 
>    2.1 Obtained the operating system's view of the container memory using 'ps' and other tools.
>    2.2 Verified that the total memory reported by the monitor is the same as the OS's view of memory[2.1]
> 3. Tested on various Linux distributions I could find internally:
>     - RHEL release 6.4, 6.5, 6.6 (Santiago)
> 
> 
> Thanks,
> 
> Jagadish Venkatraman
> 
>