Posted to commits@brooklyn.apache.org by he...@apache.org on 2016/11/07 13:29:24 UTC

[1/3] brooklyn-docs git commit: info on debugging memory usage

Repository: brooklyn-docs
Updated Branches:
  refs/heads/master 608b87d61 -> 925389027


info on debugging memory usage


Project: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/commit/8f04abc0
Tree: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/tree/8f04abc0
Diff: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/diff/8f04abc0

Branch: refs/heads/master
Commit: 8f04abc046103dd117945329e1e2b6e6b6bf3ab0
Parents: 25a6f59
Author: Alex Heneveld <al...@cloudsoftcorp.com>
Authored: Mon Nov 7 11:26:59 2016 +0000
Committer: Alex Heneveld <al...@cloudsoftcorp.com>
Committed: Mon Nov 7 11:26:59 2016 +0000

----------------------------------------------------------------------
 guide/ops/troubleshooting/index.md        |  1 +
 guide/ops/troubleshooting/memory-usage.md | 94 ++++++++++++++++++++++++++
 2 files changed, 95 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/brooklyn-docs/blob/8f04abc0/guide/ops/troubleshooting/index.md
----------------------------------------------------------------------
diff --git a/guide/ops/troubleshooting/index.md b/guide/ops/troubleshooting/index.md
index f373537..331e267 100644
--- a/guide/ops/troubleshooting/index.md
+++ b/guide/ops/troubleshooting/index.md
@@ -12,6 +12,7 @@ children:
 - { path: detailed-support-report.md, title:  Detailed Support Report }
 - { path: softwareprocess.md, title: SoftwareProcess Entities }
 - { path: going-deep-in-java-and-logs.md, title: Going Deep in Java and Logs }
+- { path: memory-usage.md, title: Monitoring Memory Usage }
 ---
 
 {% include list-children.html %}

http://git-wip-us.apache.org/repos/asf/brooklyn-docs/blob/8f04abc0/guide/ops/troubleshooting/memory-usage.md
----------------------------------------------------------------------
diff --git a/guide/ops/troubleshooting/memory-usage.md b/guide/ops/troubleshooting/memory-usage.md
new file mode 100644
index 0000000..cfadaef
--- /dev/null
+++ b/guide/ops/troubleshooting/memory-usage.md
@@ -0,0 +1,94 @@
+---
+layout: website-normal
+title: "Troubleshooting: Monitoring Memory Usage"
+toc: /guide/toc.json
+---
+
+## Memory Usage
+
+Brooklyn tries to keep in memory as much history of its activity as possible,
+for displaying through the UI, so it is normal for it to consume as much memory
+as it can.  It uses "soft references" so these objects will be cleared if needed,
+but **it is not a sign of anything unusual if Brooklyn is using all its available memory**.
+
+The number of active tasks, CPU usage, thread counts, and 
+retention of soft reference objects are a much better indication of load.
+This information can be found by looking in the log for lines containing
+`brooklyn gc`, such as:
+
+    2016-09-16 16:19:43,337 DEBUG o.a.b.c.m.i.BrooklynGarbageCollector [brooklyn-gc]: brooklyn gc (before) - using 910 MB / 3.76 GB memory; 98% soft-reference maybe retention (of 362); 35 threads; tasks: 0 active, 2 unfinished; 31 remembered, 1013 total submitted) 
+
+The soft-reference figure is only indicative: the lower it is, the more optional items
+the JVM has discarded that Brooklyn would have preferred to keep.
+It only tracks some soft-references (those wrapped in `Maybe`),
+and of course if there are many many such items the JVM will have to get rid
+of some, so a lower figure does not necessarily mean a problem.
+Typically however if there's no `OutOfMemoryError` (OOME) reported,
+there's no problem.
+
+If you are concerned about memory usage, or doing evaluation on test environments, 
+the following method (in the Groovy console) can be invoked to force the system to
+reclaim as much memory as possible, including *all* soft references:
+
+    org.apache.brooklyn.util.javalang.MemoryUsageTracker.forceClearSoftReferences()
+
+If things are happy usage should return to a small level.  This is quite disruptive
+to the system however so use with care.
+
+The above method can also be configured to run automatically when memory usage 
+is detected to hit a certain level.  That can be useful if external policies are
+being used to warn on high memory usage, and you want to keep some headroom.
+Many JVM references discourage interfering with its garbage collector, however,
+so use with care and study the particular JVM you are using.
+See the class `BrooklynGarbageCollector` for more information.
+
+
+## Investigation of Memory Leaks
+
+Design problems of course can cause memory leaks, and due to the nature of the
+soft references these can be difficult to notice until they are advanced.
+If the "soft-reference maybe retention" starts to decrease, that can be
+an early warning.
+
+Common problems such as runaway tasks and cyclic dependent configuration will often
+show their own log errors, so also look for these if there is a performance or memory problem.
+
+You should also note the task counts in the `brooklyn gc` messages described above,
+and if there are an exceptional number of tasks or tasks are not clearing,
+other log messages will describe what is happening, and the in-product task
+view can indicate issues.  `jstack` can also be useful if it is a task problem.
+
+Sometimes slow leaks can occur if blueprints do not clean up entities or locations.
+These can be diagnosed by noting the number of files written to the persistence location,
+if persistence is being used.  Deploying then destroying a blueprint should not leave
+anything behind in the persistence directory.
+
+More subtle problems can occur and these can be more difficult to pin down.
+Where these have been encountered, we have tried to improve logging and early identification,
+so please do ask what other log `grep` patterns can be useful in certain situations.
+And if you find issues, let us know so we can add them to what we monitor.
+
+If there's a problem you really can't solve, a memory profiler such as VisualVM or Eclipse MAT 
+is the standard way to investigate.  If a heap dump was generated on the OOME
+(most JVMs can be configured to generate that), 
+the profiler can load it and investigate the state of the system.
+These can also connect to running systems and be used to investigate instances and growth.
+
+Monitoring these systems while live can be difficult because
+it will often include many soft and weak references that mask the
+source of a leak.  Common such items include:
+
+* `BasicConfigKey` (used for the web server and many blueprints)
+* `DslComponent` and `*Task` (used for Brooklyn activities and dependent configuration)
+* `jclouds` items including `ImageImpl` (to cache data on cloud service providers)
+
+On the other hand any of the above may also indicate a leak.
+Taking snapshots after a `forceClearSoftReferences()` (above) invocation and comparing those
+is one technique to filter out noise.  Another is to wait until there is an OOME
+and look just after, because that will clear all non-essential data from memory.
+(The `forceClearSoftReferences()` actually works by triggering an OOME, in as safe 
+a way as possible.)
+
+If leaked items are found, the profiler will normally let you see their content
+and walk backwards along their references to find out why they are being retained.
+
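
As a rough illustration of the `brooklyn gc` log line described in memory-usage.md above, the following Python sketch pulls the memory, soft-reference retention, thread, and task figures out of such a line. The pattern is derived only from the single sample line quoted in the document, so other Brooklyn versions may need a different expression; the names `GC_LINE`, `SAMPLE`, and `parse_gc_line` are hypothetical and not part of Brooklyn.

```python
import re

# Assumption: the message layout matches the one sample line in the document.
GC_LINE = re.compile(
    r"brooklyn gc \((?P<phase>[^)]*)\) - "
    r"using (?P<used>[\d.]+ \w+) / (?P<limit>[\d.]+ \w+) memory; "
    r"(?P<retention>\d+)% soft-reference maybe retention \(of (?P<tracked>\d+)\); "
    r"(?P<threads>\d+) threads; "
    r"tasks: (?P<active>\d+) active, (?P<unfinished>\d+) unfinished; "
    r"(?P<remembered>\d+) remembered, (?P<submitted>\d+) total submitted"
)

# The sample line from the documentation, split only for readability.
SAMPLE = (
    "2016-09-16 16:19:43,337 DEBUG o.a.b.c.m.i.BrooklynGarbageCollector "
    "[brooklyn-gc]: brooklyn gc (before) - using 910 MB / 3.76 GB memory; "
    "98% soft-reference maybe retention (of 362); 35 threads; "
    "tasks: 0 active, 2 unfinished; 31 remembered, 1013 total submitted) "
)


def parse_gc_line(line):
    """Return the figures from one `brooklyn gc` line, or None if it does not match."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    return {
        "phase": m.group("phase"),          # e.g. "before"
        "used": m.group("used"),            # kept as text, e.g. "910 MB"
        "limit": m.group("limit"),          # kept as text, e.g. "3.76 GB"
        "retention_pct": int(m.group("retention")),
        "tracked_refs": int(m.group("tracked")),
        "threads": int(m.group("threads")),
        "tasks_active": int(m.group("active")),
        "tasks_unfinished": int(m.group("unfinished")),
        "tasks_remembered": int(m.group("remembered")),
        "tasks_submitted": int(m.group("submitted")),
    }


if __name__ == "__main__":
    print(parse_gc_line(SAMPLE))
```

Memory sizes are left as text because the document does not say whether the units are decimal or binary. Tracking `retention_pct` and the task counts over time is likely more informative than any single reading.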


[3/3] brooklyn-docs git commit: This closes #122

Posted by he...@apache.org.
This closes #122


Project: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/commit/92538902
Tree: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/tree/92538902
Diff: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/diff/92538902

Branch: refs/heads/master
Commit: 9253890272234e0f5e5c57a76a60797f4740103a
Parents: 608b87d 561ac2a
Author: Alex Heneveld <al...@cloudsoftcorp.com>
Authored: Mon Nov 7 13:28:57 2016 +0000
Committer: Alex Heneveld <al...@cloudsoftcorp.com>
Committed: Mon Nov 7 13:28:57 2016 +0000

----------------------------------------------------------------------
 guide/ops/server-cli-reference.md         |  14 ++-
 guide/ops/troubleshooting/index.md        |   1 +
 guide/ops/troubleshooting/memory-usage.md | 138 +++++++++++++++++++++++++
 3 files changed, 150 insertions(+), 3 deletions(-)
----------------------------------------------------------------------



[2/3] brooklyn-docs git commit: expand memory usage hints as discussed on PR

Posted by he...@apache.org.
expand memory usage hints as discussed on PR


Project: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/commit/561ac2a8
Tree: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/tree/561ac2a8
Diff: http://git-wip-us.apache.org/repos/asf/brooklyn-docs/diff/561ac2a8

Branch: refs/heads/master
Commit: 561ac2a8aa8c2e8a373b17046ede059e4c5002d6
Parents: 8f04abc
Author: Alex Heneveld <al...@cloudsoftcorp.com>
Authored: Mon Nov 7 13:24:28 2016 +0000
Committer: Alex Heneveld <al...@cloudsoftcorp.com>
Committed: Mon Nov 7 13:28:06 2016 +0000

----------------------------------------------------------------------
 guide/ops/server-cli-reference.md         | 14 +++-
 guide/ops/troubleshooting/memory-usage.md | 94 +++++++++++++++++++-------
 2 files changed, 80 insertions(+), 28 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/brooklyn-docs/blob/561ac2a8/guide/ops/server-cli-reference.md
----------------------------------------------------------------------
diff --git a/guide/ops/server-cli-reference.md b/guide/ops/server-cli-reference.md
index 407de29..6ce34e6 100644
--- a/guide/ops/server-cli-reference.md
+++ b/guide/ops/server-cli-reference.md
@@ -61,19 +61,25 @@ export PATH=$PATH:$BROOKLYN_HOME/usage/dist/target/brooklyn-dist/bin/
 ### Memory Usage
 
 The amount of memory required by the Brooklyn process depends on the usage 
-- for example the number of entities/VMs under management.
+-- for example the number of entities/VMs under management.
 
 For a standard Brooklyn deployment, the defaults are to start with 256m, and to grow to 1g of memory.
 These numbers can be overridden by setting the environment variable `JAVA_OPTS` before launching
-the `brooklyn script`:
+the `brooklyn` script, as follows:
 
-    JAVA_OPTS=-Xms1g -Xmx1g -XX:MaxPermSize=256m
+    JAVA_OPTS="-Xms1g -Xmx4g -XX:MaxPermSize=256m"
+
+(On Java 8 and later the last entry has no effect and can be dropped.)
 
 Brooklyn stores a task history in-memory using [soft references](http://docs.oracle.com/javase/7/docs/api/java/lang/ref/SoftReference.html).
 This means that, once the task history is large, Brooklyn will continually use the maximum allocated 
 memory. It will only expunge tasks from memory when this space is required for other objects within the
 Brooklyn process.
 
+See [Memory Usage](troubleshooting/memory-usage.html) for more information on memory usage and
+other suggested `JAVA_OPTS`.
+
+
 ### Web Console Bind Address
 
 The web console will by default bind to 0.0.0.0. It's restricted to 127.0.0.1 if the `--noConsoleSecurity` flag is used.
@@ -189,3 +195,5 @@ or Swift. It has the following options:
 
 * `blob --container <containerName> --blob <blobName>`: retrieves the given blob
   (i.e. object), including metadata and its contents.
+
+

http://git-wip-us.apache.org/repos/asf/brooklyn-docs/blob/561ac2a8/guide/ops/troubleshooting/memory-usage.md
----------------------------------------------------------------------
diff --git a/guide/ops/troubleshooting/memory-usage.md b/guide/ops/troubleshooting/memory-usage.md
index cfadaef..bb95342 100644
--- a/guide/ops/troubleshooting/memory-usage.md
+++ b/guide/ops/troubleshooting/memory-usage.md
@@ -26,57 +26,86 @@ of some, so a lower figure does not necessarily mean a problem.
 Typically however if there's no `OutOfMemoryError` (OOME) reported,
 there's no problem.
 
+
+## Problem Indicators and Resolutions
+
+Two things that *do* normally indicate a problem with memory are:
+
+* `OutOfMemoryError` exceptions being thrown
+* Memory usage high *and* CPU high, where the CPU is spent doing full garbage collection
+
+One possible cause is the JVM choosing a poor GC strategy,
+as described in [Oracle Java bug 6912889](http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6912889).
+This can be confirmed by running the "analyzing soft reference usage" technique below;
+memory should shrink dramatically then increase until the problem recurs.
+This can be fixed by passing `-XX:SoftRefLRUPolicyMSPerMB=1` to the JVM,
+as described in [Brooklyn issue 375](https://issues.apache.org/jira/browse/BROOKLYN-375).
+
+Other common JVM options include `-Xms256m -Xmx1g -XX:MaxPermSize=256m`
+(depending on JVM provider and version) to set the right balance of memory allocation.
+In some cases a larger `-Xmx` value may simply be the fix
+(but this should not be the case unless many or large blueprints are being used).
+
+If the problem is not with soft references but with real memory usage,
+the culprit is likely a memory leak, typically in blueprint design.
+An early warning of this situation is the "soft-reference maybe retention" level decreasing.
+In these situations, follow the steps as described below for "Investigating Leaks".
+
+
+## Analyzing Soft Reference Usage
+
 If you are concerned about memory usage, or doing evaluation on test environments, 
 the following method (in the Groovy console) can be invoked to force the system to
 reclaim as much memory as possible, including *all* soft references:
 
     org.apache.brooklyn.util.javalang.MemoryUsageTracker.forceClearSoftReferences()
 
-If things are happy usage should return to a small level.  This is quite disruptive
-to the system however so use with care.
+In good situations, memory usage should return to a small level.
+This call can be disruptive to the system, however, so use it with care.
 
 The above method can also be configured to run automatically when memory usage 
 is detected to hit a certain level.  That can be useful if external policies are
 being used to warn on high memory usage, and you want to keep some headroom.
-Many JVM references discourage interfering with its garbage collector, however,
+Many JVM authorities discourage interfering with the garbage collector, however,
 so use with care and study the particular JVM you are using.
 See the class `BrooklynGarbageCollector` for more information.
 
 
-## Investigation of Memory Leaks
-
-Design problems of course can cause memory leaks, and due to the nature of the
-soft references these can be difficult to notice until they are advanced.
-If the "soft-reference maybe retention" starts to decrease, that can be
-an early warning.
+## Investigating Leaks
 
-Common problems such as runaway tasks and cyclic dependent configuration will often
-show their own log errors, so also look for these if there is a performance or memory problem.
+If a memory leak is found, the first place to look should be the WARN/ERROR logs.
+Many common causes of leaks, including runaway tasks and cyclic dependent configuration,
+will show their own log errors prior to the memory error.
 
 You should also note the task counts in the `brooklyn gc` messages described above,
 and if there are an exceptional number of tasks or tasks are not clearing,
 other log messages will describe what is happening, and the in-product task
-view can indicate issues.  `jstack` can also be useful if it is a task problem.
+view can indicate issues. 
 
 Sometimes slow leaks can occur if blueprints do not clean up entities or locations.
 These can be diagnosed by noting the number of files written to the persistence location,
 if persistence is being used.  Deploying then destroying a blueprint should not leave
 anything behind in the persistence directory.
 
-More subtle problems can occur and these can be more difficult to pin down.
-Where these have been encountered, we have tried to improve logging and early identification,
-so please do ask what other log `grep` patterns can be useful in certain situations.
-And if you find issues, let us know so we can add them to what we monitor.
+Where problems have been encountered in the past, we have resolved them and/or
+worked to improve logging and early identification.
+Please report any issues so that we can improve this further.
+In many cases we can also give advice on what other log `grep` patterns can be useful.
 
-If there's a problem you really can't solve, a memory profiler such as VisualVM or Eclipse MAT 
-is the standard way to investigate.  If a heap dump was generated on the OOME
-(most JVMs can be configured to generate that), 
-the profiler can load it and investigate the state of the system.
-These can also connect to running systems and be used to investigate instances and growth.
 
-Monitoring these systems while live can be difficult because
-it will often include many soft and weak references that mask the
-source of a leak.  Common such items include:
+### Standard Java Techniques
+
+Useful standard Java techniques for tracking memory leaks include:
+
+* `jstack <pid>` to see what tasks are running
+* `jmap -histo:live <pid>` to see what objects are using memory (see below)
+* Memory profilers such as VisualVM or Eclipse MAT, either connected to a running system or
+  against a heap dump generated on an OOME
+
+More information is available on [the Oracle Java web site](https://docs.oracle.com/javase/7/docs/webnotes/tsg/TSG-VM/html/memleaks.html).
+
+Note that some of the above techniques will often include soft and weak references that are irrelevant
+to the problem (and will be cleared on an OOME). Objects that may be cached in that way include:
 
 * `BasicConfigKey` (used for the web server and many blueprints)
 * `DslComponent` and `*Task` (used for Brooklyn activities and dependent configuration)
@@ -89,6 +118,21 @@ and look just after, because that will clear all non-essential data from memory.
 (The `forceClearSoftReferences()` actually works by triggering an OOME, in as safe 
 a way as possible.)
 
-If leaked items are found, the profiler will normally let you see their content
+If leaked items are found, a profiler will normally let you see their content
 and walk backwards along their references to find out why they are being retained.
 
+
+### Summary of Techniques
+
+The following sequence of techniques is a common approach to investigating and fixing memory issues:
+
+* Note the log lines about `brooklyn gc`, including memory and tasks
+* Do not assume high memory usage alone is an error, as soft reference caches are deliberate; 
+  use `forceClearSoftReferences()` to clear these
+* Note any WARN/ERROR messages in the log
+* Tune JVM memory allocation and GC
+* Look for leaking locations or references by creating then destroying a blueprint
+* Use standard JVM profilers
+* Inform the Apache Brooklyn community
+
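
The "creating then destroying a blueprint" check can be sketched in Python as a before/after comparison of the persistence directory. This is only an illustrative helper: `snapshot` and `leftovers` are hypothetical names, nothing is assumed about the layout Brooklyn uses inside the directory, and the directory path has to be supplied by the operator.

```python
import os


def snapshot(persistence_dir):
    """Return the set of file paths, relative to persistence_dir, present right now."""
    found = set()
    for root, _dirs, files in os.walk(persistence_dir):
        for name in files:
            full_path = os.path.join(root, name)
            found.add(os.path.relpath(full_path, persistence_dir))
    return found


def leftovers(before, after):
    """Files that exist in the later snapshot but not in the earlier one, sorted."""
    return sorted(after - before)
```

A typical use would be to call `snapshot(path)` once, deploy and then destroy the blueprint, allow some time for persistence to catch up, call `snapshot(path)` again, and print `leftovers(before, after)`. An empty list is the expected result; any remaining entries point at entities or locations that were not cleaned up.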
+
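
To make the `jmap -histo:live <pid>` suggestion concrete, here is a small Python sketch that compares two saved histogram outputs and lists the classes whose instance counts grew. It assumes the usual HotSpot row layout (`rank: instances bytes class-name`, optionally followed by a module name on newer JVMs); the function names are hypothetical. Taking both histograms shortly after a `forceClearSoftReferences()` call, as the document suggests for profiler snapshots, should reduce noise from soft-reference caches.

```python
import re

# One histogram row, e.g. "   1:        100       2400  [C"; header,
# separator, and "Total" lines do not match and are skipped.
HISTO_ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")


def parse_histo(text):
    """Map class name -> (instances, bytes) for every row in a jmap histogram."""
    counts = {}
    for line in text.splitlines():
        m = HISTO_ROW.match(line)
        if m is None:
            continue
        instances, size, cls = int(m.group(1)), int(m.group(2)), m.group(3)
        # The same class name can appear more than once; add the rows together.
        prev_instances, prev_size = counts.get(cls, (0, 0))
        counts[cls] = (prev_instances + instances, prev_size + size)
    return counts


def growth(before, after, top=10):
    """Classes with more instances in `after` than in `before`, largest growth first."""
    deltas = []
    for cls, (inst_after, bytes_after) in after.items():
        inst_before, bytes_before = before.get(cls, (0, 0))
        if inst_after > inst_before:
            deltas.append((cls, inst_after - inst_before, bytes_after - bytes_before))
    deltas.sort(key=lambda d: d[1], reverse=True)
    return deltas[:top]
```

Classes such as `BasicConfigKey`, `DslComponent`, and `*Task` may appear in the result simply because they are cached, so growth there is a prompt for further investigation in a profiler rather than proof of a leak.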