Posted to dev@brooklyn.apache.org by Aled Sage <al...@gmail.com> on 2016/07/04 13:25:05 UTC
[PROPOSAL] Deleting orphaned locations
Hi all,
We are looking at implementing a "cleaner" that can remove orphaned
locations from persisted state.
_*Problem statement*_
In older versions of Brooklyn (e.g. prior to [1]), we sometimes did not
unmanage locations when the associated entity was deleted. This means
that the persisted state for some customers contains many "orphaned
locations" that are no longer referenced.
We want a way to safely delete these. We only want to delete locations
that are not referenced.
These orphaned locations can also cause "dangling references" to be
reported, where the orphaned location(s) hold references to things that
have been deleted.
References to locations can take a few forms:
1. Location is directly referenced from an entity's getLocations().
2. Location is indirectly referenced from an entity (e.g. the location
is the parent of another location that is referenced).
3. Location is referenced by an entity in some other way (rather than
getLocations()) - e.g. in a sensor or config key, such as [2].
4. Location is referenced by a policy or enricher.
For (4), I can't think of any such use-case off-hand, but it's possible
that a customer might write a bespoke policy/enricher that does this.
For (2), it means we need to worry about reachability. Note there might
be groups of locations that are unreachable (e.g. location X and its
parent refer to each other, but are not referenced by anything else).
_*Location deletion: proposed solution*_
We propose an offline tool, similar in use to copy-state [3], which will
clean up the persisted state, and save the cleaned-up copy to a given
location.
It is important that the tool is run offline, in case a Brooklyn server
is in the middle of writing multiple new files.
Ideally this will not deserialize all the persisted state (so does not
require classloading, etc). We'll therefore work with
BrooklynMementoRawData [4]. This also means we'd be able to run the tool
outside of the Karaf container.
We can identify location references in the XML using a combination of
the following techniques:
1. The marker <locationProxy>...</locationProxy> for references inside
config keys, sensors, etc.
2. Inside an entity, the <locations>...</locations> section.
3. Inside a location, the <parent>...</parent> and
<children>...</children> section.
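As a rough sketch of how these three kinds of reference might be pulled
out of raw XML without any classloading, here is a Python illustration.
The XML layout (ids as element text, or wrapped in child elements such
as <string>), the sample documents and the function names are all
assumptions made for illustration; they are not taken from Brooklyn's
actual persisted state.

```python
import xml.etree.ElementTree as ET

def ids_in(element):
    """Collect ids from an element: its own text, or the text of its children."""
    if element is None:
        return set()
    texts = [element.text] + [child.text for child in element]
    return {t.strip() for t in texts if t and t.strip()}

def location_refs_in_entity(xml_text):
    """Location ids referenced by one entity memento (assumed layout)."""
    root = ET.fromstring(xml_text)
    # (2) the <locations> section of the entity
    refs = ids_in(root.find("locations"))
    # (1) <locationProxy> markers anywhere, e.g. inside config keys or sensors
    for proxy in root.iter("locationProxy"):
        refs |= ids_in(proxy)
    return refs

def location_refs_in_location(xml_text):
    """Location ids referenced by one location memento (assumed layout)."""
    root = ET.fromstring(xml_text)
    # (3) the <parent> and <children> sections
    return ids_in(root.find("parent")) | ids_in(root.find("children"))

# Hypothetical sample documents, for illustration only.
entity_xml = """
<entity>
  <id>ent1</id>
  <locations>
    <string>loc1</string>
  </locations>
  <attributes>
    <machine>
      <locationProxy>loc2</locationProxy>
    </machine>
  </attributes>
</entity>
"""
location_xml = """
<location>
  <id>loc1</id>
  <parent>loc0</parent>
  <children>
    <string>loc3</string>
  </children>
</location>
"""

print(sorted(location_refs_in_entity(entity_xml)))      # ['loc1', 'loc2']
print(sorted(location_refs_in_location(location_xml)))  # ['loc0', 'loc3']
```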
From (1) and (2), we'll identify all locations that are directly
referenced. From (3), we'll follow the parent/children links from those
to identify the locations that are indirectly referenced. Everything
else is unreachable, so we'll know we can delete it.
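The reachability step could be sketched roughly as below, in Python. The
function name, the ids and the in-memory dictionaries are hypothetical;
a real tool would build them from the persisted XML. The sample data
includes an unreachable group (machine-X and its parent cloud-X refer to
each other but nothing else refers to them).

```python
from collections import deque

def find_orphaned_locations(all_location_ids, direct_refs, links):
    """Return the location ids that cannot be reached from any entity.

    all_location_ids: ids of every persisted location
    direct_refs: ids found via techniques (1) and (2)
    links: location id -> ids in its <parent>/<children>, i.e. technique (3)
    """
    reachable = set()
    queue = deque(direct_refs)
    while queue:
        loc_id = queue.popleft()
        if loc_id in reachable:
            continue  # already visited; this also stops cycles looping forever
        reachable.add(loc_id)
        queue.extend(links.get(loc_id, ()))
    return set(all_location_ids) - reachable

# Hypothetical data, for illustration only.
links = {
    "machine-1": {"cloud-1"},
    "cloud-1": {"machine-1", "machine-2"},
    "machine-2": {"cloud-1"},
    "machine-X": {"cloud-X"},
    "cloud-X": {"machine-X"},
}
orphans = find_orphaned_locations(links.keys(), {"machine-1"}, links)
print(sorted(orphans))  # ['cloud-X', 'machine-X']
```

Note that in this sketch, following <children> as well as <parent>
keeps machine-2 (a sibling of a referenced location). Whether that is
the desired behaviour is a design choice.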
_Optional second part: validating location deletions_
We could validate that we were right to delete those locations. When we
next start Brooklyn, we could look at the set of dangling references
[5]. If anything we deleted is now reported as a dangling reference,
then we'd report this error.
Is this worth doing? Would it be optional (because it requires being
able to class-load everything)?
_*Policy/Enricher deletion: proposed solution*_
We can apply the same logic for deleting policies/enrichers that have
become orphaned.
It is a lot easier to identify the policies/enrichers that are in use:
they are all directly referenced by an entity in the section
<enrichers>....</enrichers> or <policies>....</policies>.
Anything not referenced, we can delete.
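For illustration, the policy/enricher case reduces to a set difference.
A minimal Python sketch follows; the XML layout (ids wrapped in <string>
elements), the sample ids and the function names are assumptions, not
Brooklyn's actual format or API.

```python
import xml.etree.ElementTree as ET

def adjunct_ids_in_use(entity_xmls):
    """Ids listed under <policies> or <enrichers> in any entity memento."""
    in_use = set()
    for xml_text in entity_xmls:
        root = ET.fromstring(xml_text)
        for section in ("policies", "enrichers"):
            node = root.find(section)
            if node is not None:
                in_use |= {c.text.strip() for c in node if c.text and c.text.strip()}
    return in_use

def orphaned_adjuncts(persisted_ids, entity_xmls):
    """Persisted policy/enricher ids that no entity refers to."""
    return set(persisted_ids) - adjunct_ids_in_use(entity_xmls)

# Hypothetical sample, for illustration only.
entity_xml = """
<entity>
  <id>ent1</id>
  <policies>
    <string>pol1</string>
  </policies>
  <enrichers>
    <string>enr1</string>
  </enrichers>
</entity>
"""
persisted = {"pol1", "pol2", "enr1", "enr2"}
print(sorted(orphaned_adjuncts(persisted, [entity_xml])))  # ['enr2', 'pol2']
```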
Aled
[1] https://github.com/apache/brooklyn-server/pull/148
[2] https://github.com/apache/brooklyn-server/blob/0.9.0/core/src/main/java/org/apache/brooklyn/core/location/dynamic/LocationOwner.java#L64
[3] http://brooklyn.apache.org/v/0.9.0/ops/persistence/index.html#cli-commands-for-copying-state
[4] https://github.com/apache/brooklyn-server/blob/0.9.0/api/src/main/java/org/apache/brooklyn/api/mgmt/rebind/mementos/BrooklynMementoRawData.java
[5] https://github.com/apache/brooklyn-server/blob/0.9.0/api/src/main/java/org/apache/brooklyn/api/mgmt/rebind/RebindExceptionHandler.java#L55-L88
Re: [PROPOSAL] Deleting orphaned locations
Posted by Geoff Macartney <ge...@cloudsoftcorp.com>.
Sounds like a good idea. To clarify, does
> It is important that the tool is run offline, in case a Brooklyn server is in the middle of writing multiple new files.
mean that Brooklyn must *not* be running when you use this tool?
If so, can such a tool check this before it runs?
I think it would be worthwhile for the tool to have a ‘preview’ mode that displays the locations that would be deleted, so you could sanity-check them. It could also create an ‘archive log’ of the locations it purges, so that they can be reviewed at a later time.
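Something along these lines, perhaps. This is only a sketch in Python
under my own assumptions: the function name, the preview flag, the
JSON-lines archive format and the sample ids are all hypothetical, and
locations are modelled as a plain dictionary of id to raw XML.

```python
import json
from datetime import datetime, timezone

def purge_orphans(locations, orphan_ids, preview=True, archive_path=None):
    """Return a cleaned copy of `locations` (id -> raw XML); the input is untouched.

    In preview mode nothing is removed; the ids are only listed.
    If archive_path is given, each purged location is appended to that file.
    """
    doomed = {i: locations[i] for i in orphan_ids if i in locations}
    for loc_id in sorted(doomed):
        print(("would delete " if preview else "deleting ") + loc_id)
    if preview:
        return dict(locations)
    if archive_path is not None:
        with open(archive_path, "a", encoding="utf-8") as log:
            for loc_id in sorted(doomed):
                record = {
                    "id": loc_id,
                    "purged_at": datetime.now(timezone.utc).isoformat(),
                    "xml": doomed[loc_id],
                }
                log.write(json.dumps(record) + "\n")
    return {i: x for i, x in locations.items() if i not in doomed}

# Hypothetical data, for illustration only.
locations = {"loc-1": "<location/>", "loc-X": "<location/>"}
previewed = purge_orphans(locations, {"loc-X"})               # lists loc-X, removes nothing
cleaned = purge_orphans(locations, {"loc-X"}, preview=False)  # returns a copy without loc-X
```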
Geoff
————————————————————
Gnu PGP key - http://is.gd/TTTTuI
Re: [PROPOSAL] Deleting orphaned locations
Posted by Svetoslav Neykov <sv...@cloudsoftcorp.com>.
> _*Location deletion: proposed solution*_
+1
What happens to named locations? Do we have them as location objects that should be ignored during cleanup?
> _Optional second part: validating location deletions_
Not worth doing I think. Could be an additional step in the troubleshooting documentation.
> _*Policy/Enricher deletion: proposed solution*_
Is there currently a need for this? Have leaks been spotted in the wild? It would be trivial to add at a later point.
Svet.