Posted to dev@brooklyn.apache.org by Aled Sage <al...@gmail.com> on 2016/07/04 13:25:05 UTC

[PROPOSAL] Deleting orphaned locations

Hi all,

We are looking at implementing a "cleaner" that can remove orphaned 
locations from persisted state.

_*Problem statement*_
In older versions of Brooklyn (e.g. prior to [1]), we sometimes did not 
unmanage locations when the associated entity was deleted. This means 
that the persisted state for some customers contains many "orphaned 
locations" that are no longer referenced.

We want a way to safely delete these. We only want to delete locations 
that are not referenced.

These orphaned locations can also cause "dangling references" to be 
reported, where the orphaned location(s) hold references to things that 
have been deleted.

References to locations can take a few forms:

 1. Location is directly referenced from an entity's getLocations().
 2. Location is indirectly referenced from an entity (e.g. the location
    is the parent of another location that is referenced).
 3. Location is referenced by an entity in some other way (rather than
    getLocations()) - e.g. in a sensor or config key, such as [2].
 4. Location is referenced by a policy or enricher.

For (4), I can't think of any such use-case off-hand, but it's possible 
that a customer might write a bespoke policy/enricher that does this.

(2) means we need to worry about reachability. Note there might be 
groups of locations that are unreachable as a whole (e.g. location X and 
its parent refer to each other, but are not referenced by anything else).
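
As a rough illustration of the reachability check (a sketch only, not the 
proposed implementation; the function name find_orphans and the input 
shapes are assumptions made for the example), a breadth-first walk from 
the directly referenced locations finds everything that must be kept. 
Anything never visited is orphaned, including groups that only refer to 
each other:

```python
from collections import deque


def find_orphans(all_locations, roots, links):
    """Return the ids of locations that cannot be reached from any root.

    all_locations: set of persisted location ids (hypothetical input shape)
    roots: ids referenced directly by entities, policies or enrichers
    links: dict mapping a location id to the ids it links to (parent/children)
    """
    reachable = set()
    # Ignore roots that point at locations which are not persisted at all
    queue = deque(r for r in roots if r in all_locations)
    while queue:
        loc = queue.popleft()
        if loc in reachable:
            continue
        reachable.add(loc)
        for neighbour in links.get(loc, ()):
            if neighbour in all_locations and neighbour not in reachable:
                queue.append(neighbour)
    # A cycle such as X <-> parent-of-X is never visited, so it ends up here
    return all_locations - reachable
```

Whether to follow child links as well as parent links is a policy choice; 
this sketch simply follows whatever is supplied in links.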

_*Location deletion: proposed solution*_
We propose an offline tool, similar in use to copy-state [3], which will 
clean up the persisted state, and save the cleaned-up copy to a given 
location.

It is important that the tool is run offline, in case a Brooklyn server 
is in the middle of writing multiple new files.

Ideally this will not deserialize all the persisted state (so does not 
require classloading, etc). We'll therefore work with 
BrooklynMementoRawData [4], which would also let us run this outside of 
the Karaf container.

We can identify location references in the XML using a combination of 
the following techniques:

 1. The marker <locationProxy>...</locationProxy> for references inside
    config keys, sensors, etc.
 2. Inside an entity, the <locations>...</locations> section.
 3. Inside a location, the <parent>...</parent> and
    <children>...</children> section.

From (1) and (2), we'll identify all locations that are directly 
referenced. From (3), we'll identify the locations that are indirectly 
referenced via those. We'll then know we can delete all others.
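
A minimal sketch of pulling those references out of the raw XML without 
deserializing it. The tag names come from the list above; the assumption 
that ids inside <locations> and <children> are wrapped in child elements 
(such as <string>), and the function names, are illustrative guesses 
rather than a statement of the actual memento format:

```python
import xml.etree.ElementTree as ET


def _ids_under(element):
    # Collect the non-empty text of the element and all of its descendants,
    # so both <parent>id</parent> and <children><string>id</string></children>
    # style content is handled (the nesting is an assumption).
    return {e.text.strip() for e in element.iter() if e.text and e.text.strip()}


def entity_location_refs(entity_xml):
    """Location ids referenced by one entity's raw XML (techniques 1 and 2)."""
    root = ET.fromstring(entity_xml)
    refs = set()
    for tag in ("locations", "locationProxy"):
        for section in root.iter(tag):
            refs |= _ids_under(section)
    return refs


def location_links(location_xml):
    """Location ids that one location's raw XML links to (technique 3)."""
    root = ET.fromstring(location_xml)
    refs = set()
    for tag in ("parent", "children"):
        for section in root.iter(tag):
            refs |= _ids_under(section)
    return refs
```

The <parent> tag is only inspected in location files here, on the 
assumption that the same tag inside an entity file refers to the parent 
entity rather than to a location.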

_Optional second part: validating location deletions_
We could validate that we were right to delete those locations. When we 
next start Brooklyn, we could look at the set of dangling references 
[5]. If anything we deleted is now reported as a dangling reference, 
then we'd report this error.

Is this worth doing? Would it be optional (because it requires being 
able to class-load everything)?


_*Policy/Enricher deletion: proposed solution*_
We can apply the same logic for deleting policies/enrichers that have 
become orphaned.

It is a lot easier to identify the policies/enrichers that are in use: 
they are all directly referenced by an entity in the section 
<enrichers>...</enrichers> or <policies>...</policies>.

Anything not referenced, we can delete.
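
Sketched in the same spirit, this reduces to a set difference. The 
function names and the assumption that each id sits in a child element 
of <policies> or <enrichers> are hypothetical:

```python
import xml.etree.ElementTree as ET


def referenced_adjuncts(entity_xmls):
    """Ids listed under <policies> or <enrichers> in any entity's raw XML."""
    referenced = set()
    for xml_text in entity_xmls:
        root = ET.fromstring(xml_text)
        for tag in ("policies", "enrichers"):
            for section in root.iter(tag):
                for child in section.iter():
                    # Skip the section element itself; keep non-empty child text
                    if child is not section and child.text and child.text.strip():
                        referenced.add(child.text.strip())
    return referenced


def orphaned_adjuncts(persisted_ids, entity_xmls):
    """Persisted policy/enricher ids that no entity refers to."""
    return set(persisted_ids) - referenced_adjuncts(entity_xmls)
```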

Aled

[1] https://github.com/apache/brooklyn-server/pull/148
[2] https://github.com/apache/brooklyn-server/blob/0.9.0/core/src/main/java/org/apache/brooklyn/core/location/dynamic/LocationOwner.java#L64
[3] http://brooklyn.apache.org/v/0.9.0/ops/persistence/index.html#cli-commands-for-copying-state
[4] https://github.com/apache/brooklyn-server/blob/0.9.0/api/src/main/java/org/apache/brooklyn/api/mgmt/rebind/mementos/BrooklynMementoRawData.java
[5] https://github.com/apache/brooklyn-server/blob/0.9.0/api/src/main/java/org/apache/brooklyn/api/mgmt/rebind/RebindExceptionHandler.java#L55-L88


Re: [PROPOSAL] Deleting orphaned locations

Posted by Geoff Macartney <ge...@cloudsoftcorp.com>.
Sounds like a good idea.  To clarify, does

> It is important that the tool is run offline, in case a Brooklyn server is in the middle of writing multiple new files.

mean that Brooklyn must *not* be running when you use this tool?

If so, can such a tool check this before it runs?

Think it would be worthwhile for the tool to have a ‘preview’ mode, to display locations that would be deleted, so you could sanity check them. Maybe also create an ‘archive log’ of the locations it purges, so that they can be reviewed at a later time.
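
A small sketch of how a preview mode and an archive could fit together, 
assuming the tool already has the raw location XML keyed by id and a set 
of orphan ids (plan_cleanup and its parameters are hypothetical names):

```python
def plan_cleanup(locations, orphans, preview=True):
    """Split persisted locations into those kept and those archived.

    locations: dict mapping location id -> raw XML text
    orphans: ids that the cleaner has decided are unreferenced
    preview: when True nothing is removed; the archive only shows what
             would be purged, so it can be sanity-checked first
    Returns a (kept, archived) pair of dicts.
    """
    # Only ids that actually exist in the persisted state can be archived
    archived = {loc_id: locations[loc_id]
                for loc_id in sorted(orphans) if loc_id in locations}
    if preview:
        return dict(locations), archived
    kept = {loc_id: xml for loc_id, xml in locations.items()
            if loc_id not in archived}
    return kept, archived
```

Writing the archived dict out alongside the cleaned-up copy would give 
the log of purged locations for later review.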

Geoff








Re: [PROPOSAL] Deleting orphaned locations

Posted by Svetoslav Neykov <sv...@cloudsoftcorp.com>.
> _*Location deletion: proposed solution*_

+1

What happens to named locations? Do we have them as location objects that should be ignored during cleanup?


> _Optional second part: validating location deletions_

Not worth doing I think. Could be an additional step in the troubleshooting documentation.

> _*Policy/Enricher deletion: proposed solution*_


Is there currently a need for this? Have leaks been spotted in the wild? It would be trivial to add at a later point.

Svet.


