Posted to user@metron.apache.org by Nick Allen <ni...@nickallen.org> on 2019/08/13 21:53:19 UTC

[DISCUSS] Deprecate Least Recently Used Pruner

As part of https://github.com/apache/metron/pull/1470, I found it difficult
to update the "Least Recently Used Pruner" to work with HBase 2.0.2.  I am
sure that given more time and effort, I could make it work, but is it worth
it?

This is a feature that I myself am not familiar with. I do not know of
anyone using this.  I also did not find much documentation on how to use
this feature.  I certainly don't know the entire user community, so please
let me know if anyone is using this functionality or believes that it
should be maintained going forward.

Would you support deprecating this feature?

Thanks

Re: [DISCUSS] Deprecate Least Recently Used Pruner

Posted by James Sirota <js...@apache.org>.
I think setting TTL values for enrichments should be fine for the time being.  I am not sure what the original issue was that made us feel TTL was not sufficient (probably the performance implications on HBase?), but I think it should be safe to revert to setting a TTL.

James 

13.08.2019, 15:44, "Casey Stella" <ce...@gmail.com>:
> Ah, that feature. Yes, it never seemed to catch on. It actually wasn't
> from OpenSOC, but a very early feature of Metron. The use case was that
> enrichments may go stale, and removing them based on TTL was easy to do but
> not ideal. The LeastRecentlyUsedPruner was an MR job that pruned
> enrichments which had not been *read* in x amount of time. It did this by
> capturing Bloom filters of the enrichment keys used during a time range; the
> MR job would then use those Bloom filters to determine which keys to remove.
>
> I'd be OK with it being either kept or removed. It's unclear to me whether
> the use case of pruning HBase based on usage was as valid as we thought. I
> guess that makes me +0 on the request to deprecate.
>
> On Tue, Aug 13, 2019 at 6:28 PM Nick Allen <ni...@nickallen.org> wrote:
>
>>  Sure. I should have provided some more context. I can tell you what I do
>>  know about it. Perhaps others can provide some more color.
>>
>>     - This is functionality accessed by a user by running the script:
>>     ${METRON_HOME}/bin/threatintel_bulk_prune.sh
>>
>>     - If you are using access trackers with your HBase enrichments, it runs
>>     as an MR job that counts the number of times each Enrichment is used.
>>     I am assuming that it then prunes those that are less frequently accessed.
>>
>>     - It was originally created here:
>>     https://github.com/apache/metron/pull/22
>>
>>  On Tue, Aug 13, 2019 at 6:11 PM Otto Fowler <ot...@gmail.com>
>>  wrote:
>>
>>  > Can you summarize what it does? Is it from OpenSOC?
>>  >
>>  >
>>  >
>>  >
>>  > On August 13, 2019 at 17:53:52, Nick Allen (nick@nickallen.org) wrote:
>>  >
>>  > As part of https://github.com/apache/metron/pull/1470, I found it
>>  > difficult to update the "Least Recently Used Pruner" to work with HBase
>>  > 2.0.2. I am sure that given more time and effort, I could make it work,
>>  > but is it worth it?
>>  >
>>  > This is a feature that I myself am not familiar with. I do not know of
>>  > anyone using this. I also did not find much documentation on how to use
>>  > this feature. I certainly don't know the entire user community, so please
>>  > let me know if anyone is using this functionality or believes that it
>>  > should be maintained going forward.
>>  >
>>  > Would you support deprecating this feature?
>>  >
>>  > Thanks
>>  >

------------------- 
Thank you,

James Sirota
PMC- Apache Metron
jsirota AT apache DOT org


Re: [DISCUSS] Deprecate Least Recently Used Pruner

Posted by Casey Stella <ce...@gmail.com>.
Ah, that feature.  Yes, it never seemed to catch on.  It actually wasn't
from OpenSOC, but a very early feature of Metron.  The use case was that
enrichments may go stale, and removing them based on TTL was easy to do but
not ideal.  The LeastRecentlyUsedPruner was an MR job that pruned
enrichments which had not been *read* in x amount of time.  It did this by
capturing Bloom filters of the enrichment keys used during a time range; the
MR job would then use those Bloom filters to determine which keys to remove.
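
For illustration, here is a minimal Python sketch of the idea described
above: record the keys read during each time range in a Bloom filter, then
prune any stored key that none of the recent filters reports as read. This
is not Metron's actual implementation (that was a MapReduce job over HBase).
The BloomFilter class, the prune_unread function, the in-memory dict standing
in for the enrichment table, and the sample keys are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def __contains__(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def prune_unread(store, recent_filters):
    """Delete keys from `store` that no recent filter reports as read.

    Returns the list of pruned keys.
    """
    stale = [k for k in store if not any(k in f for f in recent_filters)]
    for k in stale:
        del store[k]
    return stale


# Hypothetical enrichment store: key -> enrichment value.
enrichments = {
    "10.0.0.1": "malicious",
    "10.0.0.2": "benign",
    "evil.example.com": "phishing",
}

# One filter per time range; each records the keys read during that range.
window_1, window_2 = BloomFilter(), BloomFilter()
window_1.add("10.0.0.1")
window_2.add("evil.example.com")

pruned = prune_unread(enrichments, [window_1, window_2])
print("pruned:", pruned)
```

Because a Bloom filter can report false positives but never false negatives,
a key that was read in a retained time range is never pruned by this scheme,
while an unread key may occasionally survive a pruning pass. A TTL, by
contrast, expires entries based on when they were written, regardless of
whether they are still being read.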

I'd be OK with it being either kept or removed.  It's unclear to me whether
the use case of pruning HBase based on usage was as valid as we thought.  I
guess that makes me +0 on the request to deprecate.

On Tue, Aug 13, 2019 at 6:28 PM Nick Allen <ni...@nickallen.org> wrote:

> Sure.  I should have provided some more context.  I can tell you what I do
> know about it.  Perhaps others can provide some more color.
>
>    - This is functionality accessed by a user by running the script:
>    ${METRON_HOME}/bin/threatintel_bulk_prune.sh
>
>
>    - If you are using access trackers with your HBase enrichments, it runs
>    as an MR job that counts the number of times each Enrichment is used.
>    I am assuming that it then prunes those that are less frequently accessed.
>
>
>    - It was originally created here:
>    https://github.com/apache/metron/pull/22
>
>
> On Tue, Aug 13, 2019 at 6:11 PM Otto Fowler <ot...@gmail.com>
> wrote:
>
> > Can you summarize what it does? Is it from OpenSOC?
> >
> >
> >
> >
> > On August 13, 2019 at 17:53:52, Nick Allen (nick@nickallen.org) wrote:
> >
> > As part of https://github.com/apache/metron/pull/1470, I found it
> > difficult to update the "Least Recently Used Pruner" to work with HBase
> > 2.0.2. I am sure that given more time and effort, I could make it work,
> > but is it worth it?
> >
> > This is a feature that I myself am not familiar with. I do not know of
> > anyone using this. I also did not find much documentation on how to use
> > this feature. I certainly don't know the entire user community, so please
> > let me know if anyone is using this functionality or believes that it
> > should be maintained going forward.
> >
> > Would you support deprecating this feature?
> >
> > Thanks
> >
>

Re: [DISCUSS] Deprecate Least Recently Used Pruner

Posted by Nick Allen <ni...@nickallen.org>.
Sure.  I should have provided some more context.  I can tell you what I do
know about it.  Perhaps others can provide some more color.

   - This is functionality accessed by a user by running the script:
   ${METRON_HOME}/bin/threatintel_bulk_prune.sh


   - If you are using access trackers with your HBase enrichments, it runs
   as an MR job that counts the number of times each Enrichment is used.  I am
   assuming that it then prunes those that are less frequently accessed.


   - It was originally created here:
   https://github.com/apache/metron/pull/22


On Tue, Aug 13, 2019 at 6:11 PM Otto Fowler <ot...@gmail.com> wrote:

> Can you summarize what it does? Is it from OpenSOC?
>
>
>
>
> On August 13, 2019 at 17:53:52, Nick Allen (nick@nickallen.org) wrote:
>
> As part of https://github.com/apache/metron/pull/1470, I found it difficult
> to update the "Least Recently Used Pruner" to work with HBase 2.0.2. I am
> sure that given more time and effort, I could make it work, but is it worth
> it?
>
> This is a feature that I myself am not familiar with. I do not know of
> anyone using this. I also did not find much documentation on how to use
> this feature. I certainly don't know the entire user community, so please
> let me know if anyone is using this functionality or believes that it
> should be maintained going forward.
>
> Would you support deprecating this feature?
>
> Thanks
>

Re: [DISCUSS] Deprecate Least Recently Used Pruner

Posted by Otto Fowler <ot...@gmail.com>.
Can you summarize what it does? Is it from OpenSOC?




On August 13, 2019 at 17:53:52, Nick Allen (nick@nickallen.org) wrote:

As part of https://github.com/apache/metron/pull/1470, I found it difficult
to update the "Least Recently Used Pruner" to work with HBase 2.0.2. I am
sure that given more time and effort, I could make it work, but is it worth
it?

This is a feature that I myself am not familiar with. I do not know of
anyone using this. I also did not find much documentation on how to use
this feature. I certainly don't know the entire user community, so please
let me know if anyone is using this functionality or believes that it
should be maintained going forward.

Would you support deprecating this feature?

Thanks