
Posted to dev@metron.apache.org by "Debo Dutta (dedutta)" <de...@cisco.com> on 2016/06/03 07:43:06 UTC

ML features for Metron

Hi

Wondering if anyone is interested in starting a discussion on what kind of machine-learning-based features would be good for Metron. Would love to have the SOC users chime in on the dev list.

The result of the discussion could lead to JIRA items.

thx
debo

Re: ML features for Metron

Posted by Debojyoti Dutta <dd...@gmail.com>.

Thanks Yazan ... these seem like great use cases. Online
clustering/classification makes sense and Metron could leverage Spark....
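
As a rough illustration of the online clustering idea, here is a minimal
sequential k-means update in plain Python. It is only a sketch and not
Spark's MLlib API: the function name online_kmeans_update, the toy
two-dimensional points, and the initial centroids are all assumptions made up
for this example.

```python
def online_kmeans_update(centroids, counts, point):
    """Assign one point to its nearest centroid and nudge that centroid.

    centroids: list of coordinate lists, updated in place.
    counts: number of points absorbed by each centroid so far.
    Returns the index of the centroid the point was assigned to.
    """
    best = min(
        range(len(centroids)),
        key=lambda i: sum((c - p) ** 2 for c, p in zip(centroids[i], point)),
    )
    counts[best] += 1
    # A learning rate of 1/count keeps each centroid at the running mean
    # of the points assigned to it (including its initial position).
    rate = 1.0 / counts[best]
    centroids[best] = [c + rate * (p - c) for c, p in zip(centroids[best], point)]
    return best


# Hypothetical stream of feature vectors, processed one at a time.
centroids = [[0.0, 0.0], [10.0, 10.0]]
counts = [1, 1]
stream = [[1.0, 1.0], [9.0, 11.0], [2.0, 0.0]]
assignments = [online_kmeans_update(centroids, counts, p) for p in stream]
```

Each event is handled as it arrives and only the centroids and counts are
kept, which is the property that makes this style of clustering usable in
real time.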

On Sat, Jun 4, 2016 at 8:02 AM, Yazan Boshmaf <bo...@ece.ubc.ca> wrote:

> One use case of Apache Metron (or OpenSOC) is to analyze amplification DDoS
> attacks <https://www.internetsociety.org/sites/default/files/01_5.pdf>.
>
> With honeypots as information sources (e.g., AmpPot
> <http://www.christian-rossow.de/publications/amppot-raid2015.pdf>), you
> have the typical UDP/IP features (IP addresses, timestamps, protocols,
> ports, payload, etc.), which get enriched with reverse IP data,
> geolocation, etc. Some of these attributes can be used as features to
> identify and characterize types of reflection attacks (e.g., exploiting
> NTP, DNS resolvers, or even RIPv1). Also, it is important to distinguish
> attackers from scanners, using certain features like timestamp
> synchronization across honeypots, as scanners tend to go through IP blocks
> one by one, unlike actual attacks.
>
> These are some of the attributes one might consider for this use case. It
> would be nice to have something that does online learning and analytics, so
> clustering / classification is done in real-time. Maybe Apache Spark's
> MLlib?
>
> All the best,
> Yazan
>
> On Sat, Jun 4, 2016 at 4:59 PM, Zeolla@GMail.com <ze...@gmail.com> wrote:
>
> > I'm in
> >
> > On Sat, Jun 4, 2016, 09:53 Yazan Boshmaf <bo...@ece.ubc.ca> wrote:
> >
> > > Me too.
> > >
> > > On Sat, Jun 4, 2016 at 9:43 AM, Franck Vervial <ve...@gmail.com> wrote:
> > >
> > > > hi,
> > > >
> > > > i am interested.
> > > >
> > > > regards
> > > > On Fri, 3 Jun 2016 at 3:43 PM, Debo Dutta (dedutta) <dedutta@cisco.com> wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > Wondering if anyone is interested in starting a discussion on what kind of
> > > > > machine learning based features would be good for Metron …. Would love to
> > > > > have the SOC users chime in on the dev list.
> > > > >
> > > > > The result of the discussion could lead to JIRA items.
> > > > >
> > > > > thx
> > > > > debo
> > > > >
> > > >
> > >
> > --
> >
> > Jon
> >
>



-- 
-Debo~

Re: ML features for Metron

Posted by Egon Kidmose <ki...@gmail.com>.

Hi Yazan, others

I've run through it and added some of my ideas.
This is my first time with user stories, so please provide any constructive
feedback whatsoever, and forgive me for breaking any conventions, of which
I know none :)

My input revolves around exploiting the fact that a SOC generates labels for
free when operating, which in turn can be used for training/evaluating ML
models to assist the SOC operation.
My idea is to keep the human in the loop, to enable both supervised and
unsupervised methods, and to aid academics with methods for obtaining labels
for testing, as that is the single greatest problem (from this here tree I'm
sitting in..)

In brief, the stories I added can be described as follows:

S5 and S6: Alerts (security events, such as IDS alerts) and whether they
are true or false
S7 and S8: Link between events and incidents
S9 and S10: incident management

S5, S7 and S9: Users in the SOC manually labeling data
S6, S8 and S10: ML models providing outputs for users
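
To make the S5/S6 pairing concrete, here is a small Python sketch of how
analyst verdicts on alerts could be reused to score a model. The Verdict
class, the evaluate function, and the alert ids are hypothetical names
invented for this example; nothing here is an existing Metron interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    alert_id: str
    is_true_positive: bool  # the analyst's call, recorded during triage


def evaluate(predictions, verdicts):
    """Score model predictions {alert_id: bool} against analyst verdicts.

    Returns (precision, recall) over the alerts that have a verdict.
    Alerts without a prediction are treated as predicted False.
    """
    tp = fp = fn = 0
    for verdict in verdicts:
        predicted = predictions.get(verdict.alert_id, False)
        if predicted and verdict.is_true_positive:
            tp += 1
        elif predicted:
            fp += 1
        elif verdict.is_true_positive:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# S5: labels produced by users in the SOC while they work.
verdicts = [
    Verdict("a1", True),
    Verdict("a2", False),
    Verdict("a3", True),
    Verdict("a4", False),
]
# S6: outputs of some model for the same alerts.
predictions = {"a1": True, "a2": True, "a3": False, "a4": False}
precision, recall = evaluate(predictions, verdicts)
```

The same verdict records could serve as training labels for supervised
methods or as a held-out test set for unsupervised ones.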





Mvh. / BR
Egon Kidmose

On Thu, Jun 9, 2016 at 10:56 PM, Yazan Boshmaf <bo...@ece.ubc.ca> wrote:

> That's a great idea.
>
> Here's a link to an editable Google Doc with an initial draft of user
> stories: https://goo.gl/QAxiH6
>
> Please give it a pass and let's iterate over it.

Re: ML features for Metron

Posted by Yazan Boshmaf <bo...@ece.ubc.ca>.

That's a great idea.

Here's a link to an editable Google Doc with an initial draft of user
stories: https://goo.gl/QAxiH6

Please give it a pass and let's iterate over it.

On Thu, Jun 9, 2016 at 10:44 PM, Casey Stella <ce...@gmail.com> wrote:

> +1 on the google doc idea.
>
> I think any solution should include a framework that allows the user to
>
>    - Manage the training of their models
>    - Manage the deployment of their models without stopping the topologies
>    (i.e. hot loading of models)
>    - Apply their models
>
> I'd also very much like to see support for
>
>    - both small-data ML libraries (e.g. scikit-learn) and big-data ML
>    libraries (e.g. MLlib)
>    - popular non-Java languages (e.g. Python and R)
>

Re: ML features for Metron

Posted by Casey Stella <ce...@gmail.com>.

+1 on the google doc idea.

I think any solution should include a framework that allows the user to

   - Manage the training of their models
   - Manage the deployment of their models without stopping the topologies
   (i.e. hot loading of models)
   - Apply their models

I'd also very much like to see support for

   - both small-data ML libraries (e.g. scikit-learn) and big-data ML
   libraries (e.g. MLlib)
   - popular non-Java languages (e.g. Python and R)
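
A minimal Python sketch of the hot-loading point above, assuming a
hypothetical ModelRegistry class that holds the currently deployed model.
The class, its method names, and the toy "bytes" feature are assumptions for
illustration only, not an existing Metron or Storm API.

```python
import threading


class ModelRegistry:
    """Hypothetical holder for the model a running topology applies."""

    def __init__(self, model, version):
        self._lock = threading.Lock()
        self._current = (version, model)

    def deploy(self, model, version):
        # Swap the reference under a lock. Callers that already fetched
        # the old model finish with it; nothing has to be stopped.
        with self._lock:
            self._current = (version, model)

    def apply(self, features):
        with self._lock:
            version, model = self._current
        # Score outside the lock so a slow model does not block deploys.
        return version, model(features)


# Models are plain callables here; real ones might wrap scikit-learn or MLlib.
registry = ModelRegistry(lambda f: f["bytes"] > 1000, version="v1")
before = registry.apply({"bytes": 600})
registry.deploy(lambda f: f["bytes"] > 500, version="v2")
after = registry.apply({"bytes": 600})
```

Returning the version next to each score makes it possible to tell later
which model produced a given output.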


On Thu, Jun 9, 2016 at 3:33 PM, Debo Dutta (dedutta) <de...@cisco.com>
wrote:

> Haven't seen one. Hence I started a thread.
>
> Metron is a community project so please feel free to start a google doc.
>
> And then we can get feedback from the users.
>
> Thx
> Debo
>
> Sent from my iPhone

Re: ML features for Metron

Posted by "Debo Dutta (dedutta)" <de...@cisco.com>.

Haven't seen one. Hence I started a thread. 

Metron is a community project so please feel free to start a google doc. 

And then we can get feedback from the users. 

Thx 
Debo

Sent from my iPhone

> On Jun 9, 2016, at 12:28 PM, Yazan Boshmaf <bo...@ece.ubc.ca> wrote:
> 
> Do we have a roadmap for ML support in Metron? If not, how can someone reach
> out to existing users of Metron and get more input so that we at least
> collect functional requirements?
> 
> From my side, I can share some of the nice-to-have features from a research
> perspective (i.e., features that would make Metron a better platform to
> conduct cybersecurity research).
> 
> All the best,
> Yazan

Re: ML features for Metron

Posted by Yazan Boshmaf <bo...@ece.ubc.ca>.

Do we have a roadmap for ML support in Metron? If not, how can someone reach
out to existing users of Metron and get more input so that we at least
collect functional requirements?

From my side, I can share some of the nice-to-have features from a research
perspective (i.e., features that would make Metron a better platform to
conduct cybersecurity research).

All the best,
Yazan

On Mon, Jun 6, 2016 at 10:12 AM, Debojyoti Dutta <dd...@gmail.com> wrote:

> Thx Egon. The idea of labeled data collection is awesome; otherwise we have
> to resort to unsupervised methods alone. Maybe one of the things the website
> could do is to point to labeled data contributed by users of Metron.

Re: ML features for Metron

Posted by Debojyoti Dutta <dd...@gmail.com>.

Thx Egon. The idea of labeled data collection is awesome; otherwise we have
to resort to unsupervised methods alone. Maybe one of the things the website
could do is to point to labeled data contributed by users of Metron.

On Mon, Jun 6, 2016 at 12:03 AM, Egon Kidmose <ki...@gmail.com> wrote:

> Hi all,
>
> I'd be interested in joining that discussion.
>
> I'm a PhD student applying ML in the security monitoring domain.
> It is my expectation that I'll be able to contribute some event
> correlation and alert filtering methods.
> (Correlation: Finding events that are relevant to each other. Filtering:
> Suppressing false alerts from e.g. IDSs, or picking out the relevant ones)
> You'll see a PR as soon as I have something that is somewhat ready.
>
> A particularly interesting issue (to me at least) is the possibility of
> using a real, running SOC as the "label factory" for labelled data.
> Getting real data with labels for supervised methods is one of the great
> challenges, and I see quite some potential for Metron here.
>
>
> Mvh. / BR
> Egon Kidmose



-- 
-Debo~

Re: ML features for Metron

Posted by Egon Kidmose <ki...@gmail.com>.

Hi all,

I'd be interested in joining that discussion.

I'm a PhD student applying ML in the security monitoring domain.
I expect to be able to contribute some event correlation and alert
filtering methods.
(Correlation: finding events that are relevant to each other. Filtering:
suppressing false alerts from e.g. IDSs, or picking out the relevant ones.)
You'll see a PR as soon as I have something that is somewhat ready.

A particularly interesting issue (to me at least) is the possibility of
using a real, running SOC as the "label factory" for labelled data.
Getting real data with labels for supervised methods is one of the great
challenges, and I see quite some potential for Metron here.


Mvh. / BR
Egon Kidmose

On Sat, Jun 4, 2016 at 5:02 PM, Yazan Boshmaf <bo...@ece.ubc.ca> wrote:

> One use case of Apache Metron (or OpenSOC) is to analyze amplification DDoS
> attacks <https://www.internetsociety.org/sites/default/files/01_5.pdf>.
>
> With honeypots as information sources (e.g., AmpPot
> <http://www.christian-rossow.de/publications/amppot-raid2015.pdf>), you
> have the typical UDP/IP features (IP addresses, timestamps, protocols,
> ports, payload, etc.), which get enriched with reverse IP data,
> geolocation, etc. Some of these attributes can be used as features to
> identify and characterize types of reflection attacks (e.g., exploiting
> NTP, DNS resolvers, or even RIPv1). Also, it is important to distinguish
> attackers from scanners, using certain features like timestamp
> synchronization across honeypots, as scanners tend to go through IP blocks,
> one by one, as compared to actual attacks.
>
> These are some of the attributes one might consider for this use case. It
> would be nice to have something that does online learning and analytics, so
> clustering / classification is done in real-time. Maybe Apache Spark's
> MLlib?
>
> All the best,
> Yazan
>
> On Sat, Jun 4, 2016 at 4:59 PM, Zeolla@GMail.com <ze...@gmail.com> wrote:
>
> > I'm in
> >
> > On Sat, Jun 4, 2016, 09:53 Yazan Boshmaf <bo...@ece.ubc.ca> wrote:
> >
> > > Me too.
> > >
> > > On Sat, Jun 4, 2016 at 9:43 AM, Franck Vervial <ve...@gmail.com> wrote:
> > >
> > > > hi,
> > > >
> > > > i am interested.
> > > >
> > > > regards
> > > > On Fri, 3 Jun 2016 at 3:43 PM, Debo Dutta (dedutta) <dedutta@cisco.com> wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > Wondering if anyone is interested in starting a discussion on what kind of
> > > > > machine learning based features would be good for Metron …. Would love to
> > > > > have the SOC users chime in on the dev list.
> > > > >
> > > > > The result of the discussion could lead to JIRA items.
> > > > >
> > > > > thx
> > > > > debo
> > > > >
> > > >
> > >
> > --
> >
> > Jon
> >
>

Re: ML features for Metron

Posted by Yazan Boshmaf <bo...@ece.ubc.ca>.

One use case of Apache Metron (or OpenSOC) is to analyze amplification DDoS
attacks <https://www.internetsociety.org/sites/default/files/01_5.pdf>.

With honeypots as information sources (e.g., AmpPot
<http://www.christian-rossow.de/publications/amppot-raid2015.pdf>), you
have the typical UDP/IP features (IP addresses, timestamps, protocols,
ports, payload, etc.), which get enriched with reverse IP data,
geolocation, etc. Some of these attributes can be used as features to
identify and characterize types of reflection attacks (e.g., exploiting
NTP, DNS resolvers, or even RIPv1). Also, it is important to distinguish
attackers from scanners, using certain features like timestamp
synchronization across honeypots, as scanners tend to go through IP blocks,
one by one, as compared to actual attacks.
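
As a rough illustration of the scanner-versus-attack distinction above, here
is a minimal sketch in Python. It is a hand-set rule, not a learned model and
not anything that exists in Metron: the event layout, the function name
label_sources, and both thresholds (min_packets, max_start_skew) are
assumptions made up for this example. The idea is only that an attack shows
up as sustained traffic that starts on several honeypots at about the same
time, while a scanner sends a few probes and moves on.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (source IP, honeypot ID, timestamp in seconds); the layout is illustrative.
# In a reflection attack the source IP is spoofed, so it identifies the victim.
Event = Tuple[str, str, float]


def label_sources(events: Iterable[Event],
                  min_packets: int = 100,
                  max_start_skew: float = 5.0) -> Dict[str, str]:
    """Label each source IP as 'attack' or 'scanner' using two hand-set thresholds."""
    # source IP -> honeypot ID -> timestamps of the requests seen there
    seen: Dict[str, Dict[str, List[float]]] = defaultdict(lambda: defaultdict(list))
    for src_ip, honeypot, ts in events:
        seen[src_ip][honeypot].append(ts)

    labels = {}
    for src_ip, by_honeypot in seen.items():
        total = sum(len(ts) for ts in by_honeypot.values())
        first_seen = [min(ts) for ts in by_honeypot.values()]
        # Small skew: the honeypots started receiving traffic at about the same time.
        skew = max(first_seen) - min(first_seen)
        sustained = total >= min_packets
        synchronized = skew <= max_start_skew
        labels[src_ip] = "attack" if sustained and synchronized else "scanner"
    return labels


# Hypothetical data: a scanner probing three honeypots a minute apart, and a
# spoofed victim address hitting two honeypots almost simultaneously.
scanner = [("198.51.100.7", hp, t)
           for hp, t in [("hp1", 0.0), ("hp2", 60.0), ("hp3", 120.0)]]
attack = [("203.0.113.9", hp, start + 0.01 * i)
          for hp, start in [("hp1", 10.0), ("hp2", 11.0)]
          for i in range(200)]
labels = label_sources(scanner + attack)
```

In practice the two thresholds would have to be tuned on real honeypot data,
or replaced by a trained classifier that takes the same two quantities as
features.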

These are some of the attributes one might consider for this use case. It
would be nice to have something that does online learning and analytics, so
clustering / classification is done in real-time. Maybe Apache Spark's
MLlib?
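
To show what "online" clustering means here, below is a minimal sketch of
sequential k-means in plain Python, where each centroid is nudged towards
every new point as it arrives. This is not MLlib's API (MLlib does ship a
streaming k-means, which would be the natural thing to evaluate); the class
name OnlineKMeans and the two-dimensional feature vectors are assumptions
made up for this example.

```python
import math
from typing import List, Sequence


class OnlineKMeans:
    """Sequential k-means: each centroid is the running mean of its assigned points."""

    def __init__(self, initial_centroids: Sequence[Sequence[float]]):
        self.centroids: List[List[float]] = [list(c) for c in initial_centroids]
        self.counts = [0] * len(self.centroids)

    def nearest(self, x: Sequence[float]) -> int:
        """Index of the centroid closest to x (Euclidean distance)."""
        return min(range(len(self.centroids)),
                   key=lambda i: math.dist(x, self.centroids[i]))

    def update(self, x: Sequence[float]) -> int:
        """Assign x to its nearest centroid, move that centroid, return its index."""
        i = self.nearest(x)
        self.counts[i] += 1
        rate = 1.0 / self.counts[i]  # running-mean step size
        self.centroids[i] = [c + rate * (v - c)
                             for c, v in zip(self.centroids[i], x)]
        return i


# Hypothetical two-feature events arriving one at a time.
model = OnlineKMeans([[0.0, 0.0], [10.0, 10.0]])
assignments = [model.update(x) for x in ([1.0, 1.0], [3.0, 3.0], [9.0, 11.0])]
```

Because the first point assigned to a cluster gets a step size of 1, the
initial centroids act only as seeds and are overwritten by the data. A real
deployment would also need feature scaling and some way to forget old
traffic.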

All the best,
Yazan

On Sat, Jun 4, 2016 at 4:59 PM, Zeolla@GMail.com <ze...@gmail.com> wrote:

> I'm in
>
> On Sat, Jun 4, 2016, 09:53 Yazan Boshmaf <bo...@ece.ubc.ca> wrote:
>
> > Me too.
> >
> > On Sat, Jun 4, 2016 at 9:43 AM, Franck Vervial <ve...@gmail.com> wrote:
> >
> > > hi,
> > >
> > > i am interested.
> > >
> > > regards
> > > On Fri, 3 Jun 2016 at 3:43 PM, Debo Dutta (dedutta) <dedutta@cisco.com> wrote:
> > >
> > > > Hi
> > > >
> > > > Wondering if anyone is interested in starting a discussion on what kind of
> > > > machine learning based features would be good for Metron …. Would love to
> > > > have the SOC users chime in on the dev list.
> > > >
> > > > The result of the discussion could lead to JIRA items.
> > > >
> > > > thx
> > > > debo
> > > >
> > >
> >
> --
>
> Jon
>

Re: ML features for Metron

Posted by "Zeolla@GMail.com" <ze...@gmail.com>.

I'm in

On Sat, Jun 4, 2016, 09:53 Yazan Boshmaf <bo...@ece.ubc.ca> wrote:

> Me too.
>
> On Sat, Jun 4, 2016 at 9:43 AM, Franck Vervial <ve...@gmail.com> wrote:
>
> > hi,
> >
> > i am interested.
> >
> > regards
> > On Fri, 3 Jun 2016 at 3:43 PM, Debo Dutta (dedutta) <de...@cisco.com>
> > wrote:
> >
> > > Hi
> > >
> > > Wondering if anyone is interested in starting a discussion on what kind of
> > > machine learning based features would be good for Metron …. Would love to
> > > have the SOC users chime in on the dev list.
> > >
> > > The result of the discussion could lead to JIRA items.
> > >
> > > thx
> > > debo
> > >
> >
>
-- 

Jon

Re: ML features for Metron

Posted by Yazan Boshmaf <bo...@ece.ubc.ca>.

Me too.

On Sat, Jun 4, 2016 at 9:43 AM, Franck Vervial <ve...@gmail.com> wrote:

> hi,
>
> i am interested.
>
> regards
> On Fri, 3 Jun 2016 at 3:43 PM, Debo Dutta (dedutta) <de...@cisco.com>
> wrote:
>
> > Hi
> >
> > Wondering if anyone is interested in starting a discussion on what kind of
> > machine learning based features would be good for Metron …. Would love to
> > have the SOC users chime in on the dev list.
> >
> > The result of the discussion could lead to JIRA items.
> >
> > thx
> > debo
> >
>

Re: ML features for Metron

Posted by Franck Vervial <ve...@gmail.com>.

hi,

i am interested.

regards
On Fri, 3 Jun 2016 at 3:43 PM, Debo Dutta (dedutta) <de...@cisco.com>
wrote:

> Hi
>
> Wondering if anyone is interested in starting a discussion on what kind of
> machine learning based features would be good for Metron …. Would love to
> have the SOC users chime in on the dev list.
>
> The result of the discussion could lead to JIRA items.
>
> thx
> debo
>