Posted to user@spot.apache.org by "D.Anil" <an...@gmail.com> on 2017/05/31 21:30:55 UTC

Queries related to model building and ODM integration

Hi,



I have the queries below, for which I have been searching for answers in
the current SPOT code base. I would like to get help from the team.



1) I want a setup where I build a model from training data and then use
that model to generate scores for real-time records, since the model acts
as a kind of definition for filtering/analyzing records. I would like to
know how I can achieve that with SPOT for netflow, dns and proxy data.
Currently I see that SPOT ML creates a new model for each day that we run
ML. How can it take historical data into account when detecting anomalies
in the current day's events?

2) I believe that ODM is not yet integrated as part of SPOT. Please correct
me if I am wrong. I would like to know if there is a tentative plan to
integrate ODM with SPOT.



Awaiting your response. Thanks for all the help from the support team with
SPOT setup and configuration.



Thanks,

Anil.
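
A minimal sketch of the train-once, score-later setup asked about in
question 1 follows. It is not SPOT's ML pipeline or API: the functions
`train` and `score`, the field names, and the simple per-field frequency
model are all assumptions chosen for illustration. The idea is only that a
model fitted on several days of history can be kept and reused to score
new records as they arrive.

```python
import math
from collections import Counter


def train(history, fields):
    """Fit a per-field frequency model on historical records (hypothetical helper)."""
    counts = {f: Counter(rec[f] for rec in history) for f in fields}
    return {"fields": list(fields), "counts": counts, "total": len(history)}


def score(model, record):
    """Return the mean negative log-likelihood of a record; higher means rarer."""
    total = model["total"]
    nll = 0.0
    for f in model["fields"]:
        seen = model["counts"][f]
        # Laplace smoothing: values never seen in history share one extra bucket,
        # so they get a small nonzero probability instead of zero.
        p = (seen[record[f]] + 1) / (total + len(seen) + 1)
        nll -= math.log(p)
    return nll / len(model["fields"])


# Illustrative data only: "history" stands in for records from previous days.
history = [{"proto": "TCP", "dst_port": 443}] * 98 + [{"proto": "UDP", "dst_port": 53}] * 2
model = train(history, ["proto", "dst_port"])

common = score(model, {"proto": "TCP", "dst_port": 443})
rare = score(model, {"proto": "UDP", "dst_port": 53})
unseen = score(model, {"proto": "ICMP", "dst_port": 0})
```

In this sketch the model is built once from the historical records and the
same `model` object is then used for every incoming record, so records
whose field values were rare or absent in the history receive higher
scores.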

Re: Queries related to model building and ODM integration

Posted by "D.Anil" <an...@gmail.com>.
Hello Cesar,

Thank you for your prompt response regarding ODM. I appreciate the
thoughts and efforts from the team.

We have good expertise in database design and a good understanding of
security-related data. There are two of us, and we would be happy to take
up the tasks related to ODM along with the team and be glad to become SPOT
contributors.

Please let us know the procedure to proceed further. Looking forward to
hearing from you.

Thanks,
Anil.


On Thu, Jun 1, 2017 at 4:56 AM, Cesar Berho <ce...@apache.org> wrote:

> [...]

Re: Queries related to model building and ODM integration

Posted by Cesar Berho <ce...@apache.org>.
Hello Anil,

I can respond to bullet 2. As of today, ODM still requires some
foundational work. SPOT currently has 3 use cases we know well; however, in
order to bring others into the equation, the whole concept of the storage
layer and Data Lake has to be well defined. This is something that some of
the Committers who are experts in DB and Ingest are working on. The Ingest
redesign has to be concluded to support streaming for, I would say, the
majority of use cases, and also batch processing for others. Second is the
normalization of data sources: how you plan to start getting key fields
from data sources and then make them available in a standard format.

The whole effort is not trivial at all, and as the Community grows, that is
how we continue getting help to accelerate delivery. So in case you have
expertise in this particular field and would like to contribute, please
share your thoughts. Also, in terms of timing, some of the early stages of
this work will be seen by the end of Q2.

Thanks,
Cesar
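
To make the normalization step mentioned above more concrete, here is a
minimal sketch of pulling key fields from different sources into one
common format. The mapping table, the source-specific field names, and the
common names (`src_ip`, `dst_ip`, `event_time`) are hypothetical and are
not taken from the ODM design or the SPOT code base.

```python
# Hypothetical per-source mappings from raw field names to common names.
# These are illustrative only and do not describe the actual ODM schema.
FIELD_MAPS = {
    "flow": {"sip": "src_ip", "dip": "dst_ip", "tstart": "event_time"},
    "dns": {"ip_src": "src_ip", "ip_dst": "dst_ip", "frame_time": "event_time"},
}


def normalize(source, record):
    """Copy the mapped key fields of a raw record into a common-format dict."""
    mapping = FIELD_MAPS[source]
    out = {"source": source}
    for raw_name, common_name in mapping.items():
        # Fields missing from the raw record are skipped rather than invented.
        if raw_name in record:
            out[common_name] = record[raw_name]
    return out


flow_event = normalize("flow", {"sip": "10.0.0.1", "dip": "10.0.0.2", "sport": 5000})
dns_event = normalize("dns", {"ip_src": "10.0.0.1", "qry_name": "example.org"})
```

With a table like this, adding a new data source would mean adding one
more mapping entry, while anything downstream only has to read the common
field names.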
