Posted to dev@spot.apache.org by Vartika Singh <vs...@cloudera.com> on 2017/10/05 02:38:08 UTC

Spot Ingestion Design Document

All,

Keeping in mind the scope for growth in this project, we have come up with
this initial design document for ingestion. The components of the ingestion
module have been designed with the following constraints in mind:

   1. Extendible to any source.
   2. No latency between ingestion and output to the HDFS/persistent store.
   3. Maintain the integrity of the ODM module in order to facilitate
   seamless integration with applications.

We are opening this document for review; all comments, feedback, and
suggestions are very welcome and will be helpful.

https://docs.google.com/document/d/1yYaD50gp2HN9RHYaUG8ASDfh3f__GyB1Msu2LUofIU4/edit#heading=h.ptz51qlqt0t


Vartika Singh

Re: Spot Ingestion Design Document

Posted by Vartika Singh <vs...@cloudera.com>.
I believe Google Docs comments should be sufficient, and more convenient
for people to share thoughts.

On Thu, Oct 5, 2017 at 7:19 AM, <ja...@apache.org> wrote:

> Thanks for putting together the design document Vartika!
>
> I’ve shared few thoughts on the document - do you prefer me to send emails
> to the mailing list as instructed on the page 3 or are google comments
> sufficient to start a discussion?
>
> Jarcec


-- 
Vartika Singh
Cloudera

Re: Spot Ingestion Design Document

Posted by ja...@apache.org.
Thanks for putting together the design document, Vartika!

I’ve shared a few thoughts on the document. Would you prefer that I send emails to the mailing list, as instructed on page 3, or are Google comments sufficient to start a discussion?

Jarcec

> On Oct 4, 2017, at 9:54 PM, Vartika Singh <vs...@cloudera.com> wrote:
> 
> Hello Sam,
> 
> The design could easily be extended to a several streaming cases.
> 
> Adding more talent pool for development is always a good idea. As long as
> there is active development and contribution.
> 
> Once the design goes through, and if we can get a few committers (hence
> growing the community) actively developing on the project, it would
> definitely boost the growth and turnaround time overall. We have been a bit
> slow on that.
> 
> Thank you
> Vartika


Re: Spot Ingestion Design Document

Posted by Vartika Singh <vs...@cloudera.com>.
Hello Sam,

The design could easily be extended to several streaming use cases.

Growing the talent pool for development is always a good idea, as long as
there is active development and contribution.

Once the design goes through, and if we can get a few committers (hence
growing the community) actively developing on the project, it would
definitely boost growth and improve turnaround time overall. We have been a
bit slow on that.

Thank you
Vartika

On Wed, Oct 4, 2017 at 9:15 PM, Samuel Heywood <sa...@cloudera.com>
wrote:

> Vartika
>
> I'm reading through the ingest doc. This is very good.
>
> Question for you - this ingest design, it is extensible to use cases
> outside of cyber right? I mean, if we wanted to tap into online e-Commerce
> data sources we could still use the core aspects of your design right?
>
> I'm asking as if that's true, it may make it more possible to unlock
> resources within Cloudera to help develop this stuff
>
> Thank you Vartika,
>
> Sam



-- 
Vartika Singh
Cloudera

Re: Spot Ingestion Design Document

Posted by Samuel Heywood <sa...@cloudera.com>.
Vartika

I'm reading through the ingest doc. This is very good.

A question for you: is this ingest design extensible to use cases outside
of cybersecurity? I mean, if we wanted to tap into online e-commerce data
sources, could we still use the core aspects of your design?

I'm asking because, if that's true, it may make it easier to unlock
resources within Cloudera to help develop this.

Thank you Vartika,

Sam

On Thu, Oct 5, 2017 at 10:38 AM, Vartika Singh <vs...@cloudera.com> wrote:

> All,
>
> Keeping in mind the scope of growth in this project, we have come with this
> initial design document for ingestion. The components of the ingestion
> module have been designed keeping in mind the following constraints:
>
>    1. Extendible to any source.
>    2. No latency between ingestion and output to the HDFS/persistent store.
>    3. Maintain the integrity of the ODM module in order to facilitate
>    seamless integration with applications.
>
> We are opening this document for review and all comments, feedbacks and
> suggestions will be very welcome and will be helpful.
>
> https://docs.google.com/document/d/1yYaD50gp2HN9RHYaUG8ASDfh3f__GyB1Msu2LUofIU4/edit#heading=h.ptz51qlqt0t
>
>
> Vartika Singh
>



-- 
Sam Heywood
Director Cybersecurity Strategy, Cloudera
sam.heywood@cloudera.com <sa...@gazzang.com>
M: (512) 716-9660

Re: Spot Ingestion Design Document

Posted by Vartika Singh <vs...@cloudera.com>.
Hello Carlos,

I believe you might be trying to access the older document, which is within
Cloudera's G Suite.

I have made the new document public, and anybody with the link can comment
on it, including in suggestion mode.

https://docs.google.com/document/d/1yYaD50gp2HN9RHYaUG8ASDfh3f__GyB1Msu2LUofIU4/edit#

Let me know if you have trouble accessing the above link.

Regards
Vartika

On Thu, Oct 5, 2017 at 7:50 AM, solrac901@gmail.com <so...@gmail.com>
wrote:

> I don't have access:( anyway, i was thinking on add some details on
> forensics and malware stuff on the ODM side.
> Regards.



-- 
Vartika Singh
Cloudera

Re: Spot Ingestion Design Document

Posted by "solrac901@gmail.com" <so...@gmail.com>.
I don't have access :( Anyway, I was thinking of adding some details on
forensics and malware on the ODM side.
Regards.

2017-10-04 21:38 GMT-05:00 Vartika Singh <vs...@cloudera.com>:

> All,
>
> Keeping in mind the scope of growth in this project, we have come with this
> initial design document for ingestion. The components of the ingestion
> module have been designed keeping in mind the following constraints:
>
>    1. Extendible to any source.
>    2. No latency between ingestion and output to the HDFS/persistent store.
>    3. Maintain the integrity of the ODM module in order to facilitate
>    seamless integration with applications.
>
> We are opening this document for review and all comments, feedbacks and
> suggestions will be very welcome and will be helpful.
>
> https://docs.google.com/document/d/1yYaD50gp2HN9RHYaUG8ASDfh3f__GyB1Msu2LUofIU4/edit#heading=h.ptz51qlqt0t
>
>
> Vartika Singh
>