Posted to dev@airflow.apache.org by Gerard Toonstra <gt...@gmail.com> on 2017/11/27 19:38:23 UTC

Data lineage and data portal

Hi all,

Something that really drew my attention recently is the "data portal"
described by a team from Airbnb sometime in May. The idea is basically a
"Facebook of data":


https://medium.com/airbnb-engineering/democratizing-data-at-airbnb-852d76c51770


Unfortunately, it looks like it's not going to be open-sourced because of how
heavily integrated it is with their specific infrastructure. Still, the idea
itself sounds to me like something every organization of a certain
size should have to keep track of its data and stay informed as an organization.

Based on the descriptions, I prototyped a few things and am happy with
the results and with how quickly something like this can be built. I'm
now working on SQL scanners, extractors and other tools that let me
populate the database and put together a proof of concept (POC) on some real data.

If other people have similar concerns in their organization and think this
would be a great thing to have, reply to me or to the list. With sufficient
interest I may set up a web chat/meeting session so we can discuss this in
more detail and find ways to move it forward.


Best regards,

Gerard

Re: Data lineage and data portal

Posted by Maxime Beauchemin <ma...@gmail.com>.
+1, I miss the data portal!

Max

On Mon, Nov 27, 2017 at 5:33 PM, Ruslan Dautkhanov <da...@gmail.com>
wrote:

> +1
>
> Thank you

Re: Data lineage and data portal

Posted by Ruslan Dautkhanov <da...@gmail.com>.
+1

Thank you



Re: Data lineage and data portal

Posted by Gurer Kiratli <gu...@airbnb.com.INVALID>.
If there are particular questions about the Data Portal, I would be happy
to get a list of these and work on looping in the Data Portal folks from
Airbnb.

Cheers,
Gurer

On Mon, Nov 27, 2017 at 2:41 PM, Megan Kearl <me...@gmail.com> wrote:

> I'm interested too

Re: Data lineage and data portal

Posted by Megan Kearl <me...@gmail.com>.
I'm interested too

On Nov 27, 2017 3:26 PM, "Bolke de Bruin" <bd...@gmail.com> wrote:

> Of course :-)
>
> Absolutely!
>
> Sent from my iPhone

Re: Data lineage and data portal

Posted by Bolke de Bruin <bd...@gmail.com>.
Of course :-)

Absolutely!

Sent from my iPhone

> On 27 Nov 2017, at 21:23, Chris Riccomini <cr...@apache.org> wrote:
> 
> Interested

Re: Data lineage and data portal

Posted by Chris Riccomini <cr...@apache.org>.
Interested

On Mon, Nov 27, 2017 at 12:07 PM, Kerr Shireman <ha...@gmail.com> wrote:

> I am interested.  I remember being pretty excited when I read that blog
> post.

Re: Data lineage and data portal

Posted by Kerr Shireman <ha...@gmail.com>.
I am interested.  I remember being pretty excited when I read that blog
post.
On Mon, Nov 27, 2017 at 2:00 PM Arthur Wiedmer <ar...@gmail.com>
wrote:

> Likewise!
>
> Best,
> Arthur

Re: Data lineage and data portal

Posted by Arthur Wiedmer <ar...@gmail.com>.
Likewise!

Best,
Arthur

On Mon, Nov 27, 2017 at 11:57 AM, Alison Stanton <
astanton@bankofknowledge.net> wrote:

> I'd like to be kept informed.
>
> Alison Stanton

Re: Data lineage and data portal

Posted by Alison Stanton <as...@bankofknowledge.net>.
I'd like to be kept informed.

Alison Stanton

On Mon, Nov 27, 2017 at 1:53 PM, Laura Lorenz <ll...@industrydive.com>
wrote:

> We're definitely looking for something like that here, so I would like to
> jump in on this discussion.
>
> Laura

Re: Data lineage and data portal

Posted by Laura Lorenz <ll...@industrydive.com>.
We're definitely looking for something like that here, so I would like to
jump in on this discussion.

Laura


Re: Data lineage and data portal

Posted by Gerard Toonstra <gt...@gmail.com>.
Hi all,

We held a short meeting to explore cooperation a bit and gather ideas about what
people expect.
You can view the short meeting minutes here:

https://docs.google.com/document/d/16yyS07a_i7qpac4B5Xqec_S6j7kVwHcjUaPTdcnLih0/edit?usp=sharing

The mp3 audio:
https://drive.google.com/open?id=1kCmd8I5X8L4T0hLkN-9bIH3FCx1NKIIK


I set up a Google group to discuss the data portal further and kick things off
from there:

https://groups.google.com/forum/#!forum/dataportalkickoff


You can request to join that group; I'll accept everyone, and then we can
start writing down ideas, references,
things you have looked at or tried, etc.

See you there!

Gerard


On Sun, Dec 3, 2017 at 11:40 AM, Sam Elamin <hu...@gmail.com> wrote:

> I'm def in.
>
> Thanks for organising Gerard!
>
> On Sun, 3 Dec 2017 at 07:42, Gerard Toonstra <gt...@gmail.com> wrote:
>
> > Good morning,
> >
> > The meeting has been scheduled for wednesday 6th december:
> >
> > London 17:00:00 UTC, Amsterdam 18:00:00 UTC+1, San Francisco 09:00:00 PST,
> > New York 12:00:00 EST
> >
> > --------
> >
> > Gerard Toonstra is inviting you to a scheduled Zoom meeting.
> >
> > Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/789615759
> >
> >
> > Or iPhone one-tap :
> >     US: +16699006833,,789615759# or +16468769923,,789615759#
> > Or Telephone:
> >     Dial(for higher quality, dial a number based on your current location):
> >         US: +1 669 900 6833 or +1 646 876 9923
> >     Meeting ID: 789 615 759
> >     International numbers available:
> > https://zoom.us/zoomconference?m=5eYALpFJGSSZWrP577vUpbK_0dP7WPfp
> >
> >
> > -------
> >
> > Best regards,
> >
> > Gerard
> >
> >
> > On Thu, Nov 30, 2017 at 9:05 PM, Gerard Toonstra <gt...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > Nice overwhelming response! :)
> > >
> > > I'd like to host a meeting on zoom to discuss. I'm on the free plan there,
> > > so we are limited to 40 minutes! :)  (msg me priv if you have better
> > > hosting options).
> > > Here's a way to let me know your availability without polluting the
> > > mailing list:
> > >
> > > https://docs.google.com/forms/d/e/1FAIpQLSfsvfM6mUVwD_qV9EXdUB1aLcdKt55toYKRp3JDanonoFRSLQ/viewform?usp=sf_link
> > >
> > > I'll record the meeting and share the video for those who cannot attend.
> > >
> > > Agenda:
> > > - What specifically is so useful about the data portal?
> > > - ( If anyone from airbnb attends, they get the floor to give us more
> > > context and info )
> > > - Is there sufficient interest and value creation through a collaboration?
> > > - How would a data portal complement airflow and vice versa?
> > > - How do we progress this further and which actions can I take here?
> > >
> > > You can prepare for the meeting by reading the blog post or watching the
> > > video (which has even more detail):
> > >
> > > https://medium.com/airbnb-engineering/democratizing-data-at-airbnb-852d76c51770
> > >
> > > https://www.youtube.com/watch?v=gayXC2FDSiA&t=4s
> > >
> > > Looking forward to your availability responses.
> > >
> > > Best Regards,
> > >
> > > Gerard
> > >
> > >
> > > On Wed, Nov 29, 2017 at 7:31 PM, Nathan McIntyre <nwmcintyre@gmail.com>
> > > wrote:
> > >
> > >> +1
> > >>
> > >> > On Nov 29, 2017, at 09:48, Alek Storm <al...@gmail.com> wrote:
> > >> >
> > >> > +1
> > >> >
> > >> > On Wed, Nov 29, 2017 at 9:50 AM, Igors Vaitkus <litdeviant@protonmail.com>
> > >> > wrote:
> > >> >
> > >> >> +1
> > >> >>
> > >> >> On 11/29/17, 3:48 PM, "Kate-Laurel Agnew" <ka...@signal.co> wrote:
> > >> >>
> > >> >>    +1
> > >> >>
> > >> >>    On Wed, Nov 29, 2017 at 12:09 AM, Koen Mevissen <kmevissen@travix.com>
> > >> >>    wrote:
> > >> >>
> > >> >>> +1
> > >> >>>
> > >> >>> I'm interested as well!
> > >> >>>
> > >> >>> On Tue, 28 Nov 2017 at 14:04, Marc Bollinger <marc@lumoslabs.com> wrote:
> > >> >>>
> > >> >>>> +1
> > >> >>>>
> > >> >>>> On Mon, Nov 27, 2017 at 6:18 PM, Ruslan Dautkhanov <dautkhanov@gmail.com>
> > >> >>>> wrote:
> > >> >>>>
> > >> >>>>> ‘’’
> > >> >>>>> I'm
> > >> >>>>> now working on sql scanners, extractors and other tools that allow me to
> > >> >>>>> populate the database
> > >> >>>>> ‘’’
> > >> >>>>>
> > >> >>>>> Very cool. Cloudera Navigator (not an open source product) does this too
> > >> >>>>> to some extent: it collects metadata and creates data lineage automatically
> > >> >>>>> (stored as a Solr collection) by parsing SQL queries.
> > >> >>>>>
> > >> >>>>> https://www.cloudera.com/documentation/enterprise/5-12-x/topics/datamgmt_extraction_indexing.html
> > >> >>> --
> > >> >>> Kind regards,
> > >> >>> Met vriendelijke groet,
> > >> >>>
> > >> >>> *Koen Mevissen*
> > >> >>> Principal BI Developer
> > >> >>>
> > >> >>>
> > >> >>> *Travix Nederland B.V.*
> > >> >>> Piet Heinkade 55
> > >> >>> 1019 GM Amsterdam
> > >> >>> The Netherlands
> > >> >>>
> > >> >>> T. +31 (0)20 203 3241
> > >> >>> E: KMevissen@travix.com
> > >> >>> www.travix.com
> > >> >>>
> > >> >>> *Brands: * CheapTickets  |  Vliegwinkel  |  Vayama  |  BudgetAir  |  Flugladen
> > >> >>>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >>
> > >
> > >
> >
>

Re: Data lineage and data portal

Posted by Sam Elamin <hu...@gmail.com>.
I'm def in.

Thanks for organising, Gerard!

On Sun, 3 Dec 2017 at 07:42, Gerard Toonstra <gt...@gmail.com> wrote:

> Good morning,
>
> The meeting has been scheduled for wednesday 6th december:
>
> London 17:00:00 UTC, Amsterdam 18:00:00 UTC+1, San Francisco 09:00:00 PST,
> New York 12:00:00 EST
>
> --------
>
> Gerard Toonstra is inviting you to a scheduled Zoom meeting.
>
> Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/789615759
>
>
> Or iPhone one-tap :
>     US: +16699006833,,789615759# or +16468769923,,789615759#
> Or Telephone:
>     Dial(for higher quality, dial a number based on your current location):
>         US: +1 669 900 6833 or +1 646 876 9923
>     Meeting ID: 789 615 759
>     International numbers available:
> https://zoom.us/zoomconference?m=5eYALpFJGSSZWrP577vUpbK_0dP7WPfp
> <
> https://www.google.com/url?q=https%3A%2F%2Fzoom.us%2Fzoomconference%3Fm%3D5eYALpFJGSSZWrP577vUpbK_0dP7WPfp&sa=D&usd=2&usg=AFQjCNE41pjZbgmI9NNAqOi_6ah3BufM1A
> >
>
>
> -------
>
> Best regards,
>
> Gerard
>
>
> On Thu, Nov 30, 2017 at 9:05 PM, Gerard Toonstra <gt...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > Nice overwhelming response! :)
> >
> > I'd like to host a meeting on zoom to discuss. I'm on the free plan
> there,
> > so we are limited to 40 minutes! :)  (msg me priv if you have better
> > hosting options).
> > Here's a way to let me know your availability without polluting the
> > mailing list:
> >
> > https://docs.google.com/forms/d/e/1FAIpQLSfsvfM6mUVwD_
> > qV9EXdUB1aLcdKt55toYKRp3JDanonoFRSLQ/viewform?usp=sf_link
> >
> > I'll record the meeting and share the video for those who cannot attend.
> >
> > Agenda:
> > - What specifically is so useful about the data portal?
> > - ( If anyone from airbnb attends, they get the floor to give us more
> > context and info )
> > - Is there sufficient interest and value creation through a
> collaboration?
> > - How would a data portal complement airflow and vice versa?
> > - How do we progress this further and which actions can I take here?
> >
> > You can prepare for the meeting by reading the blog post or watching the
> > video (which has even more detail):
> >
> > https://medium.com/airbnb-engineering/democratizing-data-at-
> > airbnb-852d76c51770
> >
> > https://www.youtube.com/watch?v=gayXC2FDSiA&t=4s
> >
> > Looking forward to your availability responses.
> >
> > Best Regards,
> >
> > Gerard
> >
> >
> > On Wed, Nov 29, 2017 at 7:31 PM, Nathan McIntyre <nw...@gmail.com>
> > wrote:
> >
> >> +1
> >>
> >> > On Nov 29, 2017, at 09:48, Alek Storm <al...@gmail.com> wrote:
> >> >
> >> > +1
> >> >
> >> > On Wed, Nov 29, 2017 at 9:50 AM, Igors Vaitkus <
> >> litdeviant@protonmail.com>
> >> > wrote:
> >> >
> >> >> +1
> >> >>
> >> >> On 11/29/17, 3:48 PM, "Kate-Laurel Agnew" <ka...@signal.co> wrote:
> >> >>
> >> >>    +1
> >> >>
> >> >>    On Wed, Nov 29, 2017 at 12:09 AM, Koen Mevissen <
> >> kmevissen@travix.com>
> >> >>    wrote:
> >> >>
> >> >>> +1
> >> >>>
> >> >>> I'm interested as well!
> >> >>>
> >> >>>
> >> >>>
> >> >>> Op di 28 nov. 2017 om 14:04 schreef Marc Bollinger <
> >> >> marc@lumoslabs.com>
> >> >>>
> >> >>>> +1
> >> >>>>
> >> >>>> On Mon, Nov 27, 2017 at 6:18 PM, Ruslan Dautkhanov <
> >> >> dautkhanov@gmail.com
> >> >>>>
> >> >>>> wrote:
> >> >>>>
> >> >>>>> ‘’’
> >> >>>>> I'm
> >> >>>>> now working on sql scanners, extractors and other tools that
> >> >> allow me
> >> >>> to
> >> >>>>> populate the database
> >> >>>>> ‘’’
> >> >>>>>
> >> >>>>> Very cool. Cloudera Navigator ( not an open source product) does
> >> >> this
> >> >>> too
> >> >>>>> to some extent - collect metadata and create data lineage
> >> >>> automatically (
> >> >>>>> stored as a Solr collection) by parsing sql queries.
> >> >>>>>
> >> >>>>> https://www.cloudera.com/documentation/enterprise/5-12-
> >> >>>>> x/topics/datamgmt_extraction_indexing.html
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> On Mon, Nov 27, 2017 at 12:38 PM Gerard Toonstra <
> >> >> gtoonstra@gmail.com>
> >> >>>>> wrote:
> >> >>>>>
> >> >>>>>> Hi all,
> >> >>>>>>
> >> >>>>>> So something that really drew my attention recently is a "data
> >> >>> portal"
> >> >>>>> as
> >> >>>>>> described by a team from airbnb somewhere in May. The idea is
> >> >>>> basically a
> >> >>>>>> "facebook of data":
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> https://medium.com/airbnb-engineering/democratizing-
> >> >>>>> data-at-airbnb-852d76c51770
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> Unfortunately it looks like it's not going to be opensourced
> >> >> due to
> >> >>> how
> >> >>>>>> heavily integrated it is with their specific infrastructure;
> >> >> but the
> >> >>>> idea
> >> >>>>>> itself to me sounds like it's something every organization of a
> >> >>> certain
> >> >>>>>> size should have to keep track of data and stay informed as an
> >> >>>>>> organization.
> >> >>>>>>
> >> >>>>>> Based on the descriptions, I prototyped some things away and
> >> >> am happy
> >> >>>>> with
> >> >>>>>> the results and the speed that something like this can be
> >> >>> constructed.
> >> >>>>> I'm
> >> >>>>>> now working on sql scanners, extractors and other tools that
> >> >> allow me
> >> >>>> to
> >> >>>>>> populate the database and put a poc together on some real data.
> >> >>>>>>
> >> >>>>>> If other people have similar concerns in their organization
> >> >> and think
> >> >>>>> this
> >> >>>>>> would be a great thing to have, reply to me or the list; with
> >> >>>> sufficient
> >> >>>>>> interest I may set up a web chat/meet session so this can be
> >> >>> discussed
> >> >>>> in
> >> >>>>>> more detail and find ways to progress this.
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> Best regards,
> >> >>>>>>
> >> >>>>>> Gerard
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>> --
> >> >>> Kind regards,
> >> >>> Met vriendelijke groet,
> >> >>>
> >> >>> *Koen Mevissen*
> >> >>> Principal BI Developer
> >> >>>
> >> >>>
> >> >>> *Travix Nederland B.V.*
> >> >>> Piet Heinkade 55
> >> >>> 1019 GM Amsterdam
> >> >>> The Netherlands
> >> >>>
> >> >>> T. +31 (0)20 203 3241
> >> >>> E: KMevissen@travix.com
> >> >>> www.travix.com
> >> >>>
> >> >>> *Brands: * CheapTickets  |  Vliegwinkel  |  Vayama  |  BudgetAir  |
> >> >>> Flugladen
> >> >>>
> >> >>
> >> >>
> >> >>
> >> >>    --
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>    *Kate-Laurel AgnewData Engineerm: 503-741-9207
> >> >>    <503%20741%209207>e: kagnew@signal.co
> >> >>    <https://mail.google.com/mail/?view=cm&fs=1&tf=1&to=kagnew@s
> >> ignal.co>
> >> >> signal.co
> >> >>    <http://www.google.com/url?q=http%3A%2F%2Fsignal.co%2F&sa=
> >> >> D&sntz=1&usg=AFQjCNGG8nAqplB1u72dXPbYfPYRcfopNQ>____________
> >> ____________
> >> >> Cut
> >> >>    Through the NoiseThis e-mail and any files transmitted with it are
> >> for
> >> >> the
> >> >>    sole use of the intended recipient(s) and may contain confidential
> >> and
> >> >>    privileged information. Any unauthorized use of this email is
> >> strictly
> >> >>    prohibited. ©2015 Signal. All rights reserved.*
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >>
> >
> >
>

Re: Data lineage and data portal

Posted by Gerard Toonstra <gt...@gmail.com>.
Good morning,

The meeting has been scheduled for Wednesday, 6 December:

London 17:00:00 UTC, Amsterdam 18:00:00 UTC+1, San Francisco 09:00:00 PST,
New York 12:00:00 EST

--------

Gerard Toonstra is inviting you to a scheduled Zoom meeting.

Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/789615759


Or iPhone one-tap:
    US: +16699006833,,789615759# or +16468769923,,789615759#
Or Telephone:
    Dial (for higher quality, dial a number based on your current location):
        US: +1 669 900 6833 or +1 646 876 9923
    Meeting ID: 789 615 759
    International numbers available:
https://zoom.us/zoomconference?m=5eYALpFJGSSZWrP577vUpbK_0dP7WPfp


-------

Best regards,

Gerard



Re: Data lineage and data portal

Posted by Gerard Toonstra <gt...@gmail.com>.
Hi all,

What an overwhelming response! :)

I'd like to host a meeting on Zoom to discuss this. I'm on the free plan there,
so we are limited to 40 minutes! :)  (Message me privately if you have better
hosting options.)
Here's a way to let me know your availability without polluting the mailing
list:

https://docs.google.com/forms/d/e/1FAIpQLSfsvfM6mUVwD_qV9EXdUB1aLcdKt55toYKRp3JDanonoFRSLQ/viewform?usp=sf_link

I'll record the meeting and share the video for those who cannot attend.

Agenda:
- What specifically is so useful about the data portal?
- (If anyone from Airbnb attends, they get the floor to give us more
context and info.)
- Is there sufficient interest in a collaboration, and would it create value?
- How would a data portal complement Airflow and vice versa?
- How do we progress this further and which actions can I take here?

You can prepare for the meeting by reading the blog post or watching the
video (which has even more detail):

https://medium.com/airbnb-engineering/democratizing-data-at-airbnb-852d76c51770

https://www.youtube.com/watch?v=gayXC2FDSiA&t=4s

Looking forward to your availability responses.

Best Regards,

Gerard



Re: Data lineage and data portal

Posted by Nathan McIntyre <nw...@gmail.com>.
+1


Re: Data lineage and data portal

Posted by Alek Storm <al...@gmail.com>.
+1


Re: Data lineage and data portal

Posted by Igors Vaitkus <li...@protonmail.com>.
+1


Re: Data lineage and data portal

Posted by Kate-Laurel Agnew <ka...@signal.co>.
+1


-- 




Kate-Laurel Agnew
Data Engineer
m: 503-741-9207
e: kagnew@signal.co
signal.co

Re: Data lineage and data portal

Posted by Koen Mevissen <km...@travix.com>.
+1

I'm interested as well!



-- 
Kind regards,

*Koen Mevissen*
Principal BI Developer


*Travix Nederland B.V.*
Piet Heinkade 55
1019 GM Amsterdam
The Netherlands

T. +31 (0)20 203 3241
E: KMevissen@travix.com
www.travix.com

*Brands: * CheapTickets  |  Vliegwinkel  |  Vayama  |  BudgetAir  |
 Flugladen

Re: Data lineage and data portal

Posted by Marc Bollinger <ma...@lumoslabs.com>.
+1


Re: Data lineage and data portal

Posted by Ruslan Dautkhanov <da...@gmail.com>.
> I'm now working on sql scanners, extractors and other tools that allow me to
> populate the database

Very cool. Cloudera Navigator (not an open source product) does this too,
to some extent: it collects metadata and creates data lineage automatically
(stored as a Solr collection) by parsing SQL queries.

https://www.cloudera.com/documentation/enterprise/5-12-x/topics/datamgmt_extraction_indexing.html
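
To make the idea of "lineage by parsing SQL" concrete, here is a minimal
sketch in Python. It is only an illustration and not how Navigator or the
prototype discussed in this thread works: the function name, the regular
expressions and the table names are all made up for the example, and it only
handles plain INSERT INTO ... SELECT statements. Real SQL would need a proper
parser.

```python
import re

# Hypothetical, deliberately simplified patterns (assumptions for this sketch).
# They ignore subqueries, CTEs, comments and quoted identifiers.
TARGET_RE = re.compile(r"\binsert\s+into\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def extract_lineage(sql):
    """Return (target_table, sorted source tables), or None if no INSERT INTO."""
    target = TARGET_RE.search(sql)
    if target is None:
        return None
    target_name = target.group(1).lower()
    # Every table that follows FROM or JOIN is treated as an upstream source.
    sources = {m.group(1).lower() for m in SOURCE_RE.finditer(sql)}
    sources.discard(target_name)
    return target_name, sorted(sources)


# Made-up example query.
query = """
INSERT INTO reporting.daily_bookings
SELECT b.day, count(*)
FROM raw.bookings b
JOIN raw.listings l ON l.id = b.listing_id
GROUP BY b.day
"""

print(extract_lineage(query))
```

Each (target, sources) pair is one edge set in a lineage graph; storing those
pairs per query would be enough to answer "which tables feed this one?" for
simple cases.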



On Mon, Nov 27, 2017 at 12:38 PM Gerard Toonstra <gt...@gmail.com>
wrote:

> Hi all,
>
> So something that really drew my attention recently is a "data portal"  as
> described by a team from airbnb somewhere in May. The idea is basically a
> "facebook of data":
>
>
>
> https://medium.com/airbnb-engineering/democratizing-data-at-airbnb-852d76c51770
>
>
> Unfortunately it looks like it's not going to be opensourced due to how
> heavily integrated it is with their specific infrastructure; but the idea
> itself to me sounds like it's something every organization of a certain
> size should have to keep track of data and stay informed as an
> organization.
>
> Based on the descriptions, I prototyped some things away and am happy with
> the results and the speed that something like this can be constructed. I'm
> now working on sql scanners, extractors and other tools that allow me to
> populate the database and put a poc together on some real data.
>
> If other people have similar concerns in their organization and think this
> would be a great thing to have, reply to me or the list; with sufficient
> interest I may set up a web chat/meet session so this can be discussed in
> more detail and find ways to progress this.
>
>
> Best regards,
>
> Gerard
>