Posted to dev@zeppelin.apache.org by Peter McCaffrey <pm...@gmail.com> on 2016/12/12 22:00:54 UTC

Data Source Metadata Your Thoughts

Hey everyone,
     I wanted to send this out to gather some opinions before I submit any
PRs on this topic. I've been using Zeppelin for about 7 months now, and it
has gained rapid adoption amongst my team. Unfortunately, one sticking
point for our team is its lack of useful data source exploration
tools (if there is such functionality and I just missed it, please let me
know!).

I made a simple change to the JDBC interpreter to add an "explore" feature
as shown in this video I recorded
(https://s3.amazonaws.com/screenshots-mockups/embedvid.html).

I understand that JDBC data source metadata can be gathered using SQL, but
this works exclusively through the JDBC driver API and seems like a simple
and clean way to get tables and views while working on a query.
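
For illustration only, here is a minimal sketch of what a driver-only lookup
of tables and views can look like with the standard java.sql.DatabaseMetaData
API. It is not the code from the proof of concept described in this message;
the class name MetadataExplorer and the method listTablesAndViews are
hypothetical, and the output format is an arbitrary choice.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Zeppelin.
final class MetadataExplorer {

    // Lists tables and views using only the JDBC DatabaseMetaData API,
    // without issuing any vendor-specific catalog SQL.
    static List<String> listTablesAndViews(DatabaseMetaData meta, String schemaPattern)
            throws SQLException {
        List<String> names = new ArrayList<>();
        String[] types = {"TABLE", "VIEW"};
        // catalog = null: do not filter by catalog; "%" matches every table name.
        try (ResultSet rs = meta.getTables(null, schemaPattern, "%", types)) {
            while (rs.next()) {
                String schema = rs.getString("TABLE_SCHEM"); // may be null with some drivers
                String name = rs.getString("TABLE_NAME");
                String type = rs.getString("TABLE_TYPE");
                names.add(type + " " + (schema == null ? name : schema + "." + name));
            }
        }
        return names;
    }
}
```

Given an open java.sql.Connection, the helper would be called as
MetadataExplorer.listTablesAndViews(connection.getMetaData(), null), where a
null schema pattern means "do not filter by schema". How schemas and catalogs
are reported differs between drivers, so results from PostgreSQL and MySQL
will not necessarily look alike.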

I've tested this with PostgreSQL and MySQL and I just wanted to see if the
community thinks this is valuable. I see in the source code something called
the "sqlCompleter" which appears to work with metadata, but I'm not sure if
or how it pulls data source tables and views, so please let me know if I'm
missing something big. If this feature seems useful, then once PR-1744
(https://github.com/apache/zeppelin/pull/1744) is merged I'll open a PR for
this.

Sincerely,
-Peter

Re: Data Source Metadata Your Thoughts

Posted by Peter McCaffrey <pm...@gmail.com>.

Hey Alex and Hyung,
     Thanks so much for getting back to me so quickly! I'm glad you also
think this would be a good feature. Alex, I agree 200% about this becoming
more robust as part of the workflow. I had envisioned that, ultimately,
there may be some generalized metadata feature that each interpreter
implements specifically for its respective backend tool. This speaks to your
concern about making the feature consistent, and I think you're right on the
money that, in order to accomplish that, it would likely have to fit
somewhere else in the UI, reflecting the fact that it's an additional
part of the workflow apart from paragraph execution.

For the time being, though, I agree it can be a small documented feature of
the JDBC interpreter (and a bit of an odd one at that, considering that it
operates on a special keyword), but hopefully that's just temporary and it
will grow from there! I'm also hoping to spend more time on this concept.

Thanks again!

Sincerely,
-Peter

On Mon, Dec 12, 2016 at 10:52 PM, Alexander Bezzubov <bz...@apache.org> wrote:

> Datasource schema exploration is indeed very useful and practical.
>
> And thank you very much, Peter, for sharing a PoC and a very nice video as
> part of a new feature discussion - this really helps!
>
> The only minor concern is: how do we make the user experience of such a
> feature consistent across the interpreters? What is the best way to fit it
> into the "mental framework" of the notebook user?
>
> As this is outside of the grammar of SQL and does not start with % as
> paragraph meta-information about the interpreter, how would one discover
> this feature in a way that might be extensible enough to be supported by
> other interpreters?
>
> At the same time, it's great not to over-generalise at first and to keep it
> simple to make sure it's useful. Something as simple as having it in the
> docs for the SQL interpreter might be a good start.
>
> And then enhance it up to the point where it's general enough to become a
> major part of the "mental framework", on par with the Note/paragraph
> concept, to cover all interpreters and potentially even fit into the GUI.
>
> What do you guys think?
>
> --
> Alex

Re: Data Source Metadata Your Thoughts

Posted by Alexander Bezzubov <bz...@apache.org>.

Datasource schema exploration is indeed very useful and practical.

And thank you very much, Peter, for sharing a PoC and a very nice video as
part of a new feature discussion - this really helps!

The only minor concern is: how do we make the user experience of such a
feature consistent across the interpreters? What is the best way to fit it
into the "mental framework" of the notebook user?

As this is outside of the grammar of SQL and does not start with % as
paragraph meta-information about the interpreter, how would one discover
this feature in a way that might be extensible enough to be supported by
other interpreters?

At the same time, it's great not to over-generalise at first and to keep it
simple to make sure it's useful. Something as simple as having it in the
docs for the SQL interpreter might be a good start.

And then enhance it up to the point where it's general enough to become a
major part of the "mental framework", on par with the Note/paragraph
concept, to cover all interpreters and potentially even fit into the GUI.

What do you guys think?

--
Alex


Re: Data Source Metadata Your Thoughts

Posted by Hyung Sung Shim <hs...@nflabs.com>.

Hello Peter.
Thank you for suggesting this great feature.
It would be a really useful feature for Zeppelin users!

