Posted to user@drill.apache.org by John Omernik <jo...@omernik.com> on 2017/03/01 17:55:27 UTC

Re: Discussion: Comments in Drill Views

Sorry, I let this idea drop (I didn't follow up, and found it again when
searching for something else...)  Any other thoughts on this idea?

Should I open a JIRA if people think it would be handy?

On Thu, Jun 23, 2016 at 4:02 PM, Ted Dunning <te...@gmail.com> wrote:

> This is very interesting.  I love docstrings in Lisp and Python and Javadoc
> in Java.
>
> Basically this is like that, but for SQL. Very helpful.
>
> On Thu, Jun 23, 2016 at 11:48 AM, John Omernik <jo...@omernik.com> wrote:
>
> > I am looking for discussion here. A colleague was asking me how to add
> > comments to the metadata of a view.  (He's new to Drill, thus the idea of
> > not having metadata for a table is one he's warming up to).
> >
> > That got me thinking... why couldn't we use Drill Views to store
> > table/field comments?  This could be a great way to help add contextual
> > information for users. Here are some current observations when I issue a
> > describe view_myview
> >
> >
> > 1. I get three columns: COLUMN_NAME, DATA_TYPE, and IS_NULLABLE
> > 2. Even though the underlying parquet table has types, the view does not
> > pass the types for the underlying parquet files through.  (The type is
> > ANY)
> > 3. The data for the view is all just a json file that could be easily
> > extended.
> >
> >
> > So, a few things would be nice to have
> >
> > 1. Table comments.  When I issue a describe table, if the view has a
> > "Description" field, then having that print out as a description for the
> > whole view would be nice.  This is harder, I think, because it's not just
> > extending the view information.
> >
> > 2. Column comments:  A text field that could be added to the view, and
> > just print out another column with description.  This would be very
> > helpful. While Drill being schemaless is awesome, the ability to add
> > information to known data is huge.
> >
> > 3. Ability to use the types from the Parquet files (without manually
> > specifying each type).  If we could provide an option to View creation to
> > attempt to infer type, that would be handy. I realize that folks are
> > using the LIMIT 0 to get metadata, but describe could be done well too.
> >
> > 4. Ability, using ANSI SQL, to update the view column descriptions and the
> > description for the view itself.
> >
> > 5. I believe Avro has the ability to add this information to the files,
> > so if the data exists outside of views (such as in Avro files) should we
> > present it to the user in describe table events as well?
> >
> > Curious if folks think this would be valuable, how much work an addition
> > like this would be to Drill, and other thoughts in general.
> >
> >
> > John
> >
>
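
A minimal sketch of the column-comment idea in the quoted proposal, written in
Python rather than against Drill itself. It assumes a simplified view
definition stored as JSON with "name" and "fields" entries; the "description"
keys and the describe() helper are hypothetical and are not part of any
existing Drill format or API.

```python
import json

# Hypothetical, simplified view definition. The "description" keys are the
# proposed addition and are an assumption, not an existing Drill feature.
VIEW_JSON = """
{
  "name": "view_myview",
  "description": "Cleaned-up web logs, one row per request.",
  "fields": [
    {"name": "ts", "type": "ANY", "isNullable": true,
     "description": "Request time (UTC)."},
    {"name": "status", "type": "ANY", "isNullable": true}
  ]
}
"""


def describe(view_json):
    """Return (view description, rows) where each row carries a fourth
    DESCRIPTION value; missing descriptions become empty strings."""
    view = json.loads(view_json)
    rows = [
        (
            field["name"],
            field["type"],
            "YES" if field["isNullable"] else "NO",
            field.get("description", ""),
        )
        for field in view["fields"]
    ]
    return view.get("description", ""), rows


if __name__ == "__main__":
    table_comment, rows = describe(VIEW_JSON)
    print(table_comment)
    print("COLUMN_NAME | DATA_TYPE | IS_NULLABLE | DESCRIPTION")
    for row in rows:
        print(" | ".join(row))
```

Views written without any description still work here, since a missing key
just yields an empty DESCRIPTION value.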

Re: Discussion: Comments in Drill Views

Posted by John Omernik <jo...@omernik.com>.
I created a JIRA for this based on the Hangout today!

https://issues.apache.org/jira/browse/DRILL-5461




Re: Discussion: Comments in Drill Views

Posted by John Omernik <jo...@omernik.com>.
I can see both sides. But Ted is right, this won't hurt anything from a
performance perspective; even if they put War and Peace in there 30 times,
that's roughly 100 MB of information to serve. People may choose to use
formatting languages like Markup or something. I do think we should have a
limit so we know what happens if someone tries to break that limit (from a
security perspective), but we could set that quite high, and then just test
putting data that exceeds that as a unit test.
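
A rough sketch of the "high limit plus a unit test" idea above, in Python. The
1 MiB cap, the CommentTooLargeError type and validate_comment() are all
hypothetical names chosen for illustration; the thread does not settle on a
value. For scale, 30 copies at about 3 MB each comes to roughly 90 MB.

```python
# Hypothetical cap; the thread only says the limit could be "quite high".
MAX_COMMENT_BYTES = 1024 * 1024  # 1 MiB


class CommentTooLargeError(ValueError):
    """Raised when a view comment exceeds the configured size cap."""


def validate_comment(text, limit=MAX_COMMENT_BYTES):
    """Return the comment unchanged, or raise if its UTF-8 size exceeds limit."""
    size = len(text.encode("utf-8"))
    if size > limit:
        raise CommentTooLargeError(
            f"comment is {size} bytes; limit is {limit} bytes"
        )
    return text


def test_oversized_comment_is_rejected():
    """The unit test described above: one byte over the cap must fail."""
    try:
        validate_comment("x" * (MAX_COMMENT_BYTES + 1))
    except CommentTooLargeError:
        return True
    return False


if __name__ == "__main__":
    validate_comment("short note")  # well under the cap, accepted
    assert test_oversized_comment_is_rejected()
```

The point of the test is less about storage and more about having a known,
checked behaviour when someone pushes past the limit.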




Re: Discussion: Comments in Drill Views

Posted by Ted Dunning <te...@gmail.com>.
All of War and Peace is only 3MB.

Let people document however they want. Don't over-optimize for problems
that have never occurred.




Re: Discussion: Comments in Drill Views

Posted by Kunal Khatua <kk...@mapr.com>.
It might be, in case someone begins to dump a massive design doc into the comment field for a view's JSON.


I'm also not sure about how this information can be consumed. If it is through CLI, either we rely on the SQLLine shell to trim the output, or not worry at all. I'm assuming we'd also probably want something like a

DESCRIBE VIEW ...

to be enhanced to something like

DESCRIBE VIEW WITH COMMENTARY ...


A 1KB field is quite generous IMHO. That's more than 7 tweets to describe something!
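
One way the two DESCRIBE forms above could behave, sketched in Python. The
describe_view() function, its with_commentary flag and the max_width
truncation are hypothetical; they only illustrate keeping the default output
unchanged while trimming long comments for a CLI. (On the tweet comparison:
1024 / 140 is about 7.3.)

```python
def describe_view(fields, with_commentary=False, max_width=40):
    """Build DESCRIBE-style rows.

    fields: list of dicts with "name", "type", "nullable" and an optional
    "comment". Without with_commentary the rows keep the usual three
    columns; with it, a fourth column holds the comment, truncated to
    max_width characters so a terminal table stays readable.
    """
    rows = []
    for field in fields:
        row = [field["name"], field["type"], field["nullable"]]
        if with_commentary:
            comment = field.get("comment", "")
            if len(comment) > max_width:
                comment = comment[: max_width - 3] + "..."
            row.append(comment)
        rows.append(row)
    return rows


if __name__ == "__main__":
    fields = [
        {"name": "ts", "type": "ANY", "nullable": "YES",
         "comment": "Request time in UTC, taken from the load balancer log."},
    ]
    print(describe_view(fields))
    print(describe_view(fields, with_commentary=True))
```

Trimming in the describe path rather than at write time would leave the full
comment stored for other consumers.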


Kunal Khatua


Re: Discussion: Comments in Drill Views

Posted by Ted Dunning <te...@gmail.com>.
Is it really necessary to put a technical limit in to prevent people from
OVER-documenting views?


When was the last time you saw code that had too many comments in it?




Re: Discussion: Comments in Drill Views

Posted by John Omernik <jo...@omernik.com>.
So I think your worry describes an easily definable "abuse" condition...
i.e. if we set a limit of say 1024 characters, that provides ample space
for descriptions, and at 1 KB per view that's an allowable condition, i.e.
it would be hard to abuse it ... or am I missing something?
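
A small Python illustration of one detail in the 1024-character idea: a
character limit and a 1 KB byte limit are only the same for ASCII text. The
fits_limit() and utf8_size() helpers are hypothetical names used for this
sketch, and UTF-8 storage is an assumption.

```python
def fits_limit(text, max_chars=1024):
    """Check a comment against a limit counted in characters."""
    return len(text) <= max_chars


def utf8_size(text):
    """Size of the comment in bytes if it were stored as UTF-8."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    ascii_comment = "a" * 1024
    accented_comment = "\u00e9" * 1024  # e with acute accent, 2 bytes each in UTF-8

    for comment in (ascii_comment, accented_comment):
        print(len(comment), "chars,", utf8_size(comment), "bytes,",
              "fits" if fits_limit(comment) else "too long")
```

Either unit works as a cap; the choice just needs to be stated so the limit
test checks the same thing the storage layer enforces.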

On Wed, Mar 1, 2017 at 8:08 PM, Kunal Khatua <kk...@mapr.com> wrote:

> +1
>
>
> I think this can be very useful. The only worry is of someone abusing it,
> so we probably should have a limit on the size of this? Not sure how else
> it could be exposed and consumed.
>
>
> Kunal Khatua
>
> Engineering
>
> [MapR]<http://www.mapr.com/>
>
> www.mapr.com<http://www.mapr.com/>
>
> ________________________________
> From: John Omernik <jo...@omernik.com>
> Sent: Wednesday, March 1, 2017 9:55:27 AM
> To: user
> Subject: Re: Discussion: Comments in Drill Views
>
> Sorry, I let this idea drop (I didn't follow up and found when searching
> for something else...)  Any other thoughts on this idea?
>
> Should I open a JIRA if people think it would be handy?
>
> On Thu, Jun 23, 2016 at 4:02 PM, Ted Dunning <te...@gmail.com>
> wrote:
>
> > This is very interesting.  I love docstrings in Lisp and Python and
> Javadoc
> > in Java.
> >
> > Basically this is like that, but for SQL. Very helpful.
> >
> > On Thu, Jun 23, 2016 at 11:48 AM, John Omernik <jo...@omernik.com> wrote:
> >
> > > I am looking for discussion here. A colleague was asking me how to add
> > > comments to the metadata of a view.  (He's new to Drill, thus the idea
> > > of not having metadata for a table is one he's warming up to).
> > >
> > > That got me thinking... why couldn't we use Drill Views to store
> > > table/field comments?  This could be a great way to help add contextual
> > > information for users. Here's some current observations when I issue a
> > > describe view_myview
> > >
> > >
> > > 1. I get three columns: COLUMN_NAME, DATA_TYPE, and IS_NULLABLE
> > > 2. Even though the underlying parquet table has types, the view does
> > > not pass the types for the underlying parquet files through.  (The type
> > > is ANY)
> > > 3. The data for the view is all just a json file that could be easily
> > > extended.
> > >
> > >
> > > So, a few things would be nice to have:
> > >
> > > 1. Table comments.  When I issue a describe table, if the view has a
> > > "Description" field, then having that print out as a description for
> > > the whole view would be nice.  This is harder, I think, because it's not
> > > just extending the view information.
> > >
> > > 2. Column comments:  A text field that could be added to the view, and
> > > just print out another column with description.  This would be very
> > > helpful. While Drill being schemaless is awesome, the ability to add
> > > information to known data is huge.
> > >
> > > 3. Ability to use the types from the Parquet files (without manually
> > > specifying each type).  If we could provide an option to View creation
> > > to attempt to infer type, that would be handy. I realize that folks are
> > > using the LIMIT 0 to get metadata, but describe could be done well too.
> > >
> > > 4. Ability, using ANSI SQL, to update the view column descriptions and
> > > the description for the view itself.
> > >
> > > 5. I believe Avro has the ability to add this information to the files,
> > > so if the data exists outside of views (such as in Avro files), should
> > > we present it to the user in describe table events as well?
> > >
> > > Curious if folks think this would be valuable, how much work an
> > > addition like this would be to Drill, and other thoughts in general.
> > >
> > >
> > > John
> > >
> >
>

Re: Discussion: Comments in Drill Views

Posted by Kunal Khatua <kk...@mapr.com>.
+1


I think this can be very useful. The only worry is of someone abusing it, so we probably should have a limit on the size of this? Not sure how else it could be exposed and consumed.


Kunal Khatua

Engineering

www.mapr.com<http://www.mapr.com/>

________________________________
From: John Omernik <jo...@omernik.com>
Sent: Wednesday, March 1, 2017 9:55:27 AM
To: user
Subject: Re: Discussion: Comments in Drill Views

Sorry, I let this idea drop (I didn't follow up and found when searching
for something else...)  Any other thoughts on this idea?

Should I open a JIRA if people think it would be handy?

On Thu, Jun 23, 2016 at 4:02 PM, Ted Dunning <te...@gmail.com> wrote:

> This is very interesting.  I love docstrings in Lisp and Python and Javadoc
> in Java.
>
> Basically this is like that, but for SQL. Very helpful.
>
> On Thu, Jun 23, 2016 at 11:48 AM, John Omernik <jo...@omernik.com> wrote:
>
> > I am looking for discussion here. A colleague was asking me how to add
> > comments to the metadata of a view.  (He's new to Drill, thus the idea of
> > not having metadata for a table is one he's warming up to).
> >
> > That got me thinking... why couldn't we use Drill Views to store
> > table/field comments?  This could be a great way to help add contextual
> > information for users. Here's some current observations when I issue a
> > describe view_myview
> >
> >
> > 1. I get three columns: COLUMN_NAME, DATA_TYPE, and IS_NULLABLE
> > 2. Even though the underlying parquet table has types, the view does not
> > pass the types for the underlying parquet files through.  (The type is
> > ANY)
> > 3. The data for the view is all just a json file that could be easily
> > extended.
> >
> >
> > So, a few things would be nice to have:
> >
> > 1. Table comments.  When I issue a describe table, if the view has a
> > "Description" field, then having that print out as a description for the
> > whole view would be nice.  This is harder, I think, because it's not just
> > extending the view information.
> >
> > 2. Column comments:  A text field that could be added to the view, and
> > just print out another column with description.  This would be very
> > helpful. While Drill being schemaless is awesome, the ability to add
> > information to known data is huge.
> >
> > 3. Ability to use the types from the Parquet files (without manually
> > specifying each type).  If we could provide an option to View creation to
> > attempt to infer type, that would be handy. I realize that folks are
> > using the LIMIT 0 to get metadata, but describe could be done well too.
> >
> > 4. Ability, using ANSI SQL, to update the view column descriptions and the
> > description for the view itself.
> >
> > 5. I believe Avro has the ability to add this information to the files,
> > so if the data exists outside of views (such as in Avro files), should we
> > present it to the user in describe table events as well?
> >
> > Curious if folks think this would be valuable, how much work an addition
> > like this would be to Drill, and other thoughts in general.
> >
> >
> > John
> >
>
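
The column-comment idea in the quoted proposal could be sketched roughly as follows, in Python purely for illustration. This is not Drill code: the view JSON below is a made-up example, every "description" key is an assumed extension that does not exist in Drill, and describe() is a hypothetical helper that only shows how a fourth DESCRIPTION column might sit next to COLUMN_NAME, DATA_TYPE, and IS_NULLABLE.

```python
import json

# Made-up view definition for illustration. The "description" keys are the
# assumed extension discussed in this thread; they are not a Drill feature.
VIEW_JSON = """
{
  "name": "view_myview",
  "description": "Example view-level comment",
  "sql": "SELECT ts, src_ip FROM my_table",
  "fields": [
    {"name": "ts", "type": "ANY", "isNullable": true,
     "description": "Event time, UTC"},
    {"name": "src_ip", "type": "ANY", "isNullable": true}
  ]
}
"""


def describe(view):
    """Build describe-style rows with an extra DESCRIPTION column."""
    rows = []
    for field in view["fields"]:
        rows.append((
            field["name"],                          # COLUMN_NAME
            field["type"],                          # DATA_TYPE
            "YES" if field["isNullable"] else "NO",  # IS_NULLABLE
            field.get("description", ""),           # empty if no comment set
        ))
    return rows


view = json.loads(VIEW_JSON)
print(view.get("description", ""))  # view-level comment, if any
for row in describe(view):
    print(row)
```

Treating a missing description as an empty string keeps existing view files readable without changes, which seems to be the appeal of extending the JSON rather than adding a separate metadata store.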