Posted to user@drill.apache.org by Venki Korukanti <ve...@gmail.com> on 2015/12/15 23:52:05 UTC

[ANNOUNCE] Release of Apache Drill 1.4.0

On behalf of the Apache Drill community, I am happy to announce the release
of Apache Drill 1.4.0.

This release of Drill fixes many issues and introduces a number of
enhancements, including the following:

- Partition pruning improvements that reduce planning time (DRILL-3765
<https://issues.apache.org/jira/browse/DRILL-3765>).
- Select with options. More about this feature here [1].
- ValueVector-related code has been extracted from the 'exec/java-exec'
module into a separate module, 'exec/vector'. See [2]. Nothing changes for
Drill end users, but developers can now use Drill's in-memory columnar
representation in their own projects.

The source and binary artifacts are available at [3].
Review the complete list of fixes and enhancements at [4].

Thanks to everyone in the community who contributed to this release.

[1] https://drill.apache.org/docs/plugin-configuration-basics/#using-the-formats-attributes-as-table-function-parameters
[2] https://issues.apache.org/jira/browse/DRILL-3987
[3] http://drill.apache.org/download/
[4] http://drill.apache.org/docs/apache-drill-1-4-0-release-notes/

Thanks
Venki

Re: [ANNOUNCE] Release of Apache Drill 1.4.0

Posted by John Omernik <jo...@omernik.com>.
I think it's OK to set them at a default level (system) or apply them to
all tables in a session (session), but there needs to be a precedence rule:
the setting closest to the data wins.  As an administrator, I may need to
implement a system-wide standard for how data is read that is
different from the default that Drill sets. (Consider the default being
that, for example, Drill infers ints as ints and doubles as doubles when
reading JSON.)  That's the default; maybe I, as an administrator, want Drill
to act differently by default. That's systems administration, and a good use
of ALTER SYSTEM.  But if a user sets it for their session, that should take
precedence. If it's set in a view or query, that would take precedence.
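
For reference, the system and session scopes described here correspond to
Drill's existing ALTER SYSTEM and ALTER SESSION statements. A minimal sketch,
using the store.json.read_numbers_as_double option named in this thread; the
values chosen are illustrative only:

```sql
-- Administrator: change the cluster-wide value of the option.
ALTER SYSTEM SET `store.json.read_numbers_as_double` = true;

-- User: change the value for the current session only.
-- A session-level value overrides the system-level one for that session.
ALTER SESSION SET `store.json.read_numbers_as_double` = false;
```

A view-level or query-level override of this option is the part under
discussion; per Jason's reply in this thread, select with options does not
accept it in 1.4.0.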

One way to distinguish these, perhaps, is that at the system or session
level, instead of an option like

store.json.read_numbers_as_double

we could use

store.json.read_numbers_as_double.default

to indicate that it is the default setting; setting
store.json.read_numbers_as_double in options would then override it.  I am
just spitballing ideas here, so feel free to tear it apart.  I think there was
a post or JIRA that talked about the principle of least surprise, and I am an
acolyte of that school of thought, so this type of discussion is great.
Thanks for facilitating.

John





On Wed, Dec 16, 2015 at 7:54 PM, Jason Altekruse <al...@gmail.com>
wrote:

> Hi John,
>
> Unfortunately, this feature currently works only with options that can be
> set as part of the format plugin. I think we should definitely move the
> options for interpreting files, like the JSON options you mentioned, to the
> format plugin/select with options scope, rather than making users set them
> at the system or session level.
>
> Do you think it is useful to keep these available at the system/session
> level as well? Steven Phillips has mentioned before that he has concerns
> about options of this type, which can affect query results but are not
> associated directly with a table or file. It would be worth opening a
> discussion to reach community consensus on which behavior makes the most
> sense, and whether we need to maintain backwards compatibility. I am okay
> with breaking compatibility for consistency, but I would like us to have a
> few releases where we warn people who set the soon-to-be-deprecated option
> at the system/session level and direct them to docs on the new expected
> workflow: setting the option in the format plugin or using select with
> options.
>
> I have opened a JIRA for this task:
> https://issues.apache.org/jira/browse/DRILL-4206

Re: [ANNOUNCE] Release of Apache Drill 1.4.0

Posted by Jason Altekruse <al...@gmail.com>.
Hi John,

Unfortunately, this feature currently works only with options that can be
set as part of the format plugin. I think we should definitely move the
options for interpreting files, like the JSON options you mentioned, to the
format plugin/select with options scope, rather than making users set them
at the system or session level.

Do you think it is useful to keep these available at the system/session
level as well? Steven Phillips has mentioned before that he has concerns
about options of this type, which can affect query results but are not
associated directly with a table or file. It would be worth opening a
discussion to reach community consensus on which behavior makes the most
sense, and whether we need to maintain backwards compatibility. I am okay
with breaking compatibility for consistency, but I would like us to have a
few releases where we warn people who set the soon-to-be-deprecated option
at the system/session level and direct them to docs on the new expected
workflow: setting the option in the format plugin or using select with
options.

I have opened a JIRA for this task:
https://issues.apache.org/jira/browse/DRILL-4206
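
To illustrate the distinction: the table function syntax accepts format
plugin attributes, as described in the documentation linked from the release
announcement. A sketch for the text format, assuming a hypothetical
comma-delimited file in the dfs.dev workspace (the file name is invented for
illustration):

```sql
-- type and fieldDelimiter are attributes of the text format plugin.
-- The file name `data.csv` is hypothetical.
SELECT *
FROM table(dfs.dev.`data.csv`(type => 'text', fieldDelimiter => ','))
LIMIT 10;
```

By contrast, store.json.read_numbers_as_double is a system/session option
rather than a format plugin attribute, so it is not accepted as a table
function parameter in this release.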

On Wed, Dec 16, 2015 at 11:38 AM, John Omernik <jo...@omernik.com> wrote:

> This is great!
>
> I tried using select with options on a JSON table where I get
> issues with different types in number fields. I tried this and got an
> error.  I think I am following the docs correctly; any thoughts here? (Note:
> I've tried it with json.read_numbers_as_double and
> store.json.read_numbers_as_double and get the same error. I get a different
> error, shown below, with just read_numbers_as_double.) Any thoughts would be
> helpful!
>
> > select * from table(dfs.dev.`jsontable`(json.read_numbers_as_double => true)) limit 10;
>
> Error: PARSE ERROR: Encountered "=>" at line 1, column 67.
> Was expecting one of:
>     ")" ...
>     "ORDER" ...
>     "LIMIT" ...
>     "OFFSET" ...
>     "FETCH" ...
>     "," ...
>     "UNION" ...
>     "INTERSECT" ...
>     "EXCEPT" ...
>     "NOT" ...
>     "IN" ...
>     "BETWEEN" ...
>     "LIKE" ...
>     "SIMILAR" ...
>     "=" ...
>     ">" ...
>     "<" ...
>     "<=" ...
>     ">=" ...
>     "<>" ...
>     "+" ...
>     "-" ...
>     "*" ...
>     "/" ...
>     "||" ...
>     "AND" ...
>     "OR" ...
>     "IS" ...
>     "MEMBER" ...
>     "SUBMULTISET" ...
>     "MULTISET" ...
>     "[" ...
>     "." ...
>     "(" ...
>
> while parsing SQL query:
> select * from table(dfs.dev.`jsontable`(json.read_numbers_as_double => true)) limit 10
>                                                                   ^
>
> [Error Id: 9b07ad9b-0f4a-440f-8c4e-a380a7e73f73 on node4:31010] (state=,code=0)
>
> > select * from table(dfs.dev.`jsondata`(read_numbers_as_double => true)) limit 10;
>
> Error: VALIDATION ERROR: From line 1, column 29 to line 1, column 69: No match found for function signature jsondata(read_numbers_as_double => <BOOLEAN>)
>
> [Error Id: 708ed594-a54a-49a1-808e-a21000ae2ca3 on node4:31010] (state=,code=0)

Re: [ANNOUNCE] Release of Apache Drill 1.4.0

Posted by John Omernik <jo...@omernik.com>.
This is great!

I tried using select with options on a JSON table where I get
issues with different types in number fields. I tried this and got an
error.  I think I am following the docs correctly; any thoughts here? (Note:
I've tried it with json.read_numbers_as_double and
store.json.read_numbers_as_double and get the same error. I get a different
error, shown below, with just read_numbers_as_double.) Any thoughts would be
helpful!

> select * from table(dfs.dev.`jsontable`(json.read_numbers_as_double => true)) limit 10;

Error: PARSE ERROR: Encountered "=>" at line 1, column 67.
Was expecting one of:
    ")" ...
    "ORDER" ...
    "LIMIT" ...
    "OFFSET" ...
    "FETCH" ...
    "," ...
    "UNION" ...
    "INTERSECT" ...
    "EXCEPT" ...
    "NOT" ...
    "IN" ...
    "BETWEEN" ...
    "LIKE" ...
    "SIMILAR" ...
    "=" ...
    ">" ...
    "<" ...
    "<=" ...
    ">=" ...
    "<>" ...
    "+" ...
    "-" ...
    "*" ...
    "/" ...
    "||" ...
    "AND" ...
    "OR" ...
    "IS" ...
    "MEMBER" ...
    "SUBMULTISET" ...
    "MULTISET" ...
    "[" ...
    "." ...
    "(" ...

while parsing SQL query:
select * from table(dfs.dev.`jsontable`(json.read_numbers_as_double => true)) limit 10
                                                                  ^

[Error Id: 9b07ad9b-0f4a-440f-8c4e-a380a7e73f73 on node4:31010] (state=,code=0)

> select * from table(dfs.dev.`jsondata`(read_numbers_as_double => true)) limit 10;

Error: VALIDATION ERROR: From line 1, column 29 to line 1, column 69: No match found for function signature jsondata(read_numbers_as_double => <BOOLEAN>)

[Error Id: 708ed594-a54a-49a1-808e-a21000ae2ca3 on node4:31010] (state=,code=0)




