Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2019/11/30 23:37:17 UTC
[GitHub] [drill] paul-rogers opened a new pull request #1913: DRILL-6953:
EVF-based version of the JSON reader
URL: https://github.com/apache/drill/pull/1913
Reimplements the JSON reader on top of the Enhanced Vector
Framework (EVF). Does not yet handle a provided schema. The new
JSON parser does not yet reflect any changes made to the "V1"
JSON parser in the last year. Does not yet handle the Union and
List-of-union types: enabling those ran into many issues
elsewhere in Drill.
Provides more robust (but still limited) handling of JSON
type ambiguities. Handles runs of nulls before the first
non-null value (within the first batch). Handles runs of
empty arrays before the first non-empty array (again, within
the first batch). Handles the case where a null value turns out
to be an object or array. Handles reasonable conversions between
types.
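
To make the "runs of nulls" case concrete, here is a minimal sketch of deferred type resolution. It is an illustration only: `DeferredTypeColumn`, its `Kind` enum, and the conversion rules are hypothetical names and assumptions made for this example, not the EVF column writer API, and it does not model the "within the first batch" limit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Drill's column writer API.
class DeferredTypeColumn {
    enum Kind { UNKNOWN, LONG, DOUBLE, STRING }

    private Kind kind = Kind.UNKNOWN;
    private int pendingNulls = 0;                 // nulls seen before the type is known
    private final List<Object> values = new ArrayList<>();

    void add(Object value) {
        if (value == null) {
            if (kind == Kind.UNKNOWN) {
                pendingNulls++;                   // defer: no type to write the null into yet
            } else {
                values.add(null);
            }
            return;
        }
        if (kind == Kind.UNKNOWN) {
            kind = kindOf(value);                 // first non-null value fixes the type
            for (int i = 0; i < pendingNulls; i++) {
                values.add(null);                 // back-fill the deferred nulls
            }
            pendingNulls = 0;
        }
        values.add(convert(value));
    }

    private static Kind kindOf(Object value) {
        if (value instanceof Long) return Kind.LONG;
        if (value instanceof Double) return Kind.DOUBLE;
        if (value instanceof String) return Kind.STRING;
        throw new IllegalArgumentException("Unsupported value: " + value);
    }

    // Assumed conversion rules: widen integers into a DOUBLE column and
    // stringify anything written to a STRING column; reject the rest.
    private Object convert(Object value) {
        if (kindOf(value) == kind) return value;
        if (kind == Kind.DOUBLE && value instanceof Long) return ((Long) value).doubleValue();
        if (kind == Kind.STRING) return String.valueOf(value);
        throw new IllegalArgumentException("Cannot write " + value + " to a " + kind + " column");
    }

    Kind kind() { return kind; }
    List<Object> values() { return values; }
}
```

The point of the sketch is that leading nulls are only counted, not written, until a typed value arrives. Which conversions count as "reasonable" is a design choice; the two shown here are just one plausible pair.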
Handling ambiguities makes the new parser more complex than
the "V1" version. The new one uses explicit states for each
kind of JSON object, whereas the old one used implicit states
expressed via if-statements, which become hard to follow
as the states get more complex.
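
As a rough sketch of what "explicit states" can look like, the example below gives each kind of field its own state object with its own transition rule, instead of one method full of nested if-statements. All names here (`ExplicitStateSketch`, `Token`, `FieldState`) are hypothetical and the transition rules are assumptions for illustration; the actual parser classes in the PR may be organized quite differently.

```java
// Hypothetical illustration only; not the parser classes from the PR.
class ExplicitStateSketch {
    enum Token { NULL, SCALAR, START_OBJECT, START_ARRAY }

    // One state per kind of field. Each state decides for itself
    // which tokens it accepts and which state follows.
    enum FieldState {
        UNRESOLVED {
            @Override
            FieldState next(Token t) {
                switch (t) {
                    case NULL:         return this;          // still ambiguous
                    case SCALAR:       return SCALAR_FIELD;
                    case START_OBJECT: return OBJECT_FIELD;  // null turned out to be an object
                    default:           return ARRAY_FIELD;   // null turned out to be an array
                }
            }
        },
        SCALAR_FIELD {
            @Override
            FieldState next(Token t) { return accept(this, t, Token.SCALAR); }
        },
        OBJECT_FIELD {
            @Override
            FieldState next(Token t) { return accept(this, t, Token.START_OBJECT); }
        },
        ARRAY_FIELD {
            @Override
            FieldState next(Token t) { return accept(this, t, Token.START_ARRAY); }
        };

        abstract FieldState next(Token t);

        // Assumed rule: once resolved, a field accepts nulls and its own
        // kind of token; anything else is reported as a conflict.
        static FieldState accept(FieldState self, Token t, Token expected) {
            if (t == Token.NULL || t == expected) return self;
            throw new IllegalStateException(self + " cannot accept " + t);
        }
    }

    // Feeds the tokens seen for one field through the state machine.
    static FieldState run(Token... tokens) {
        FieldState state = FieldState.UNRESOLVED;
        for (Token t : tokens) {
            state = state.next(t);
        }
        return state;
    }
}
```

For example, `ExplicitStateSketch.run(Token.NULL, Token.NULL, Token.START_OBJECT)` ends in `OBJECT_FIELD`. Adding a new kind of field means adding a state, not threading another branch through existing conditionals.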
The new "V2" JSON scan is controlled by a new option,
`store.json.enable_v2_reader`, which is false by default in this
PR.
Adds a "projection type" to the column writer so that the
JSON parser can receive a "hint" as to the expected type.
The hint is derived from the form of the projected column:
`a[0]`, `a.b`, or just `a`.
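
One way to picture the hint is as a classification of the projected expression's shape. The sketch below is a simplification under stated assumptions: `ProjectionHintSketch`, `Hint`, and `hintFor` are hypothetical names, the mapping (index implies array, member access implies object, bare name implies nothing) is an interpretation of the three forms above, and quoting with backticks is ignored.

```java
// Hypothetical illustration only; not the column writer API from the PR.
class ProjectionHintSketch {
    enum Hint { ARRAY, OBJECT, UNSPECIFIED }

    // Looks at what follows the column name in the projected expression:
    //   a[0] -> ARRAY        (the column is indexed)
    //   a.b  -> OBJECT       (the column has a member)
    //   a    -> UNSPECIFIED  (no structural information)
    static Hint hintFor(String projected) {
        for (int i = 0; i < projected.length(); i++) {
            char c = projected.charAt(i);
            if (c == '[') return Hint.ARRAY;
            if (c == '.') return Hint.OBJECT;
        }
        return Hint.UNSPECIFIED;
    }
}
```

With such a hint, a parser that sees only nulls for `a` could still choose an array or object representation when the query itself says how `a` will be used.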
Reimplements a number of JSON tests to test both the original
"V1" and the new "V2" versions of the JSON reader. Adds many
new tests for the new features of the "V2" parser.