Posted to issues@drill.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/03/13 21:35:00 UTC

[jira] [Commented] (DRILL-8167) Add JSON Config Options to Format Config

    [ https://issues.apache.org/jira/browse/DRILL-8167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17505913#comment-17505913 ] 

ASF GitHub Bot commented on DRILL-8167:
---------------------------------------

cgivre opened a new pull request #2494:
URL: https://github.com/apache/drill/pull/2494


   # [DRILL-8167](https://issues.apache.org/jira/browse/DRILL-8167): Add JSON Config Options to Format Config
   
   ## Description
   Almost all Drill format plugins let the user configure plugin-specific options as part of the format config.  The one glaring exception is the JSON reader, whose configuration options can only be set globally.  This PR adds these options to the format config so that users can set them when they configure a storage plugin.
   
   This PR does not eliminate the global settings for JSON.  It simply adds another place where a user can update the settings.  If a setting is not defined in the format config (`null`), Drill uses the global setting.
   
   The config is set to only include these values when they are not `null`, so there are no breaking changes.
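   
   The fallback described above can be sketched in a few lines of Java.  This is a minimal illustration, not Drill's actual code: the class `JsonOptionFallback` and the method `resolve` are hypothetical names, and the sketch assumes the format-level option is held as a nullable `Boolean` while the global option is always defined.
   
   ```java
   // Hypothetical helper for illustration only; not part of Drill.
   final class JsonOptionFallback {
     private JsonOptionFallback() {}

     // Returns the format-level value when it is set (non-null),
     // otherwise falls back to the global value.
     static boolean resolve(Boolean formatValue, boolean globalValue) {
       return formatValue != null ? formatValue : globalValue;
     }
   }
   ```
   
   For example, `JsonOptionFallback.resolve(null, true)` yields the global value `true`, while `JsonOptionFallback.resolve(Boolean.FALSE, true)` yields the format-level value `false`.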
   
   ## Documentation
   Drill's JSON reader can be configured with various global configuration variables.  However, these variables can also be overridden in an individual storage plugin configuration.  The parameters are:
   
   * `allTextMode`:  When `true`, Drill will read all fields in a given JSON file as text.
   * `readNumbersAsDouble`:  When `true`, Drill will read all numbers as Doubles.  This is useful if your data contains fields with a mix of integers and floating point numbers.  A very common case is a field that contains `0` in some records and a floating point value in others.
   * `skipMalformedJSONRecords`:  When set to `true`, Drill will attempt to skip malformed JSON records.  When `false`, Drill will throw an exception for bad records.
   * `escapeAnyChar`:  Allows escaping of any character when set to `true`. 
   * `nanInf`:  Allows `NaN` and `Infinity` in JSON data when set to `true`. 
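   
   To illustrate why `readNumbersAsDouble` helps with mixed numeric fields, here is a simplified Java sketch.  It is not Drill's reader: the class `NumberTyping`, the method `inferType`, and the regular expression are hypothetical, and the type names `BIGINT` and `DOUBLE` are used only as labels for "integer" and "floating point".
   
   ```java
   // Hypothetical illustration only; Drill's JSON reader is more involved.
   final class NumberTyping {
     private NumberTyping() {}

     // Labels a JSON numeric token as BIGINT or DOUBLE.
     // With readNumbersAsDouble set, every number is labeled DOUBLE,
     // so a field holding both 0 and 0.5 keeps a single type.
     static String inferType(String token, boolean readNumbersAsDouble) {
       boolean looksIntegral = token.matches("-?\\d+");
       if (readNumbersAsDouble || !looksIntegral) {
         return "DOUBLE";
       }
       return "BIGINT";
     }
   }
   ```
   
   Without the option, the tokens `0` and `0.5` in the same field get different labels in this sketch; with it, both are labeled `DOUBLE`.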
   
   A JSON config could look like this:
   
   ```json
   ...
   "json": {
      "type": "json",
      "extensions": ["json"],
      "allTextMode": true,
      "readNumbersAsDouble": true,
      "skipMalformedJSONRecords": true,
      "escapeAnyChar": false,
      "nanInf": true
   }
   ...
   ```
   
   You can also include these values at query time:
   
   ```sql
   SELECT `integer`, `float`
   FROM table(cp.`jsoninput/input2.json` (type => 'json', allTextMode => true))
   ```
   
   ## Testing
   Added unit tests.




> Add JSON Config Options to Format Config
> ----------------------------------------
>
>                 Key: DRILL-8167
>                 URL: https://issues.apache.org/jira/browse/DRILL-8167
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - JSON
>    Affects Versions: 1.20.0
>            Reporter: Charles Givre
>            Assignee: Charles Givre
>            Priority: Major
>             Fix For: Future
>
>
> Almost all Drill format plugins let the user configure plugin-specific options as part of the format config.  The one glaring exception is the JSON reader, whose configuration options can only be set globally.  This PR adds these options to the format config so that users can set them when they configure a storage plugin.
> This PR does not eliminate the global settings for JSON.  It simply adds another place where a user can update the settings.  If a setting is not defined in the format config (`null`), Drill uses the global setting.


