Posted to dev@gobblin.apache.org by "Tamas Nemeth (JIRA)" <ji...@apache.org> on 2017/08/29 05:25:00 UTC

[jira] [Updated] (GOBBLIN-231) Grok to Json Converter

     [ https://issues.apache.org/jira/browse/GOBBLIN-231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tamas Nemeth updated GOBBLIN-231:
---------------------------------
    Priority: Major  (was: Minor)

> Grok to Json Converter
> ----------------------
>
>                 Key: GOBBLIN-231
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-231
>             Project: Apache Gobblin
>          Issue Type: New Feature
>          Components: gobblin-core
>            Reporter: Tamas Nemeth
>            Assignee: Abhishek Tiwari
>
> The converter turns text into JSON based on a Grok pattern.
> GrokToJsonConverter accepts an already deserialized text row, where each text record is represented by a String.
> The schema is represented as a JsonArray, the same interface used by CsvToJsonConverter.
> The converter only supports Grok patterns where every group is named because it uses the group names as column names.
> The following config properties can be set:
> The grok pattern to use for the conversion:
> converter.grok_to_json.pattern=^%{IPORHOST:clientip} (?:-|%{USER:ident}) (?:-|%{USER:auth}) \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|-)\" %{NUMBER:response} (?:-|%{NUMBER:bytes})
> Path to the grok patterns (if not set it will use the default ones):
> converter.grok_to_json.patterns=/tmp/grok_patterns
> Treat empty string as null value:
> converter.grok_to_json.empty_as_null=true
> Specify the null string:
> converter.grok_to_json.null_string=null
> Example of schema:
> [
>   {
>     "columnName": "Day",
>     "comment": "",
>     "isNullable": "true",
>     "dataType": {
>       "type": "string"
>     }
>   },
>   {
>     "columnName": "Pageviews",
>     "comment": "",
>     "isNullable": "true",
>     "dataType": {
>       "type": "long"
>     }
>   }
> ]
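
To illustrate the behaviour described above, here is a minimal Python sketch of the idea, not the Gobblin implementation (which is Java). It expands `%{NAME:field}` references into named regex groups, uses the group names as column names, and applies the empty-as-null and null-string handling. The names `GROK_PATTERNS`, `grok_to_regex`, and `convert` are hypothetical, the three pattern definitions are simplified stand-ins for the default Grok patterns, and casting "long" columns to integers is an assumption about how the schema might be applied.

```python
import json
import re

# Hypothetical, heavily simplified stand-ins for a few default Grok patterns.
GROK_PATTERNS = {
    "WORD": r"\w+",
    "NUMBER": r"-?\d+(?:\.\d+)?",
    "NOTSPACE": r"\S+",
}

# Matches %{NAME} or %{NAME:field}.
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")

# Schema taken from the example in the issue description.
SCHEMA = [
    {"columnName": "Day", "comment": "", "isNullable": "true",
     "dataType": {"type": "string"}},
    {"columnName": "Pageviews", "comment": "", "isNullable": "true",
     "dataType": {"type": "long"}},
]


def grok_to_regex(pattern):
    """Expand each %{NAME:field} into a Python named group (?P<field>...)."""
    def expand(match):
        name, field = match.group(1), match.group(2)
        if field is None:
            # Mirrors the stated restriction: every group must be named.
            raise ValueError(f"unnamed group %{{{name}}}")
        return f"(?P<{field}>{GROK_PATTERNS[name]})"
    return re.compile(GROK_REF.sub(expand, pattern))


def convert(line, regex, schema, empty_as_null=True, null_string="null"):
    """Return a dict keyed by column name, or None if the line does not match."""
    match = regex.search(line)
    if match is None:
        return None
    groups = match.groupdict()
    record = {}
    for column in schema:
        name = column["columnName"]
        value = groups.get(name)
        if value is None or value == null_string or (empty_as_null and value == ""):
            record[name] = None
        elif column["dataType"]["type"] == "long":
            # Assumption: "long" columns are cast to integers.
            record[name] = int(value)
        else:
            record[name] = value
    return record


if __name__ == "__main__":
    regex = grok_to_regex(r"^%{WORD:Day} %{NOTSPACE:Pageviews}$")
    for line in ("Monday 1200", "Tuesday null"):
        print(json.dumps(convert(line, regex, SCHEMA)))
```

In this sketch the input "Tuesday null" yields a JSON null for Pageviews because the captured value equals the configured null string.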



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)