Posted to dev@gobblin.apache.org by "Tamas Nemeth (JIRA)" <ji...@apache.org> on 2017/08/29 05:25:00 UTC
[jira] [Updated] (GOBBLIN-231) Grok to Json Converter
[ https://issues.apache.org/jira/browse/GOBBLIN-231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tamas Nemeth updated GOBBLIN-231:
---------------------------------
Priority: Major (was: Minor)
> Grok to Json Converter
> ----------------------
>
> Key: GOBBLIN-231
> URL: https://issues.apache.org/jira/browse/GOBBLIN-231
> Project: Apache Gobblin
> Issue Type: New Feature
> Components: gobblin-core
> Reporter: Tamas Nemeth
> Assignee: Abhishek Tiwari
>
> A converter that turns text into JSON based on a Grok pattern.
> GrokToJsonConverter accepts an already deserialized text row, with each text record represented as a String.
> The schema is represented as a JsonArray, the same interface used by CsvToJsonConverter.
> The converter only supports Grok patterns in which every group is named, because it uses the group names as column names.
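
To illustrate why every group has to be named, here is a minimal sketch in Python rather than the converter's own code. It assumes a plain regular expression with named groups as a stand-in for a compiled Grok pattern; `LINE_PATTERN` and `text_to_json` are hypothetical names used only for this illustration.

```python
import json
import re

# Hypothetical stand-in for a compiled Grok pattern: a plain regex in which
# every group is named, so each group name can serve as a column name.
LINE_PATTERN = re.compile(
    r"^(?P<clientip>\S+) (?P<verb>[A-Z]+) (?P<request>\S+) (?P<response>\d+)$"
)


def text_to_json(line):
    """Return a JSON object string keyed by group name, or None if no match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    # groupdict() only contains named groups; an unnamed group would have
    # no column name and would be lost here.
    return json.dumps(match.groupdict())


record = text_to_json("10.0.0.1 GET /index.html 200")
```

An unnamed group would still take part in matching but would not appear in the output, which is consistent with the restriction described above.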
> The following config properties can be set:
> The grok pattern to use for the conversion:
> converter.grok_to_json.pattern=^%{IPORHOST:clientip} (?:-|%{USER:ident}) (?:-|%{USER:auth}) \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|-)\" %{NUMBER:response} (?:-|%{NUMBER:bytes})
> Path to the grok patterns (if not set, the default patterns are used):
> converter.grok_to_json.patterns=/tmp/grok_patterns
> Treat empty string as null value:
> converter.grok_to_json.empty_as_null=true
> Specify the null string:
> converter.grok_to_json.null_string=null
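
The two null-related properties could interact roughly as sketched below. The issue text does not spell out the exact semantics, so this assumes that a captured value equal to the configured null string becomes a JSON null, and that empty strings become null only when `empty_as_null` is true. `normalize_value` and `normalize_record` are hypothetical helpers, not part of Gobblin.

```python
def normalize_value(value, empty_as_null=True, null_string="null"):
    """Map a captured string to None under the assumed null-handling rules."""
    if value is None:
        # An optional group that did not take part in the match.
        return None
    if value == null_string:
        # Assumption: the configured null string marks a null value.
        return None
    if empty_as_null and value == "":
        # Assumption: empty strings are nulls only when the flag is set.
        return None
    return value


def normalize_record(record, empty_as_null=True, null_string="null"):
    """Apply normalize_value to every column of a record."""
    return {
        column: normalize_value(value, empty_as_null, null_string)
        for column, value in record.items()
    }
```

Under these assumptions, a record such as `{"ident": "", "auth": "null", "bytes": "512"}` would keep only `bytes` as a non-null value.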
> Example of schema:
> [
>   {
>     "columnName": "Day",
>     "comment": "",
>     "isNullable": "true",
>     "dataType": {
>       "type": "string"
>     }
>   },
>   {
>     "columnName": "Pageviews",
>     "comment": "",
>     "isNullable": "true",
>     "dataType": {
>       "type": "long"
>     }
>   }
> ]
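
Because group names double as column names, it can help to check a pattern against a schema like the one above before running a job. The following is a sketch of such a check in Python, again using a named-group regex as a stand-in for a Grok pattern. `schema_columns` and `missing_columns` are hypothetical helpers, and the issue does not say whether the converter performs this validation itself.

```python
import json
import re

SCHEMA_JSON = """[
  {"columnName": "Day", "comment": "", "isNullable": "true",
   "dataType": {"type": "string"}},
  {"columnName": "Pageviews", "comment": "", "isNullable": "true",
   "dataType": {"type": "long"}}
]"""


def schema_columns(schema_json):
    """Return the column names declared in the schema, in order."""
    return [column["columnName"] for column in json.loads(schema_json)]


def missing_columns(pattern, schema_json):
    """Return schema columns that have no same-named group in the pattern."""
    group_names = set(pattern.groupindex)
    return [name for name in schema_columns(schema_json) if name not in group_names]


# Hypothetical pattern whose group names match the schema columns.
pattern = re.compile(r"^(?P<Day>\d{4}-\d{2}-\d{2}) (?P<Pageviews>\d+)$")
```

An empty result from `missing_columns(pattern, SCHEMA_JSON)` means every schema column has a matching named group.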
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)