Posted to commits@flink.apache.org by ja...@apache.org on 2020/06/11 07:00:32 UTC
[flink] branch release-1.11 updated: [FLINK-18131][docs] Add documentation for the new JSON format
This is an automated email from the ASF dual-hosted git repository.
jark pushed a commit to branch release-1.11
in repository https://gitbox.apache.org/repos/asf/flink.git
The following commit(s) were added to refs/heads/release-1.11 by this push:
new e54ca4b [FLINK-18131][docs] Add documentation for the new JSON format
e54ca4b is described below
commit e54ca4bd8e871557bdbb149eb8c9aff68e5d4db7
Author: Danny Chan <yu...@gmail.com>
AuthorDate: Thu Jun 11 14:59:05 2020 +0800
[FLINK-18131][docs] Add documentation for the new JSON format
This closes #12574
---
docs/dev/table/connectors/formats/index.md | 2 +-
docs/dev/table/connectors/formats/index.zh.md | 2 +-
docs/dev/table/connectors/formats/json.md | 200 ++++++++++++++++++++++++++
docs/dev/table/connectors/formats/json.zh.md | 200 ++++++++++++++++++++++++++
4 files changed, 402 insertions(+), 2 deletions(-)
diff --git a/docs/dev/table/connectors/formats/index.md b/docs/dev/table/connectors/formats/index.md
index 6f45d74..e0e03b5 100644
--- a/docs/dev/table/connectors/formats/index.md
+++ b/docs/dev/table/connectors/formats/index.md
@@ -42,7 +42,7 @@ Flink supports the following formats:
<a href="{{ site.baseurl }}/dev/table/connectors/filesystem.html">Filesystem</a></td>
</tr>
<tr>
- <td>JSON</td>
+ <td><a href="{{ site.baseurl }}/dev/table/connectors/formats/json.html">JSON</a></td>
<td>Apache Kafka,
<a href="{{ site.baseurl }}/dev/table/connectors/filesystem.html">Filesystem</a>,
Elasticsearch</td>
diff --git a/docs/dev/table/connectors/formats/index.zh.md b/docs/dev/table/connectors/formats/index.zh.md
index 6f45d74..e0e03b5 100644
--- a/docs/dev/table/connectors/formats/index.zh.md
+++ b/docs/dev/table/connectors/formats/index.zh.md
@@ -42,7 +42,7 @@ Flink supports the following formats:
<a href="{{ site.baseurl }}/dev/table/connectors/filesystem.html">Filesystem</a></td>
</tr>
<tr>
- <td>JSON</td>
+ <td><a href="{{ site.baseurl }}/dev/table/connectors/formats/json.html">JSON</a></td>
<td>Apache Kafka,
<a href="{{ site.baseurl }}/dev/table/connectors/filesystem.html">Filesystem</a>,
Elasticsearch</td>
diff --git a/docs/dev/table/connectors/formats/json.md b/docs/dev/table/connectors/formats/json.md
new file mode 100644
index 0000000..892f30c
--- /dev/null
+++ b/docs/dev/table/connectors/formats/json.md
@@ -0,0 +1,200 @@
+---
+title: "JSON Format"
+nav-title: JSON
+nav-parent_id: sql-formats
+nav-pos: 2
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+<span class="label label-info">Format: Serialization Schema</span>
+<span class="label label-info">Format: Deserialization Schema</span>
+
+* This will be replaced by the TOC
+{:toc}
+
+The [JSON](https://www.json.org/json-en.html) format allows reading and writing JSON data based on a JSON schema. Currently, the JSON schema is derived from the table schema.
+
+Dependencies
+------------
+
+To set up the JSON format, use the dependency information in the following table, which covers both projects that use a build automation tool (such as Maven or SBT) and the SQL Client with SQL JAR bundles.
+
+| Maven dependency | SQL Client JAR |
+| :----------------- | :----------------------|
+| `flink-json` | Built-in |
+
+How to create a table with JSON format
+----------------
+
+Here is an example that creates a table using the Kafka connector and the JSON format.
+
+<div class="codetabs" markdown="1">
+<div data-lang="SQL" markdown="1">
+{% highlight sql %}
+CREATE TABLE user_behavior (
+ user_id BIGINT,
+ item_id BIGINT,
+ category_id BIGINT,
+ behavior STRING,
+ ts TIMESTAMP(3)
+) WITH (
+ 'connector' = 'kafka',
+ 'topic' = 'user_behavior',
+ 'properties.bootstrap.servers' = 'localhost:9092',
+ 'properties.group.id' = 'testGroup',
+ 'format' = 'json',
+ 'json.fail-on-missing-field' = 'false',
+ 'json.ignore-parse-errors' = 'true'
+)
+{% endhighlight %}
+</div>
+</div>
+
+Format Options
+----------------
+
+<table class="table table-bordered">
+ <thead>
+ <tr>
+ <th class="text-left" style="width: 25%">Option</th>
+ <th class="text-center" style="width: 8%">Required</th>
+ <th class="text-center" style="width: 7%">Default</th>
+ <th class="text-center" style="width: 10%">Type</th>
+ <th class="text-center" style="width: 50%">Description</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td><h5>format</h5></td>
+ <td>required</td>
+ <td style="word-wrap: break-word;">(none)</td>
+ <td>String</td>
+ <td>Specifies which format to use; here it should be 'json'.</td>
+ </tr>
+ <tr>
+ <td><h5>json.fail-on-missing-field</h5></td>
+ <td>optional</td>
+ <td style="word-wrap: break-word;">false</td>
+ <td>Boolean</td>
+ <td>Flag that specifies whether to fail if a field is missing. Defaults to false.</td>
+ </tr>
+ <tr>
+ <td><h5>json.ignore-parse-errors</h5></td>
+ <td>optional</td>
+ <td style="word-wrap: break-word;">false</td>
+ <td>Boolean</td>
+ <td>Flag to skip fields and rows with parse errors instead of failing;
+ fields are set to null in case of errors. Defaults to false.</td>
+ </tr>
+ </tbody>
+</table>
+
+Data Type Mapping
+----------------
+
+Currently, the JSON schema is always derived from the table schema. Explicitly defining a JSON schema is not supported yet.
+
+The Flink JSON format uses the [jackson databind API](https://github.com/FasterXML/jackson-databind) to parse and generate JSON strings.
+
+The following table lists the mapping from Flink types to JSON types.
+
+<table class="table table-bordered">
+ <thead>
+ <tr>
+ <th class="text-left">Flink Data Type</th>
+ <th class="text-center">JSON Data Type</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>CHAR / VARCHAR / STRING</td>
+ <td>string</td>
+ </tr>
+ <tr>
+ <td>BOOLEAN</td>
+ <td>boolean</td>
+ </tr>
+ <tr>
+ <td>BINARY / VARBINARY</td>
+ <td>string with encoding: base64</td>
+ </tr>
+ <tr>
+ <td>DECIMAL</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>TINYINT</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>SMALLINT</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>INT</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>BIGINT</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>FLOAT</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>DOUBLE</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>DATE</td>
+ <td>string with format: date</td>
+ </tr>
+ <tr>
+ <td>TIME</td>
+ <td>string with format: time</td>
+ </tr>
+ <tr>
+ <td>TIMESTAMP</td>
+ <td>string with format: date-time</td>
+ </tr>
+ <tr>
+ <td>INTERVAL</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>ARRAY</td>
+ <td>array</td>
+ </tr>
+ <tr>
+ <td>MAP/MULTISET</td>
+ <td>object</td>
+ </tr>
+ <tr>
+ <td>ROW</td>
+ <td>object</td>
+ </tr>
+ </tbody>
+</table>
+
+
+
+
+
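Reader's note on the Format Options table in json.md: the effect of `json.fail-on-missing-field` and `json.ignore-parse-errors` can be illustrated with a small stand-alone Python sketch. This is not Flink code and not part of the commit. The function `deserialize_row` and the `FIELDS` mapping are hypothetical names, and the behavior is modeled only on the option descriptions in the table. The sketch does not model enabling both options at the same time.

```python
import json

# Hypothetical field list loosely mirroring the user_behavior table.
# The converters are plain Python stand-ins, not Flink's converters.
FIELDS = {
    "user_id": int,
    "item_id": int,
    "category_id": int,
    "behavior": str,
}


def deserialize_row(raw, fail_on_missing_field=False, ignore_parse_errors=False):
    """Return a dict for one JSON record, or None if the row is skipped."""
    try:
        node = json.loads(raw)
        if not isinstance(node, dict):
            raise ValueError("expected a JSON object")
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        if ignore_parse_errors:
            return None  # the whole row is skipped
        raise

    row = {}
    for name, convert in FIELDS.items():
        if name not in node:
            if fail_on_missing_field:
                raise KeyError(f"missing field: {name}")
            row[name] = None  # a missing field becomes null
            continue
        try:
            row[name] = convert(node[name])
        except (TypeError, ValueError):
            if not ignore_parse_errors:
                raise
            row[name] = None  # an unparsable field becomes null
    return row
```

With both options left at their default of false, a record that lacks `item_id` yields a row whose `item_id` is null, while a record that is not valid JSON raises an error. Setting `ignore_parse_errors=True` turns that error into a skipped row.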
diff --git a/docs/dev/table/connectors/formats/json.zh.md b/docs/dev/table/connectors/formats/json.zh.md
new file mode 100644
index 0000000..892f30c
--- /dev/null
+++ b/docs/dev/table/connectors/formats/json.zh.md
@@ -0,0 +1,200 @@
+---
+title: "JSON Format"
+nav-title: JSON
+nav-parent_id: sql-formats
+nav-pos: 2
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+<span class="label label-info">Format: Serialization Schema</span>
+<span class="label label-info">Format: Deserialization Schema</span>
+
+* This will be replaced by the TOC
+{:toc}
+
+The [JSON](https://www.json.org/json-en.html) format allows reading and writing JSON data based on a JSON schema. Currently, the JSON schema is derived from the table schema.
+
+Dependencies
+------------
+
+To set up the JSON format, use the dependency information in the following table, which covers both projects that use a build automation tool (such as Maven or SBT) and the SQL Client with SQL JAR bundles.
+
+| Maven dependency | SQL Client JAR |
+| :----------------- | :----------------------|
+| `flink-json` | Built-in |
+
+How to create a table with JSON format
+----------------
+
+Here is an example that creates a table using the Kafka connector and the JSON format.
+
+<div class="codetabs" markdown="1">
+<div data-lang="SQL" markdown="1">
+{% highlight sql %}
+CREATE TABLE user_behavior (
+ user_id BIGINT,
+ item_id BIGINT,
+ category_id BIGINT,
+ behavior STRING,
+ ts TIMESTAMP(3)
+) WITH (
+ 'connector' = 'kafka',
+ 'topic' = 'user_behavior',
+ 'properties.bootstrap.servers' = 'localhost:9092',
+ 'properties.group.id' = 'testGroup',
+ 'format' = 'json',
+ 'json.fail-on-missing-field' = 'false',
+ 'json.ignore-parse-errors' = 'true'
+)
+{% endhighlight %}
+</div>
+</div>
+
+Format Options
+----------------
+
+<table class="table table-bordered">
+ <thead>
+ <tr>
+ <th class="text-left" style="width: 25%">Option</th>
+ <th class="text-center" style="width: 8%">Required</th>
+ <th class="text-center" style="width: 7%">Default</th>
+ <th class="text-center" style="width: 10%">Type</th>
+ <th class="text-center" style="width: 50%">Description</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td><h5>format</h5></td>
+ <td>required</td>
+ <td style="word-wrap: break-word;">(none)</td>
+ <td>String</td>
+ <td>Specifies which format to use; here it should be 'json'.</td>
+ </tr>
+ <tr>
+ <td><h5>json.fail-on-missing-field</h5></td>
+ <td>optional</td>
+ <td style="word-wrap: break-word;">false</td>
+ <td>Boolean</td>
+ <td>Flag that specifies whether to fail if a field is missing. Defaults to false.</td>
+ </tr>
+ <tr>
+ <td><h5>json.ignore-parse-errors</h5></td>
+ <td>optional</td>
+ <td style="word-wrap: break-word;">false</td>
+ <td>Boolean</td>
+ <td>Flag to skip fields and rows with parse errors instead of failing;
+ fields are set to null in case of errors. Defaults to false.</td>
+ </tr>
+ </tbody>
+</table>
+
+Data Type Mapping
+----------------
+
+Currently, the JSON schema is always derived from the table schema. Explicitly defining a JSON schema is not supported yet.
+
+The Flink JSON format uses the [jackson databind API](https://github.com/FasterXML/jackson-databind) to parse and generate JSON strings.
+
+The following table lists the mapping from Flink types to JSON types.
+
+<table class="table table-bordered">
+ <thead>
+ <tr>
+ <th class="text-left">Flink Data Type</th>
+ <th class="text-center">JSON Data Type</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>CHAR / VARCHAR / STRING</td>
+ <td>string</td>
+ </tr>
+ <tr>
+ <td>BOOLEAN</td>
+ <td>boolean</td>
+ </tr>
+ <tr>
+ <td>BINARY / VARBINARY</td>
+ <td>string with encoding: base64</td>
+ </tr>
+ <tr>
+ <td>DECIMAL</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>TINYINT</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>SMALLINT</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>INT</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>BIGINT</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>FLOAT</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>DOUBLE</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>DATE</td>
+ <td>string with format: date</td>
+ </tr>
+ <tr>
+ <td>TIME</td>
+ <td>string with format: time</td>
+ </tr>
+ <tr>
+ <td>TIMESTAMP</td>
+ <td>string with format: date-time</td>
+ </tr>
+ <tr>
+ <td>INTERVAL</td>
+ <td>number</td>
+ </tr>
+ <tr>
+ <td>ARRAY</td>
+ <td>array</td>
+ </tr>
+ <tr>
+ <td>MAP/MULTISET</td>
+ <td>object</td>
+ </tr>
+ <tr>
+ <td>ROW</td>
+ <td>object</td>
+ </tr>
+ </tbody>
+</table>
+
+
+
+
+
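Reader's note on the Data Type Mapping table: the mapping can be made concrete with a short Python sketch that encodes native values into the JSON types listed there (base64 strings for binary data, strings for dates and timestamps, arrays and objects for collections). This is not Flink code and not part of the commit. `to_json_value` is a hypothetical helper, and the use of `isoformat()` for DATE, TIME and TIMESTAMP is an assumption, since the table names only the formats `date`, `time` and `date-time` and not the exact text layout Flink emits.

```python
import base64
import json
from datetime import date, datetime, time
from decimal import Decimal


def to_json_value(value):
    """Convert a Python value into something json.dumps can serialize."""
    if isinstance(value, (bytes, bytearray)):
        # BINARY / VARBINARY -> string with encoding: base64
        return base64.b64encode(value).decode("ascii")
    if isinstance(value, (datetime, date, time)):
        # DATE / TIME / TIMESTAMP -> string (ISO 8601 layout assumed here)
        return value.isoformat()
    if isinstance(value, Decimal):
        # DECIMAL -> number; float() is lossy and used only for illustration
        return float(value)
    if isinstance(value, dict):
        # MAP / ROW -> object
        return {str(k): to_json_value(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        # ARRAY -> array
        return [to_json_value(v) for v in value]
    # strings, booleans and plain numbers map directly
    return value


# Example row with hypothetical column names.
row = {
    "user_id": 1,
    "behavior": "pv",
    "payload": b"hi",
    "day": date(2020, 6, 11),
    "ts": datetime(2020, 6, 11, 14, 59, 5),
    "tags": ["a", "b"],
}
encoded = json.dumps(to_json_value(row))
```

Decoding `encoded` with `json.loads` gives back only JSON-native types, so the binary and temporal columns arrive as strings that the reading side has to interpret according to the table schema.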