Posted to commits@flink.apache.org by lz...@apache.org on 2022/07/28 11:23:59 UTC

[flink-table-store] branch master updated: [FLINK-28704] Add document for data type and fix format issue

This is an automated email from the ASF dual-hosted git repository.

lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink-table-store.git


The following commit(s) were added to refs/heads/master by this push:
     new 5f6c707a [FLINK-28704] Add document for data type and fix format issue
5f6c707a is described below

commit 5f6c707a050f4e4e3d44a2acaf949b5f3cffad08
Author: Jane Chan <55...@users.noreply.github.com>
AuthorDate: Thu Jul 28 19:23:27 2022 +0800

    [FLINK-28704] Add document for data type and fix format issue
    
    This closes #252
---
 docs/content/docs/development/create-table.md    |  19 +++--
 docs/content/docs/development/overview.md        |   2 +-
 docs/content/docs/development/streaming-query.md |   6 +-
 docs/content/docs/development/write-table.md     |  32 +++----
 docs/content/docs/engines/hive.md                | 104 ++++++++++++++++++++++-
 docs/content/docs/engines/spark.md               |  99 +++++++++++++++++++++
 docs/content/docs/try-table-store/quick-start.md |   2 +-
 7 files changed, 236 insertions(+), 28 deletions(-)

diff --git a/docs/content/docs/development/create-table.md b/docs/content/docs/development/create-table.md
index baee2568..2849b8d0 100644
--- a/docs/content/docs/development/create-table.md
+++ b/docs/content/docs/development/create-table.md
@@ -33,7 +33,7 @@ Table Store uses its own catalog to manage all the databases and tables. Users n
 ```sql
 CREATE CATALOG my_catalog WITH (
   'type'='table-store',
-  'warehouse'='hdfs://nn:8020/warehouse/path'
+  'warehouse'='hdfs://nn:8020/warehouse/path' -- or 'file:///tmp/foo/bar'
 );
 
 USE CATALOG my_catalog;
@@ -90,10 +90,10 @@ Important options include the following:
     <thead>
     <tr>
       <th class="text-left" style="width: 20%">Option</th>
-      <th class="text-center" style="width: 5%">Required</th>
-      <th class="text-center" style="width: 5%">Default</th>
-      <th class="text-center" style="width: 10%">Type</th>
-      <th class="text-center" style="width: 60%">Description</th>
+      <th class="text-left" style="width: 5%">Required</th>
+      <th class="text-left" style="width: 5%">Default</th>
+      <th class="text-left" style="width: 10%">Type</th>
+      <th class="text-left" style="width: 60%">Description</th>
     </tr>
     </thead>
     <tbody>
@@ -145,7 +145,7 @@ CREATE TABLE MyTable (
 );
 ```
 
-For example, the `MyTable` table above has its data distribution
+For example, the above `MyTable` has its data distribution
 in the following order:
 - Partition: isolating different data based on partition fields.
 - Bucket: Within a single partition, distributed into 4 different
@@ -343,3 +343,10 @@ The query
 SELECT * FROM T2
 ``` 
 will return either `(1, 1.0, 'AAA'), (2, 2.0, 'BBB'), (1, 1.0, 'AAA'), (3, 3.0, 'CCC')` or `(3, 3.0, 'CCC'), (1, 1.0, 'AAA'), (2, 2.0, 'BBB'), (1, 1.0, 'AAA')`.
+
+### Supported Flink Data Types
+
+See [Flink Data Types](https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/table/types/) for the full list.
+{{< hint info >}}
+__Note:__ `MULTISET` is **not supported** under any `write-mode`, and `MAP` is **only supported** as a non-primary-key field in a primary-keyed table.
+{{< /hint >}}
+{{< /hint >}}
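+
+For example, a primary-keyed table may declare a `MAP` field as long as the field is not part of the primary key (the table and field names below are illustrative):
+
+```sql
+-- 'attrs' is allowed because it is not a primary key field;
+-- declaring a MAP column as part of the primary key would be rejected,
+-- and MULTISET columns are not supported under any write-mode.
+CREATE TABLE UserEvents (
+    user_id BIGINT,
+    event_time TIMESTAMP(3),
+    attrs MAP<STRING, STRING>,
+    PRIMARY KEY (user_id) NOT ENFORCED
+);
+```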
diff --git a/docs/content/docs/development/overview.md b/docs/content/docs/development/overview.md
index a5e6e403..064e2aa9 100644
--- a/docs/content/docs/development/overview.md
+++ b/docs/content/docs/development/overview.md
@@ -41,7 +41,7 @@ As shown in the architecture above:
 - For writes, it supports streaming synchronization from the changelog of databases (CDC) or batch
   insert/overwrite from offline data.
 
-**Ecosystem:** In addition to Apache Flink, Table Store also supports read/write by other computation
+**Ecosystem:** In addition to Apache Flink, Table Store also supports reading by other computation
 engines like Apache Hive, Apache Spark and Trino.
 
 **Internal:** Under the hood, Table Store uses a hybrid storage architecture with a lake format to store
diff --git a/docs/content/docs/development/streaming-query.md b/docs/content/docs/development/streaming-query.md
index a714ee56..532e8b13 100644
--- a/docs/content/docs/development/streaming-query.md
+++ b/docs/content/docs/development/streaming-query.md
@@ -49,9 +49,9 @@ Different `log.scan` mode will result in different consuming behavior under stre
 <table class="table table-bordered">
     <thead>
     <tr>
-      <th class="text-left" style="width: 20%">Scan Mode</th>
-      <th class="text-center" style="width: 5%">Default</th>
-      <th class="text-center" style="width: 60%">Description</th>
+      <th class="text-left" style="width: 10%">Scan Mode</th>
+      <th class="text-left" style="width: 5%">Default</th>
+      <th class="text-left" style="width: 60%">Description</th>
     </tr>
     </thead>
     <tbody>
diff --git a/docs/content/docs/development/write-table.md b/docs/content/docs/development/write-table.md
index 4dcf1076..6873fc81 100644
--- a/docs/content/docs/development/write-table.md
+++ b/docs/content/docs/development/write-table.md
@@ -51,10 +51,10 @@ parallelism of the sink with the `sink.parallelism` option.
     <thead>
     <tr>
       <th class="text-left" style="width: 20%">Option</th>
-      <th class="text-center" style="width: 5%">Required</th>
-      <th class="text-center" style="width: 5%">Default</th>
-      <th class="text-center" style="width: 10%">Type</th>
-      <th class="text-center" style="width: 60%">Description</th>
+      <th class="text-left" style="width: 5%">Required</th>
+      <th class="text-left" style="width: 5%">Default</th>
+      <th class="text-left" style="width: 10%">Type</th>
+      <th class="text-left" style="width: 60%">Description</th>
     </tr>
     </thead>
     <tbody>
@@ -78,10 +78,10 @@ to eliminate expired snapshots:
     <thead>
     <tr>
       <th class="text-left" style="width: 20%">Option</th>
-      <th class="text-center" style="width: 5%">Required</th>
-      <th class="text-center" style="width: 5%">Default</th>
-      <th class="text-center" style="width: 10%">Type</th>
-      <th class="text-center" style="width: 60%">Description</th>
+      <th class="text-left" style="width: 5%">Required</th>
+      <th class="text-left" style="width: 5%">Default</th>
+      <th class="text-left" style="width: 10%">Type</th>
+      <th class="text-left" style="width: 60%">Description</th>
     </tr>
     </thead>
     <tbody>
@@ -126,10 +126,10 @@ following parameters control this tradeoff:
     <thead>
     <tr>
       <th class="text-left" style="width: 20%">Option</th>
-      <th class="text-center" style="width: 5%">Required</th>
-      <th class="text-center" style="width: 5%">Default</th>
-      <th class="text-center" style="width: 10%">Type</th>
-      <th class="text-center" style="width: 60%">Description</th>
+      <th class="text-left" style="width: 5%">Required</th>
+      <th class="text-left" style="width: 5%">Default</th>
+      <th class="text-left" style="width: 10%">Type</th>
+      <th class="text-left" style="width: 60%">Description</th>
     </tr>
     </thead>
     <tbody>
@@ -170,10 +170,10 @@ The following parameters determine when to stop writing:
     <thead>
     <tr>
       <th class="text-left" style="width: 20%">Option</th>
-      <th class="text-center" style="width: 5%">Required</th>
-      <th class="text-center" style="width: 5%">Default</th>
-      <th class="text-center" style="width: 10%">Type</th>
-      <th class="text-center" style="width: 60%">Description</th>
+      <th class="text-left" style="width: 5%">Required</th>
+      <th class="text-left" style="width: 5%">Default</th>
+      <th class="text-left" style="width: 10%">Type</th>
+      <th class="text-left" style="width: 60%">Description</th>
     </tr>
     </thead>
     <tbody>
diff --git a/docs/content/docs/engines/hive.md b/docs/content/docs/engines/hive.md
index 59872248..6cc49777 100644
--- a/docs/content/docs/engines/hive.md
+++ b/docs/content/docs/engines/hive.md
@@ -147,4 +147,106 @@ OK
 1	Table
 2	Store
 */
-```
\ No newline at end of file
+```
+
+### Hive Type Conversion
+
+This section lists all supported type conversions between Hive and Flink.
+All Hive data types are available in the package `org.apache.hadoop.hive.serde2.typeinfo`.
+
+<table class="table table-bordered">
+    <thead>
+    <tr>
+      <th class="text-left" style="width: 10%">Hive Data Type</th>
+      <th class="text-left" style="width: 10%">Flink Data Type</th>
+      <th class="text-left" style="width: 5%">Atomic Type</th>
+    </tr>
+    </thead>
+    <tbody>
+    <tr>
+      <td><code>StructTypeInfo</code></td>
+      <td><code>RowType</code></td>
+      <td>false</td>
+    </tr>
+    <tr>
+      <td><code>MapTypeInfo</code></td>
+      <td><code>MapType</code></td>
+      <td>false</td>
+    </tr>
+    <tr>
+      <td><code>ListTypeInfo</code></td>
+      <td><code>ArrayType</code></td>
+      <td>false</td>
+    </tr>
+    <tr>
+      <td><code>PrimitiveTypeInfo("boolean")</code></td>
+      <td><code>BooleanType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>PrimitiveTypeInfo("tinyint")</code></td>
+      <td><code>TinyIntType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>PrimitiveTypeInfo("smallint")</code></td>
+      <td><code>SmallIntType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>PrimitiveTypeInfo("int")</code></td>
+      <td><code>IntType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>PrimitiveTypeInfo("bigint")</code></td>
+      <td><code>BigIntType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>PrimitiveTypeInfo("float")</code></td>
+      <td><code>FloatType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>PrimitiveTypeInfo("double")</code></td>
+      <td><code>DoubleType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>BaseCharTypeInfo("char(%d)")</code></td>
+      <td><code>CharType(length)</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>PrimitiveTypeInfo("string")</code></td>
+      <td><code>VarCharType(VarCharType.MAX_LENGTH)</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>BaseCharTypeInfo("varchar(%d)")</code></td>
+      <td><code>VarCharType(length)</code>, where <code>length</code> is less than <code>VarCharType.MAX_LENGTH</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>PrimitiveTypeInfo("date")</code></td>
+      <td><code>DateType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>PrimitiveTypeInfo("timestamp")</code></td>
+      <td><code>TimestampType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>DecimalTypeInfo("decimal(%d, %d)")</code></td>
+      <td><code>DecimalType(precision, scale)</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>PrimitiveTypeInfo("binary")</code></td>
+      <td><code>VarBinaryType</code>, <code>BinaryType</code></td>
+      <td>true</td>
+    </tr>
+    </tbody>
+</table>
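+
+For instance, per the mapping above, a table declared in Flink as
+`CREATE TABLE t (id INT, name STRING, score DECIMAL(10, 2))` (names are illustrative)
+is expected to surface in Hive with the converted types:
+
+```sql
+-- Hive side (sketch): DESCRIBE shows the Hive types derived from the Flink schema,
+-- following the conversion table above
+DESCRIBE t;
+-- id       int
+-- name     string
+-- score    decimal(10,2)
+```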
diff --git a/docs/content/docs/engines/spark.md b/docs/content/docs/engines/spark.md
index 7582d9e9..5c7ede95 100644
--- a/docs/content/docs/engines/spark.md
+++ b/docs/content/docs/engines/spark.md
@@ -179,3 +179,102 @@ DROP NAMESPACE tablestore.bar
 {{< hint warning >}}
 __Attention__: Drop a namespace will delete all table's metadata and files under this namespace on the disk.
 {{< /hint >}}
+
+
+### Spark Type Conversion
+
+This section lists all supported type conversions between Spark and Flink.
+All Spark data types are available in the package `org.apache.spark.sql.types`.
+
+<table class="table table-bordered">
+    <thead>
+    <tr>
+      <th class="text-left" style="width: 10%">Spark Data Type</th>
+      <th class="text-left" style="width: 10%">Flink Data Type</th>
+      <th class="text-left" style="width: 5%">Atomic Type</th>
+    </tr>
+    </thead>
+    <tbody>
+    <tr>
+      <td><code>StructType</code></td>
+      <td><code>RowType</code></td>
+      <td>false</td>
+    </tr>
+    <tr>
+      <td><code>MapType</code></td>
+      <td><code>MapType</code></td>
+      <td>false</td>
+    </tr>
+    <tr>
+      <td><code>ArrayType</code></td>
+      <td><code>ArrayType</code></td>
+      <td>false</td>
+    </tr>
+    <tr>
+      <td><code>BooleanType</code></td>
+      <td><code>BooleanType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>ByteType</code></td>
+      <td><code>TinyIntType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>ShortType</code></td>
+      <td><code>SmallIntType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>IntegerType</code></td>
+      <td><code>IntType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>LongType</code></td>
+      <td><code>BigIntType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>FloatType</code></td>
+      <td><code>FloatType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>DoubleType</code></td>
+      <td><code>DoubleType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>StringType</code></td>
+      <td><code>VarCharType</code>, <code>CharType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>DateType</code></td>
+      <td><code>DateType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>TimestampType</code></td>
+      <td><code>TimestampType</code>, <code>LocalZonedTimestampType</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>DecimalType(precision, scale)</code></td>
+      <td><code>DecimalType(precision, scale)</code></td>
+      <td>true</td>
+    </tr>
+    <tr>
+      <td><code>BinaryType</code></td>
+      <td><code>VarBinaryType</code>, <code>BinaryType</code></td>
+      <td>true</td>
+    </tr>
+    </tbody>
+</table>
+
+{{< hint info >}}
+__Note:__
+- Currently, field comments defined in Spark cannot be displayed in the Flink CLI.
+- Conversion between Spark's `UserDefinedType` and Flink's `UserDefinedType` is not supported.
+{{< /hint >}}
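+
+For instance, per the mapping above, a table written by Flink with schema
+`(id BIGINT, dt DATE, payload BYTES)` (names are illustrative) is expected to be read
+in Spark as `LongType`, `DateType` and `BinaryType` respectively:
+
+```sql
+-- Spark SQL side (sketch): querying the Flink-written table through the
+-- tablestore catalog described earlier in this page
+SELECT id, dt, payload FROM tablestore.default.my_table;
+```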
diff --git a/docs/content/docs/try-table-store/quick-start.md b/docs/content/docs/try-table-store/quick-start.md
index 53f869fe..25f8012a 100644
--- a/docs/content/docs/try-table-store/quick-start.md
+++ b/docs/content/docs/try-table-store/quick-start.md
@@ -35,7 +35,7 @@ document will be guided to create a simple dynamic table to read and write it.
 __Note:__ Table Store is only supported since Flink 1.14.
 {{< /hint >}}
 
-[Download Flink 1.15](https://flink.apache.org/downloads.html) of Flink,
+[Download Flink 1.15](https://flink.apache.org/downloads.html),
 then extract the archive:
 
 ```bash