Posted to commits@flink.apache.org by di...@apache.org on 2020/08/10 06:12:21 UTC
[flink] branch release-1.11 updated: [FLINK-18847][docs][python]
Add documentation about data types in Python Table API
This is an automated email from the ASF dual-hosted git repository.
dianfu pushed a commit to branch release-1.11
in repository https://gitbox.apache.org/repos/asf/flink.git
The following commit(s) were added to refs/heads/release-1.11 by this push:
new 0b6b5ef [FLINK-18847][docs][python] Add documentation about data types in Python Table API
0b6b5ef is described below
commit 0b6b5ef17b82db3375a017e9f56bc04d96900469
Author: Dian Fu <di...@apache.org>
AuthorDate: Mon Aug 3 19:09:16 2020 +0800
[FLINK-18847][docs][python] Add documentation about data types in Python Table API
This closes #13084.
---
docs/dev/table/python/index.md | 1 +
docs/dev/table/python/index.zh.md | 1 +
docs/dev/table/python/python_types.md | 74 +++++
docs/dev/table/python/python_types.zh.md | 74 +++++
docs/dev/table/types.md | 519 ++++++++++++++++++++++---------
docs/dev/table/types.zh.md | 477 ++++++++++++++++++++--------
6 files changed, 858 insertions(+), 288 deletions(-)
diff --git a/docs/dev/table/python/index.md b/docs/dev/table/python/index.md
index eb5d06c..cc7e1ec 100644
--- a/docs/dev/table/python/index.md
+++ b/docs/dev/table/python/index.md
@@ -30,6 +30,7 @@ Apache Flink has provided Python Table API support since 1.9.0.
## Where to go next?
- [Installation]({{ site.baseurl }}/dev/table/python/installation.html): Introduction of how to set up the Python Table API execution environment.
+- [Python Data Types]({{ site.baseurl }}/dev/table/python/python_types.html): Introduction of Python data types.
- [User-defined Functions]({{ site.baseurl }}/dev/table/python/python_udfs.html): Explanation of how to define Python user-defined functions.
- [Vectorized User-defined Functions]({{ site.baseurl }}/dev/table/python/vectorized_python_udfs.html): Explanation of how to define vectorized Python user-defined functions.
- [Conversions between PyFlink Table and Pandas DataFrame]({{ site.baseurl }}/dev/table/python/conversion_of_pandas.html): Explanation of how to convert between PyFlink Table and Pandas DataFrame.
diff --git a/docs/dev/table/python/index.zh.md b/docs/dev/table/python/index.zh.md
index cb1031d..2e97710 100644
--- a/docs/dev/table/python/index.zh.md
+++ b/docs/dev/table/python/index.zh.md
@@ -31,6 +31,7 @@ Python Table API允许用户使用Python语言开发[Table API]({{ site.baseurl
## Where to go next?
- [环境安装]({{ site.baseurl }}/zh/dev/table/python/installation.html): 介绍了如何设置Python Table API的执行环境。
+- [Python数据类型]({{ site.baseurl }}/zh/dev/table/python/python_types.html): 介绍Python数据类型。
- [自定义函数]({{ site.baseurl }}/zh/dev/table/python/python_udfs.html): 有关如何定义Python用户自定义函数的说明。
- [自定义向量化函数]({{ site.baseurl }}/zh/dev/table/python/vectorized_python_udfs.html): 有关如何定义向量化Python用户自定义函数的说明。
- [PyFlink Table 和 Pandas DataFrame 互转]({{ site.baseurl }}/zh/dev/table/python/conversion_of_pandas.html): 介绍了PyFlink Table和Pandas DataFrame之间如何互转。
diff --git a/docs/dev/table/python/python_types.md b/docs/dev/table/python/python_types.md
new file mode 100644
index 0000000..3cc229f
--- /dev/null
+++ b/docs/dev/table/python/python_types.md
@@ -0,0 +1,74 @@
+---
+title: "Python Data Types"
+nav-parent_id: python_tableapi
+nav-pos: 15
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+This page describes the data types supported in the PyFlink Table API.
+
+* This will be replaced by the TOC
+{:toc}
+
+Data Type
+---------
+
+A *data type* describes the logical type of a value in the table ecosystem. It can be used to declare input and/or
+output types of Python user-defined functions. Users of the Python Table API work with instances of
+`pyflink.table.types.DataType` within the Python Table API or when defining user-defined functions.
+
+A `DataType` instance declares the **logical type** which does not imply a concrete physical representation for transmission
+or storage. All pre-defined data types are available in `pyflink.table.types` and can be instantiated with the utility methods
+defined in `pyflink.table.types.DataTypes`.
+
+A list of all pre-defined data types can be found [here]({{ site.baseurl }}/dev/table/types.html#list-of-data-types).
+
+Data Type and Python Type Mapping
+------------------
+
+A *data type* can be used to declare the input and/or output types of Python user-defined functions. The inputs
+are converted to Python objects corresponding to the data type, and the type of the user-defined function's
+result must also match the declared data type.
+
+For vectorized Python UDFs, the inputs and the output are of type `pandas.Series`. The element type
+of the `pandas.Series` corresponds to the specified data type.
+
+| Data Type | Python Type | Pandas Type |
+|:-----------------|:-----------------------|:-----------------------|
+| `BOOLEAN` | `bool` | `numpy.bool_` |
+| `TINYINT` | `int` | `numpy.int8` |
+| `SMALLINT` | `int` | `numpy.int16` |
+| `INT` | `int` | `numpy.int32` |
+| `BIGINT` | `int` | `numpy.int64` |
+| `FLOAT` | `float` | `numpy.float32` |
+| `DOUBLE` | `float` | `numpy.float64` |
+| `VARCHAR` | `str` | `str` |
+| `VARBINARY` | `bytes` | `bytes` |
+| `DECIMAL` | `decimal.Decimal` | `decimal.Decimal` |
+| `DATE` | `datetime.date` | `datetime.date` |
+| `TIME` | `datetime.time` | `datetime.time` |
+| `TimestampType` | `datetime.datetime` | `datetime.datetime` |
+| `LocalZonedTimestampType` | `datetime.datetime` | `datetime.datetime` |
+| `INTERVAL YEAR TO MONTH` | `int` | `Not Supported Yet` |
+| `INTERVAL DAY TO SECOND` | `datetime.timedelta` | `Not Supported Yet` |
+| `ARRAY` | `list` | `numpy.ndarray` |
+| `MULTISET` | `list` | `Not Supported Yet` |
+| `MAP` | `dict` | `Not Supported Yet` |
+| `ROW` | `Row` | `dict` |
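The "Python Type" column above can be read as a lookup from a declared data type to the Python class a user-defined function receives or must return. The following is a minimal sketch of that idea using only the standard library. `PYTHON_TYPE_FOR` and `matches_declared_type` are hypothetical names introduced for illustration and are not part of PyFlink; `ROW` is left out because its Python type is a PyFlink class.

```python
import datetime
import decimal

# Hypothetical lookup table mirroring the "Python Type" column of the table above.
# It is an illustration only, not a PyFlink API.
PYTHON_TYPE_FOR = {
    "BOOLEAN": bool,
    "TINYINT": int,
    "SMALLINT": int,
    "INT": int,
    "BIGINT": int,
    "FLOAT": float,
    "DOUBLE": float,
    "VARCHAR": str,
    "VARBINARY": bytes,
    "DECIMAL": decimal.Decimal,
    "DATE": datetime.date,
    "TIME": datetime.time,
    "TimestampType": datetime.datetime,
    "LocalZonedTimestampType": datetime.datetime,
    "INTERVAL YEAR TO MONTH": int,
    "INTERVAL DAY TO SECOND": datetime.timedelta,
    "ARRAY": list,
    "MULTISET": list,
    "MAP": dict,
}


def matches_declared_type(type_name, value):
    """Return True if `value` is an instance of the Python type listed for `type_name`."""
    expected = PYTHON_TYPE_FOR[type_name]
    # bool is a subclass of int in Python, so reject it for integer types explicitly.
    if expected is int and isinstance(value, bool):
        return False
    # datetime.datetime is a subclass of datetime.date, so reject it for DATE.
    if expected is datetime.date and isinstance(value, datetime.datetime):
        return False
    return isinstance(value, expected)
```

A check like this could be used in unit tests of a UDF body to confirm that the returned value has the Python type expected for the declared result type.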
diff --git a/docs/dev/table/python/python_types.zh.md b/docs/dev/table/python/python_types.zh.md
new file mode 100644
index 0000000..56971ea
--- /dev/null
+++ b/docs/dev/table/python/python_types.zh.md
@@ -0,0 +1,74 @@
+---
+title: "Python 数据类型"
+nav-parent_id: python_tableapi
+nav-pos: 15
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+This page describes the data types supported in the PyFlink Table API.
+
+* This will be replaced by the TOC
+{:toc}
+
+Data Type
+---------
+
+A *data type* describes the logical type of a value in the table ecosystem. It can be used to declare input and/or
+output types of Python user-defined functions. Users of the Python Table API work with instances of
+`pyflink.table.types.DataType` within the Python Table API or when defining user-defined functions.
+
+A `DataType` instance declares the **logical type** which does not imply a concrete physical representation for transmission
+or storage. All pre-defined data types are available in `pyflink.table.types` and can be instantiated with the utility methods
+defined in `pyflink.table.types.DataTypes`.
+
+A list of all pre-defined data types can be found [here]({{ site.baseurl }}/zh/dev/table/types.html#list-of-data-types).
+
+Data Type and Python Type Mapping
+------------------
+
+A *data type* can be used to declare the input and/or output types of Python user-defined functions. The inputs
+are converted to Python objects corresponding to the data type, and the type of the user-defined function's
+result must also match the declared data type.
+
+For vectorized Python UDFs, the inputs and the output are of type `pandas.Series`. The element type
+of the `pandas.Series` corresponds to the specified data type.
+
+| Data Type | Python Type | Pandas Type |
+|:-----------------|:-----------------------|:-----------------------|
+| `BOOLEAN` | `bool` | `numpy.bool_` |
+| `TINYINT` | `int` | `numpy.int8` |
+| `SMALLINT` | `int` | `numpy.int16` |
+| `INT` | `int` | `numpy.int32` |
+| `BIGINT` | `int` | `numpy.int64` |
+| `FLOAT` | `float` | `numpy.float32` |
+| `DOUBLE` | `float` | `numpy.float64` |
+| `VARCHAR` | `str` | `str` |
+| `VARBINARY` | `bytes` | `bytes` |
+| `DECIMAL` | `decimal.Decimal` | `decimal.Decimal` |
+| `DATE` | `datetime.date` | `datetime.date` |
+| `TIME` | `datetime.time` | `datetime.time` |
+| `TimestampType` | `datetime.datetime` | `datetime.datetime` |
+| `LocalZonedTimestampType` | `datetime.datetime` | `datetime.datetime` |
+| `INTERVAL YEAR TO MONTH` | `int` | `Not Supported` |
+| `INTERVAL DAY TO SECOND` | `datetime.timedelta` | `Not Supported` |
+| `ARRAY` | `list` | `numpy.ndarray` |
+| `MULTISET` | `list` | `Not Supported` |
+| `MAP` | `dict` | `Not Supported` |
+| `ROW` | `Row` | `dict` |
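`TINYINT`, `SMALLINT`, `INT`, and `BIGINT` all map to Python `int`, which has no fixed width, while the Flink types are 1-, 2-, 4-, and 8-byte signed integers. The sketch below derives the value range of each type from its byte width so that a Python value can be checked before it is returned from a user-defined function. `BYTE_WIDTH`, `signed_range`, and `fits` are hypothetical helpers for illustration, not PyFlink functions.

```python
# Byte widths of the fixed-size integer types, as given in the list of data types.
BYTE_WIDTH = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}


def signed_range(type_name):
    """Return the (min, max) values of a signed two's-complement integer type."""
    bits = 8 * BYTE_WIDTH[type_name]
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def fits(type_name, value):
    """Return True if the Python int `value` lies within the range of `type_name`."""
    low, high = signed_range(type_name)
    return low <= value <= high
```

For example, `signed_range("TINYINT")` evaluates to `(-128, 127)`, matching the range documented for `TINYINT`.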
diff --git a/docs/dev/table/types.md b/docs/dev/table/types.md
index 45b6a94..0b7ebe6 100644
--- a/docs/dev/table/types.md
+++ b/docs/dev/table/types.md
@@ -68,20 +68,23 @@ A list of all pre-defined data types can be found [below](#list-of-data-types).
### Data Types in the Table API
Users of the JVM-based API work with instances of `org.apache.flink.table.types.DataType` within the Table API or when
-defining connectors, catalogs, or user-defined functions.
+defining connectors, catalogs, or user-defined functions. Users of the Python API work with instances of
+`pyflink.table.types.DataType` within the Python Table API or when defining Python user-defined functions.
A `DataType` instance has two responsibilities:
- **Declaration of a logical type** which does not imply a concrete physical representation for transmission
-or storage but defines the boundaries between JVM-based languages and the table ecosystem.
-- *Optional:* **Giving hints about the physical representation of data to the planner** which is useful at the edges to other APIs .
+or storage but defines the boundaries between JVM-based/Python languages and the table ecosystem.
+- *Optional:* **Giving hints about the physical representation of data to the planner** which is useful at the edges to other APIs.
+This is currently only available in the Java/Scala Table API and not yet available in the Python Table API.
For JVM-based languages, all pre-defined data types are available in `org.apache.flink.table.api.DataTypes`.
-
-It is recommended to add a star import to your table programs for having a fluent API:
+For Python, those types are available in `pyflink.table.types.DataTypes`.
<div class="codetabs" markdown="1">
<div data-lang="Java" markdown="1">
+It is recommended to add a star import to your table programs for having a fluent API:
+
{% highlight java %}
import static org.apache.flink.table.api.DataTypes.*;
@@ -90,6 +93,8 @@ DataType t = INTERVAL(DAY(), SECOND(3));
</div>
<div data-lang="Scala" markdown="1">
+It is recommended to add a star import to your table programs for having a fluent API:
+
{% highlight scala %}
import org.apache.flink.table.api.DataTypes._
@@ -97,6 +102,15 @@ val t: DataType = INTERVAL(DAY(), SECOND(3));
{% endhighlight %}
</div>
+<div data-lang="Python" markdown="1">
+
+{% highlight python %}
+from pyflink.table.types import DataTypes
+
+t = DataTypes.INTERVAL(DataTypes.DAY(), DataTypes.SECOND(3))
+{% endhighlight %}
+</div>
+
</div>
#### Physical Hints
@@ -143,6 +157,8 @@ val t: DataType = DataTypes.ARRAY(DataTypes.INT().notNull()).bridgedTo(classOf[A
API is extended. Users of predefined sources/sinks/functions do not need to define such hints. Hints within
a table program (e.g. `field.cast(TIMESTAMP(3).bridgedTo(Timestamp.class))`) are ignored.
+<span class="label label-danger">Attention</span> Please note that physical hints are currently not supported in the Python Table API.
+
Planner Compatibility
---------------------
@@ -234,6 +250,7 @@ List of Data Types
------------------
This section lists all pre-defined data types. For the JVM-based Table API those types are also available in `org.apache.flink.table.api.DataTypes`.
+For the Python Table API, those types are available in `pyflink.table.types.DataTypes`.
### Character Strings
@@ -256,12 +273,6 @@ CHAR(n)
{% highlight java %}
DataTypes.CHAR(n)
{% endhighlight %}
-</div>
-
-</div>
-
-The type can be declared using `CHAR(n)` where `n` is the number of code points. `n` must have a value between `1`
-and `2,147,483,647` (both inclusive). If no length is specified, `n` is equal to `1`.
**Bridging to JVM Types**
@@ -271,6 +282,18 @@ and `2,147,483,647` (both inclusive). If no length is specified, `n` is equal to
|`byte[]` | X | X | Assumes UTF-8 encoding. |
|`org.apache.flink.table.data.StringData` | X | X | Internal data structure. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+Not supported.
+{% endhighlight %}
+</div>
+</div>
+
+The type can be declared using `CHAR(n)` where `n` is the number of code points. `n` must have a value between `1`
+and `2,147,483,647` (both inclusive). If no length is specified, `n` is equal to `1`.
+
#### `VARCHAR` / `STRING`
Data type of a variable-length character string.
@@ -294,14 +317,6 @@ DataTypes.VARCHAR(n)
DataTypes.STRING()
{% endhighlight %}
-</div>
-
-</div>
-
-The type can be declared using `VARCHAR(n)` where `n` is the maximum number of code points. `n` must have a value
-between `1` and `2,147,483,647` (both inclusive). If no length is specified, `n` is equal to `1`.
-
-`STRING` is a synonym for `VARCHAR(2147483647)`.
**Bridging to JVM Types**
@@ -311,6 +326,24 @@ between `1` and `2,147,483,647` (both inclusive). If no length is specified, `n`
|`byte[]` | X | X | Assumes UTF-8 encoding. |
|`org.apache.flink.table.data.StringData` | X | X | Internal data structure. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.VARCHAR(n)
+
+DataTypes.STRING()
+{% endhighlight %}
+
+<span class="label label-danger">Attention</span> The maximum number of code points `n` specified in `DataTypes.VARCHAR(n)` currently must be `2,147,483,647`.
+</div>
+</div>
+
+The type can be declared using `VARCHAR(n)` where `n` is the maximum number of code points. `n` must have a value
+between `1` and `2,147,483,647` (both inclusive). If no length is specified, `n` is equal to `1`.
+
+`STRING` is a synonym for `VARCHAR(2147483647)`.
+
### Binary Strings
#### `BINARY`
@@ -332,12 +365,6 @@ BINARY(n)
{% highlight java %}
DataTypes.BINARY(n)
{% endhighlight %}
-</div>
-
-</div>
-
-The type can be declared using `BINARY(n)` where `n` is the number of bytes. `n` must have a value
-between `1` and `2,147,483,647` (both inclusive). If no length is specified, `n` is equal to `1`.
**Bridging to JVM Types**
@@ -345,6 +372,18 @@ between `1` and `2,147,483,647` (both inclusive). If no length is specified, `n`
|:-------------------|:-----:|:------:|:------------------------|
|`byte[]` | X | X | *Default* |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+Not supported.
+{% endhighlight %}
+</div>
+</div>
+
+The type can be declared using `BINARY(n)` where `n` is the number of bytes. `n` must have a value
+between `1` and `2,147,483,647` (both inclusive). If no length is specified, `n` is equal to `1`.
+
#### `VARBINARY` / `BYTES`
Data type of a variable-length binary string (=a sequence of bytes).
@@ -368,8 +407,24 @@ DataTypes.VARBINARY(n)
DataTypes.BYTES()
{% endhighlight %}
+
+**Bridging to JVM Types**
+
+| Java Type | Input | Output | Remarks |
+|:-------------------|:-----:|:------:|:------------------------|
+|`byte[]` | X | X | *Default* |
+
</div>
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.VARBINARY(n)
+
+DataTypes.BYTES()
+{% endhighlight %}
+
+<span class="label label-danger">Attention</span> The maximum number of bytes `n` specified in `DataTypes.VARBINARY(n)` currently must be `2,147,483,647`.
+</div>
</div>
The type can be declared using `VARBINARY(n)` where `n` is the maximum number of bytes. `n` must
@@ -378,12 +433,6 @@ equal to `1`.
`BYTES` is a synonym for `VARBINARY(2147483647)`.
-**Bridging to JVM Types**
-
-| Java Type | Input | Output | Remarks |
-|:-------------------|:-----:|:------:|:------------------------|
-|`byte[]` | X | X | *Default* |
-
### Exact Numerics
#### `DECIMAL`
@@ -414,8 +463,23 @@ NUMERIC(p, s)
{% highlight java %}
DataTypes.DECIMAL(p, s)
{% endhighlight %}
+
+**Bridging to JVM Types**
+
+| Java Type | Input | Output | Remarks |
+|:-----------------------------------------|:-----:|:------:|:-------------------------|
+|`java.math.BigDecimal` | X | X | *Default* |
+|`org.apache.flink.table.data.DecimalData` | X | X | Internal data structure. |
+
</div>
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.DECIMAL(p, s)
+{% endhighlight %}
+
+<span class="label label-danger">Attention</span> The `precision` and `scale` specified in `DataTypes.DECIMAL(p, s)` currently must be `38` and `18`, respectively.
+</div>
</div>
The type can be declared using `DECIMAL(p, s)` where `p` is the number of digits in a
@@ -426,13 +490,6 @@ The default value for `s` is `0`.
`NUMERIC(p, s)` and `DEC(p, s)` are synonyms for this type.
-**Bridging to JVM Types**
-
-| Java Type | Input | Output | Remarks |
-|:-----------------------------------------|:-----:|:------:|:-------------------------|
-|`java.math.BigDecimal` | X | X | *Default* |
-|`org.apache.flink.table.data.DecimalData` | X | X | Internal data structure. |
-
#### `TINYINT`
Data type of a 1-byte signed integer with values from `-128` to `127`.
@@ -451,9 +508,6 @@ TINYINT
{% highlight java %}
DataTypes.TINYINT()
{% endhighlight %}
-</div>
-
-</div>
**Bridging to JVM Types**
@@ -462,6 +516,15 @@ DataTypes.TINYINT()
|`java.lang.Byte` | X | X | *Default* |
|`byte` | X | (X) | Output only if type is not nullable. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.TINYINT()
+{% endhighlight %}
+</div>
+</div>
+
#### `SMALLINT`
Data type of a 2-byte signed integer with values from `-32,768` to `32,767`.
@@ -480,9 +543,6 @@ SMALLINT
{% highlight java %}
DataTypes.SMALLINT()
{% endhighlight %}
-</div>
-
-</div>
**Bridging to JVM Types**
@@ -491,6 +551,15 @@ DataTypes.SMALLINT()
|`java.lang.Short` | X | X | *Default* |
|`short` | X | (X) | Output only if type is not nullable. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.SMALLINT()
+{% endhighlight %}
+</div>
+</div>
+
#### `INT`
Data type of a 4-byte signed integer with values from `-2,147,483,648` to `2,147,483,647`.
@@ -511,11 +580,6 @@ INTEGER
{% highlight java %}
DataTypes.INT()
{% endhighlight %}
-</div>
-
-</div>
-
-`INTEGER` is a synonym for this type.
**Bridging to JVM Types**
@@ -524,6 +588,17 @@ DataTypes.INT()
|`java.lang.Integer` | X | X | *Default* |
|`int` | X | (X) | Output only if type is not nullable. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.INT()
+{% endhighlight %}
+</div>
+</div>
+
+`INTEGER` is a synonym for this type.
+
#### `BIGINT`
Data type of an 8-byte signed integer with values from `-9,223,372,036,854,775,808` to
@@ -543,9 +618,6 @@ BIGINT
{% highlight java %}
DataTypes.BIGINT()
{% endhighlight %}
-</div>
-
-</div>
**Bridging to JVM Types**
@@ -554,6 +626,15 @@ DataTypes.BIGINT()
|`java.lang.Long` | X | X | *Default* |
|`long` | X | (X) | Output only if type is not nullable. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.BIGINT()
+{% endhighlight %}
+</div>
+</div>
+
### Approximate Numerics
#### `FLOAT`
@@ -576,9 +657,6 @@ FLOAT
{% highlight java %}
DataTypes.FLOAT()
{% endhighlight %}
-</div>
-
-</div>
**Bridging to JVM Types**
@@ -587,6 +665,15 @@ DataTypes.FLOAT()
|`java.lang.Float` | X | X | *Default* |
|`float` | X | (X) | Output only if type is not nullable. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.FLOAT()
+{% endhighlight %}
+</div>
+</div>
+
#### `DOUBLE`
Data type of an 8-byte double precision floating point number.
@@ -607,11 +694,6 @@ DOUBLE PRECISION
{% highlight java %}
DataTypes.DOUBLE()
{% endhighlight %}
-</div>
-
-</div>
-
-`DOUBLE PRECISION` is a synonym for this type.
**Bridging to JVM Types**
@@ -620,6 +702,17 @@ DataTypes.DOUBLE()
|`java.lang.Double` | X | X | *Default* |
|`double` | X | (X) | Output only if type is not nullable. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.DOUBLE()
+{% endhighlight %}
+</div>
+</div>
+
+`DOUBLE PRECISION` is a synonym for this type.
+
### Date and Time
#### `DATE`
@@ -643,9 +736,6 @@ DATE
{% highlight java %}
DataTypes.DATE()
{% endhighlight %}
-</div>
-
-</div>
**Bridging to JVM Types**
@@ -656,6 +746,15 @@ DataTypes.DATE()
|`java.lang.Integer` | X | X | Describes the number of days since epoch. |
|`int` | X | (X) | Describes the number of days since epoch.<br>Output only if type is not nullable. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.DATE()
+{% endhighlight %}
+</div>
+</div>
+
#### `TIME`
Data type of a time *without* time zone consisting of `hour:minute:second[.fractional]` with
@@ -680,13 +779,6 @@ TIME(p)
{% highlight java %}
DataTypes.TIME(p)
{% endhighlight %}
-</div>
-
-</div>
-
-The type can be declared using `TIME(p)` where `p` is the number of digits of fractional
-seconds (*precision*). `p` must have a value between `0` and `9` (both inclusive). If no
-precision is specified, `p` is equal to `0`.
**Bridging to JVM Types**
@@ -699,6 +791,21 @@ precision is specified, `p` is equal to `0`.
|`java.lang.Long` | X | X | Describes the number of nanoseconds of the day. |
|`long` | X | (X) | Describes the number of nanoseconds of the day.<br>Output only if type is not nullable. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.TIME(p)
+{% endhighlight %}
+
+<span class="label label-danger">Attention</span> The `precision` specified in `DataTypes.TIME(p)` currently must be `0`.
+</div>
+</div>
+
+The type can be declared using `TIME(p)` where `p` is the number of digits of fractional
+seconds (*precision*). `p` must have a value between `0` and `9` (both inclusive). If no
+precision is specified, `p` is equal to `0`.
+
#### `TIMESTAMP`
Data type of a timestamp *without* time zone consisting of `year-month-day hour:minute:second[.fractional]`
@@ -730,8 +837,24 @@ TIMESTAMP(p) WITHOUT TIME ZONE
{% highlight java %}
DataTypes.TIMESTAMP(p)
{% endhighlight %}
+
+**Bridging to JVM Types**
+
+| Java Type | Input | Output | Remarks |
+|:-------------------------------------------|:-----:|:------:|:-------------------------|
+|`java.time.LocalDateTime` | X | X | *Default* |
+|`java.sql.Timestamp` | X | X | |
+|`org.apache.flink.table.data.TimestampData` | X | X | Internal data structure. |
+
</div>
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.TIMESTAMP(p)
+{% endhighlight %}
+
+<span class="label label-danger">Attention</span> The `precision` specified in `DataTypes.TIMESTAMP(p)` currently must be `3`.
+</div>
</div>
The type can be declared using `TIMESTAMP(p)` where `p` is the number of digits of fractional
@@ -740,14 +863,6 @@ is specified, `p` is equal to `6`.
`TIMESTAMP(p) WITHOUT TIME ZONE` is a synonym for this type.
-**Bridging to JVM Types**
-
-| Java Type | Input | Output | Remarks |
-|:-------------------------------------------|:-----:|:------:|:-------------------------|
-|`java.time.LocalDateTime` | X | X | *Default* |
-|`java.sql.Timestamp` | X | X | |
-|`org.apache.flink.table.data.TimestampData` | X | X | Internal data structure. |
-
#### `TIMESTAMP WITH TIME ZONE`
Data type of a timestamp *with* time zone consisting of `year-month-day hour:minute:second[.fractional] zone`
@@ -776,13 +891,6 @@ TIMESTAMP(p) WITH TIME ZONE
{% highlight java %}
DataTypes.TIMESTAMP_WITH_TIME_ZONE(p)
{% endhighlight %}
-</div>
-
-</div>
-
-The type can be declared using `TIMESTAMP(p) WITH TIME ZONE` where `p` is the number of digits of
-fractional seconds (*precision*). `p` must have a value between `0` and `9` (both inclusive). If no
-precision is specified, `p` is equal to `6`.
**Bridging to JVM Types**
@@ -791,6 +899,19 @@ precision is specified, `p` is equal to `6`.
|`java.time.OffsetDateTime` | X | X | *Default* |
|`java.time.ZonedDateTime` | X | | Ignores the zone ID. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+Not supported.
+{% endhighlight %}
+</div>
+</div>
+
+The type can be declared using `TIMESTAMP(p) WITH TIME ZONE` where `p` is the number of digits of
+fractional seconds (*precision*). `p` must have a value between `0` and `9` (both inclusive). If no
+precision is specified, `p` is equal to `6`.
+
#### `TIMESTAMP WITH LOCAL TIME ZONE`
Data type of a timestamp *with local* time zone consisting of `year-month-day hour:minute:second[.fractional] zone`
@@ -822,13 +943,6 @@ TIMESTAMP(p) WITH LOCAL TIME ZONE
{% highlight java %}
DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(p)
{% endhighlight %}
-</div>
-
-</div>
-
-The type can be declared using `TIMESTAMP(p) WITH LOCAL TIME ZONE` where `p` is the number
-of digits of fractional seconds (*precision*). `p` must have a value between `0` and `9`
-(both inclusive). If no precision is specified, `p` is equal to `6`.
**Bridging to JVM Types**
@@ -841,6 +955,21 @@ of digits of fractional seconds (*precision*). `p` must have a value between `0`
|`long` | X | (X) | Describes the number of milliseconds since epoch.<br>Output only if type is not nullable. |
|`org.apache.flink.table.data.TimestampData` | X | X | Internal data structure. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(p)
+{% endhighlight %}
+
+<span class="label label-danger">Attention</span> The `precision` specified in `DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(p)` currently must be `3`.
+</div>
+</div>
+
+The type can be declared using `TIMESTAMP(p) WITH LOCAL TIME ZONE` where `p` is the number
+of digits of fractional seconds (*precision*). `p` must have a value between `0` and `9`
+(both inclusive). If no precision is specified, `p` is equal to `6`.
+
#### `INTERVAL YEAR TO MONTH`
Data type for a group of year-month interval types.
@@ -877,13 +1006,6 @@ DataTypes.INTERVAL(DataTypes.YEAR(p))
DataTypes.INTERVAL(DataTypes.YEAR(p), DataTypes.MONTH())
DataTypes.INTERVAL(DataTypes.MONTH())
{% endhighlight %}
-</div>
-
-</div>
-
-The type can be declared using the above combinations where `p` is the number of digits of years
-(*year precision*). `p` must have a value between `1` and `4` (both inclusive). If no year precision
-is specified, `p` is equal to `2`.
**Bridging to JVM Types**
@@ -893,7 +1015,23 @@ is specified, `p` is equal to `2`.
|`java.lang.Integer` | X | X | Describes the number of months. |
|`int` | X | (X) | Describes the number of months.<br>Output only if type is not nullable. |
-#### `INTERVAL DAY TO MONTH`
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.INTERVAL(DataTypes.YEAR())
+DataTypes.INTERVAL(DataTypes.YEAR(p))
+DataTypes.INTERVAL(DataTypes.YEAR(p), DataTypes.MONTH())
+DataTypes.INTERVAL(DataTypes.MONTH())
+{% endhighlight %}
+</div>
+</div>
+
+The type can be declared using the above combinations where `p` is the number of digits of years
+(*year precision*). `p` must have a value between `1` and `4` (both inclusive). If no year precision
+is specified, `p` is equal to `2`.
+
+#### `INTERVAL DAY TO SECOND`
Data type for a group of day-time interval types.
@@ -950,8 +1088,33 @@ DataTypes.INTERVAL(DataTypes.MINUTE(), DataTypes.SECOND(p2))
DataTypes.INTERVAL(DataTypes.SECOND())
DataTypes.INTERVAL(DataTypes.SECOND(p2))
{% endhighlight %}
+
+**Bridging to JVM Types**
+
+| Java Type | Input | Output | Remarks |
+|:--------------------|:-----:|:------:|:--------------------------------------|
+|`java.time.Duration` | X | X | *Default* |
+|`java.lang.Long` | X | X | Describes the number of milliseconds. |
+|`long` | X | (X) | Describes the number of milliseconds.<br>Output only if type is not nullable. |
+
</div>
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.INTERVAL(DataTypes.DAY())
+DataTypes.INTERVAL(DataTypes.DAY(p1))
+DataTypes.INTERVAL(DataTypes.DAY(p1), DataTypes.HOUR())
+DataTypes.INTERVAL(DataTypes.DAY(p1), DataTypes.MINUTE())
+DataTypes.INTERVAL(DataTypes.DAY(p1), DataTypes.SECOND(p2))
+DataTypes.INTERVAL(DataTypes.HOUR())
+DataTypes.INTERVAL(DataTypes.HOUR(), DataTypes.MINUTE())
+DataTypes.INTERVAL(DataTypes.HOUR(), DataTypes.SECOND(p2))
+DataTypes.INTERVAL(DataTypes.MINUTE())
+DataTypes.INTERVAL(DataTypes.MINUTE(), DataTypes.SECOND(p2))
+DataTypes.INTERVAL(DataTypes.SECOND())
+DataTypes.INTERVAL(DataTypes.SECOND(p2))
+{% endhighlight %}
+</div>
</div>
The type can be declared using the above combinations where `p1` is the number of digits of days
@@ -960,14 +1123,6 @@ The type can be declared using the above combinations where `p1` is the number o
and `9` (both inclusive). If no `p1` is specified, it is equal to `2` by default. If no `p2` is
specified, it is equal to `6` by default.
-**Bridging to JVM Types**
-
-| Java Type | Input | Output | Remarks |
-|:--------------------|:-----:|:------:|:--------------------------------------|
-|`java.time.Duration` | X | X | *Default* |
-|`java.lang.Long` | X | X | Describes the number of milliseconds. |
-|`long` | X | (X) | Describes the number of milliseconds.<br>Output only if type is not nullable. |
-
### Constructed Data Types
#### `ARRAY`
@@ -992,8 +1147,21 @@ t ARRAY
{% highlight java %}
DataTypes.ARRAY(t)
{% endhighlight %}
+
+**Bridging to JVM Types**
+
+| Java Type | Input | Output | Remarks |
+|:---------------------------------------|:-----:|:------:|:----------------------------------|
+|*t*`[]` | (X) | (X) | Depends on the subtype. *Default* |
+|`org.apache.flink.table.data.ArrayData` | X | X | Internal data structure. |
+
</div>
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.ARRAY(t)
+{% endhighlight %}
+</div>
</div>
The type can be declared using `ARRAY<t>` where `t` is the data type of the contained
@@ -1002,13 +1170,6 @@ elements.
`t ARRAY` is a synonym for being closer to the SQL standard. For example, `INT ARRAY` is
equivalent to `ARRAY<INT>`.
-**Bridging to JVM Types**
-
-| Java Type | Input | Output | Remarks |
-|:---------------------------------------|:-----:|:------:|:----------------------------------|
-|*t*`[]` | (X) | (X) | Depends on the subtype. *Default* |
-|`org.apache.flink.table.data.ArrayData` | X | X | Internal data structure. |
-
#### `MAP`
Data type of an associative array that maps keys (including `NULL`) to values (including `NULL`). A map
@@ -1032,12 +1193,6 @@ MAP<kt, vt>
{% highlight java %}
DataTypes.MAP(kt, vt)
{% endhighlight %}
-</div>
-
-</div>
-
-The type can be declared using `MAP<kt, vt>` where `kt` is the data type of the key elements
-and `vt` is the data type of the value elements.
**Bridging to JVM Types**
@@ -1047,6 +1202,18 @@ and `vt` is the data type of the value elements.
| *subclass* of `java.util.Map<kt, vt>` | X | | |
|`org.apache.flink.table.data.MapData` | X | X | Internal data structure. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.MAP(kt, vt)
+{% endhighlight %}
+</div>
+</div>
+
+The type can be declared using `MAP<kt, vt>` where `kt` is the data type of the key elements
+and `vt` is the data type of the value elements.
+
#### `MULTISET`
Data type of a multiset (=bag). Unlike a set, it allows for multiple instances for each of its
@@ -1069,8 +1236,22 @@ t MULTISET
{% highlight java %}
DataTypes.MULTISET(t)
{% endhighlight %}
+
+**Bridging to JVM Types**
+
+| Java Type | Input | Output | Remarks |
+|:--------------------------------------|:-----:|:------:|:---------------------------------------------------------|
+|`java.util.Map<t, java.lang.Integer>` | X | X | Assigns each value to an integer multiplicity. *Default* |
+| *subclass* of `java.util.Map<t, java.lang.Integer>` | X | | |
+|`org.apache.flink.table.data.MapData` | X | X | Internal data structure. |
+
</div>
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.MULTISET(t)
+{% endhighlight %}
+</div>
</div>
The type can be declared using `MULTISET<t>` where `t` is the data type
@@ -1079,14 +1260,6 @@ of the contained elements.
`t MULTISET` is a synonym for being closer to the SQL standard. For example, `INT MULTISET` is
equivalent to `MULTISET<INT>`.
-**Bridging to JVM Types**
-
-| Java Type | Input | Output | Remarks |
-|:--------------------------------------|:-----:|:------:|:---------------------------------------------------------|
-|`java.util.Map<t, java.lang.Integer>` | X | X | Assigns each value to an integer multiplicity. *Default* |
-| *subclass* of `java.util.Map<t, java.lang.Integer>>` | X | | |
-|`org.apache.flink.table.data.MapData` | X | X | Internal data structure. |
-
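The default JVM bridging for `MULTISET` is a map that assigns each value an integer multiplicity. The same idea can be sketched in plain Python with `collections.Counter`; `to_multiset` is a hypothetical helper used only for illustration and is not a PyFlink function.

```python
from collections import Counter


def to_multiset(values):
    """Map each distinct value (None included) to how often it occurs.

    Hypothetical helper: illustrates the value -> multiplicity mapping
    described for MULTISET, not how Flink stores the data internally.
    """
    return dict(Counter(values))


# "a" occurs three times; None stands in for a NULL element.
bag = to_multiset(["a", "b", "a", None, "a"])
print(bag)
```

Unlike a set, the structure keeps the count of repeated elements, which is exactly the information a multiset adds.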
#### `ROW`
Data type of a sequence of fields.
@@ -1119,8 +1292,22 @@ ROW(n0 t0 'd0', n1 t1 'd1', ...)
DataTypes.ROW(DataTypes.FIELD(n0, t0), DataTypes.FIELD(n1, t1), ...)
DataTypes.ROW(DataTypes.FIELD(n0, t0, d0), DataTypes.FIELD(n1, t1, d1), ...)
{% endhighlight %}
+
+**Bridging to JVM Types**
+
+| Java Type | Input | Output | Remarks |
+|:-------------------------------------|:-----:|:------:|:-------------------------|
+|`org.apache.flink.types.Row` | X | X | *Default* |
+|`org.apache.flink.table.data.RowData` | X | X | Internal data structure. |
+
</div>
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.ROW([DataTypes.FIELD(n0, t0), DataTypes.FIELD(n1, t1), ...])
+DataTypes.ROW([DataTypes.FIELD(n0, t0, d0), DataTypes.FIELD(n1, t1, d1), ...])
+{% endhighlight %}
+</div>
</div>
The type can be declared using `ROW<n0 t0 'd0', n1 t1 'd1', ...>` where `n` is the unique name of
@@ -1129,13 +1316,6 @@ a field, `t` is the logical type of a field, `d` is the description of a field.
`ROW(...)` is a synonym for being closer to the SQL standard. For example, `ROW(myField INT, myOtherField BOOLEAN)` is
equivalent to `ROW<myField INT, myOtherField BOOLEAN>`.
-**Bridging to JVM Types**
-
-| Java Type | Input | Output | Remarks |
-|:-------------------------------------|:-----:|:------:|:-------------------------|
-|`org.apache.flink.types.Row` | X | X | *Default* |
-|`org.apache.flink.table.data.RowData` | X | X | Internal data structure. |
-
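To make the `ROW<n0 t0 'd0', ...>` notation concrete, the sketch below builds that type string from a list of fields in plain Python. `Field` and `row_type_string` are hypothetical names introduced for this illustration; they model the notation only and are not the PyFlink `DataTypes.ROW` / `DataTypes.FIELD` implementation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Field:
    """One ROW field: unique name, logical type, optional description."""
    name: str
    type_str: str
    description: Optional[str] = None


def row_type_string(fields: Sequence[Field]) -> str:
    """Render fields in the ROW<n0 t0 'd0', n1 t1 'd1', ...> notation."""
    names = [f.name for f in fields]
    if len(set(names)) != len(names):
        # The documentation requires field names to be unique.
        raise ValueError("field names must be unique")
    parts = []
    for f in fields:
        part = f"{f.name} {f.type_str}"
        if f.description is not None:
            part += f" '{f.description}'"
        parts.append(part)
    return "ROW<" + ", ".join(parts) + ">"


print(row_type_string([Field("myField", "INT"), Field("myOtherField", "BOOLEAN")]))
```

The description is optional per field, which matches the two declaration forms shown in the tabs: one with `FIELD(n, t)` and one with `FIELD(n, t, d)`.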
### User-Defined Data Types
<span class="label label-danger">Attention</span> User-defined data types are not fully supported yet. They are
@@ -1205,6 +1385,15 @@ class User {
DataTypes.of(User.class);
{% endhighlight %}
+
+**Bridging to JVM Types**
+
+| Java Type | Input | Output | Remarks |
+|:-------------------------------------|:-----:|:------:|:----------------------------------------|
+|*class* | X | X | Originating class or subclasses (for input) or <br>superclasses (for output). *Default* |
+|`org.apache.flink.types.Row` | X | X | Represent the structured type as a row. |
+|`org.apache.flink.table.data.RowData` | X | X | Internal data structure. |
+
</div>
<div data-lang="Scala" markdown="1">
@@ -1224,9 +1413,6 @@ case class User(
DataTypes.of(classOf[User])
{% endhighlight %}
-</div>
-
-</div>
**Bridging to JVM Types**
@@ -1236,6 +1422,15 @@ DataTypes.of(classOf[User])
|`org.apache.flink.types.Row` | X | X | Represent the structured type as a row. |
|`org.apache.flink.table.data.RowData` | X | X | Internal data structure. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+Not supported.
+{% endhighlight %}
+</div>
+</div>
+
### Other Data Types
#### `BOOLEAN`
@@ -1256,9 +1451,6 @@ BOOLEAN
{% highlight java %}
DataTypes.BOOLEAN()
{% endhighlight %}
-</div>
-
-</div>
**Bridging to JVM Types**
@@ -1267,6 +1459,15 @@ DataTypes.BOOLEAN()
|`java.lang.Boolean` | X | X | *Default* |
|`boolean` | X | (X) | Output only if type is not nullable. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.BOOLEAN()
+{% endhighlight %}
+</div>
+</div>
+
#### `RAW`
Data type of an arbitrary serialized type. This type is a black box within the table ecosystem
@@ -1290,8 +1491,22 @@ DataTypes.RAW(class, serializer)
DataTypes.RAW(class)
{% endhighlight %}
+
+**Bridging to JVM Types**
+
+| Java Type | Input | Output | Remarks |
+|:------------------|:-----:|:------:|:-------------------------------------------|
+|*class* | X | X | Originating class or subclasses (for input) or <br>superclasses (for output). *Default* |
+|`byte[]` | | X | |
+|`org.apache.flink.table.data.RawValueData` | X | X | Internal data structure. |
+
</div>
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+Not supported.
+{% endhighlight %}
+</div>
</div>
The type can be declared using `RAW('class', 'snapshot')` where `class` is the originating class and
@@ -1301,14 +1516,6 @@ declared directly but is generated while persisting the type.
In the API, the `RAW` type can be declared either by directly supplying a `Class` + `TypeSerializer` or
by passing `Class` and letting the framework extract `Class` + `TypeSerializer` from there.
-**Bridging to JVM Types**
-
-| Java Type | Input | Output | Remarks |
-|:------------------|:-----:|:------:|:-------------------------------------------|
-|*class* | X | X | Originating class or subclasses (for input) or <br>superclasses (for output). *Default* |
-|`byte[]` | | X | |
-|`org.apache.flink.table.data.RawValueData` | X | X | Internal data structure. |
-
#### `NULL`
Data type for representing untyped `NULL` values.
@@ -1335,9 +1542,6 @@ NULL
{% highlight java %}
DataTypes.NULL()
{% endhighlight %}
-</div>
-
-</div>
**Bridging to JVM Types**
@@ -1346,6 +1550,15 @@ DataTypes.NULL()
|`java.lang.Object` | X | X | *Default* |
|*any class* | | (X) | Any non-primitive type. |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+Not supported.
+{% endhighlight %}
+</div>
+</div>
+
Data Type Extraction
--------------------
diff --git a/docs/dev/table/types.zh.md b/docs/dev/table/types.zh.md
index 819abc8..92fb8b0 100644
--- a/docs/dev/table/types.zh.md
+++ b/docs/dev/table/types.zh.md
@@ -54,19 +54,22 @@ Flink 的数据类型和 SQL 标准的 *数据类型* 术语类似,但也包
### Table API 的数据类型
-JVM API 的用户可以在 Table API 中使用 `org.apache.flink.table.types.DataType` 的实例,以及定义连接器(Connector)、Catalog 或者用户自定义函数(User-Defined Function)。
+JVM API 的用户可以在 Table API 中、定义连接器(Connector)、Catalog 或者用户自定义函数(User-Defined Function)中使用
+`org.apache.flink.table.types.DataType` 的实例。Python API的用户可以在 Python Table API中、Python 用户自定义函数
+(Python User-Defined Function)中使用`pyflink.table.types.DataType` 的实例。
一个 `DataType` 实例有两个作用:
-- **逻辑类型的声明**,它不表达具体物理类型的存储和转换,但是定义了基于 JVM 的语言和 Table 编程环境之间的边界。
-- *可选的:* **向 Planner 提供有关数据的物理表示的提示**,这对于边界 API 很有用。
+- **逻辑类型的声明**,它不表达具体物理类型的存储和转换,但是定义了基于 JVM 的语言或者 Python 语言和 Table 编程环境之间的边界。
+- *可选的:* **向 Planner 提供有关数据的物理表示的提示**,这对于边界 API 很有用。当前只支持在Java/Scala Table API中使用,Python Table API中尚不支持该功能。
对于基于 JVM 的语言,所有预定义的数据类型都在 `org.apache.flink.table.api.DataTypes` 里提供。
-
-建议使用星号将全部的 API 导入到 Table 程序中以便于使用:
+对于 Python 的语言,所有预定义的数据类型都在 `pyflink.table.types.DataTypes` 里提供。
<div class="codetabs" markdown="1">
<div data-lang="Java" markdown="1">
+建议使用星号将全部的 API 导入到 Table 程序中以便于使用:
+
{% highlight java %}
import static org.apache.flink.table.api.DataTypes.*;
@@ -75,6 +78,8 @@ DataType t = INTERVAL(DAY(), SECOND(3));
</div>
<div data-lang="Scala" markdown="1">
+建议使用星号将全部的 API 导入到 Table 程序中以便于使用:
+
{% highlight scala %}
import org.apache.flink.table.api.DataTypes._
@@ -82,6 +87,14 @@ val t: DataType = INTERVAL(DAY(), SECOND(3));
{% endhighlight %}
</div>
+<div data-lang="Python" markdown="1">
+
+{% highlight python %}
+from pyflink.table.types import DataTypes
+
+t = DataTypes.INTERVAL(DataTypes.DAY(), DataTypes.SECOND(3))
+{% endhighlight %}
+
</div>
#### 物理提示
@@ -123,6 +136,8 @@ val t: DataType = DataTypes.ARRAY(DataTypes.INT().notNull()).bridgedTo(classOf[A
<span class="label label-danger">注意</span> 请注意,通常只有在扩展 API 时才需要物理提示。
预定义的 Source、Sink、Function 的用户不需要定义这样的提示。在 Table 编程中(例如 `field.cast(TIMESTAMP(3).bridgedTo(Timestamp.class))`)这些提示将被忽略。
+<span class="label label-danger">注意</span> 请注意,物理提示当前在Python Table API中尚不支持。
+
Planner 兼容性
---------------------
@@ -205,6 +220,7 @@ Flink 1.9 之前引入的旧的 Planner 主要支持类型信息(Type Informat
------------------
本节列出了所有预定义的数据类型。对于基于 JVM 的 Table API,这些类型也可以从 `org.apache.flink.table.api.DataTypes` 中找到。
+对于Python Table API, 这些类型可以从 `pyflink.table.types.DataTypes` 中找到。
### 字符串
@@ -227,11 +243,6 @@ CHAR(n)
{% highlight java %}
DataTypes.CHAR(n)
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `CHAR(n)` 声明,其中 `n` 表示字符数量。`n` 的值必须在 `1` 和 `2,147,483,647` 之间(含边界值)。如果未指定长度,`n` 等于 `1`。
**JVM 类型**
@@ -241,6 +252,17 @@ DataTypes.CHAR(n)
|`byte[]` | X | X | 假设使用 UTF-8 编码。 |
|`org.apache.flink.table.data.StringData` | X | X | 内部数据结构。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+尚不支持
+{% endhighlight %}
+</div>
+</div>
+
+此类型用 `CHAR(n)` 声明,其中 `n` 表示字符数量。`n` 的值必须在 `1` 和 `2,147,483,647` 之间(含边界值)。如果未指定长度,`n` 等于 `1`。
+
#### `VARCHAR` / `STRING`
可变长度字符串的数据类型。
@@ -264,13 +286,6 @@ DataTypes.VARCHAR(n)
DataTypes.STRING()
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `VARCHAR(n)` 声明,其中 `n` 表示最大的字符数量。`n` 的值必须在 `1` 和 `2,147,483,647` 之间(含边界值)。如果未指定长度,`n` 等于 `1`。
-
-`STRING` 等价于 `VARCHAR(2147483647)`.
**JVM 类型**
@@ -280,6 +295,23 @@ DataTypes.STRING()
|`byte[]` | X | X | 假设使用 UTF-8 编码。 |
|`org.apache.flink.table.data.StringData` | X | X | 内部数据结构。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.VARCHAR(n)
+
+DataTypes.STRING()
+{% endhighlight %}
+
+<span class="label label-danger">注意</span> 当前,声明`DataTypes.VARCHAR(n)`中的所指定的最大的字符数量 `n` 必须为 `2,147,483,647`。
+</div>
+</div>
+
+此类型用 `VARCHAR(n)` 声明,其中 `n` 表示最大的字符数量。`n` 的值必须在 `1` 和 `2,147,483,647` 之间(含边界值)。如果未指定长度,`n` 等于 `1`。
+
+`STRING` 等价于 `VARCHAR(2147483647)`。
+
### 二进制字符串
#### `BINARY`
@@ -301,11 +333,6 @@ BINARY(n)
{% highlight java %}
DataTypes.BINARY(n)
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `BINARY(n)` 声明,其中 `n` 是字节数量。`n` 的值必须在 `1` 和 `2,147,483,647` 之间(含边界值)。如果未指定长度,`n` 等于 `1`。
**JVM 类型**
@@ -313,6 +340,17 @@ DataTypes.BINARY(n)
|:-------------------|:-----:|:------:|:------------------------|
|`byte[]` | X | X | *缺省* |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+尚不支持
+{% endhighlight %}
+</div>
+</div>
+
+此类型用 `BINARY(n)` 声明,其中 `n` 是字节数量。`n` 的值必须在 `1` 和 `2,147,483,647` 之间(含边界值)。如果未指定长度,`n` 等于 `1`。
+
#### `VARBINARY` / `BYTES`
可变长度二进制字符串的数据类型(=字节序列)。
@@ -336,20 +374,30 @@ DataTypes.VARBINARY(n)
DataTypes.BYTES()
{% endhighlight %}
+
+**JVM 类型**
+
+| Java 类型 | 输入 | 输出 | 备注 |
+|:-------------------|:-----:|:------:|:------------------------|
+|`byte[]` | X | X | *缺省* |
+
</div>
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.VARBINARY(n)
+
+DataTypes.BYTES()
+{% endhighlight %}
+
+<span class="label label-danger">注意</span> 当前,声明`DataTypes.VARBINARY(n)`中的所指定的最大的字节数量 `n` 必须为 `2,147,483,647`。
+</div>
</div>
此类型用 `VARBINARY(n)` 声明,其中 `n` 是最大的字节数量。`n` 的值必须在 `1` 和 `2,147,483,647` 之间(含边界值)。如果未指定长度,`n` 等于 `1`。
`BYTES` 等价于 `VARBINARY(2147483647)`。
-**JVM 类型**
-
-| Java 类型 | 输入 | 输出 | 备注 |
-|:-------------------|:-----:|:------:|:------------------------|
-|`byte[]` | X | X | *缺省* |
-
### 精确数值
#### `DECIMAL`
@@ -380,13 +428,6 @@ NUMERIC(p, s)
{% highlight java %}
DataTypes.DECIMAL(p, s)
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `DECIMAL(p, s)` 声明,其中 `p` 是数字的位数(*精度*),`s` 是数字中小数点右边的位数(*尾数*)。`p` 的值必须介于 `1` 和 `38` 之间(含边界值)。`s` 的值必须介于 `0` 和 `p` 之间(含边界值)。其中 `p` 的缺省值是 `10`,`s` 的缺省值是 `0`。
-
-`NUMERIC(p, s)` 和 `DEC(p, s)` 都等价于这个类型。
**JVM 类型**
@@ -395,6 +436,21 @@ DataTypes.DECIMAL(p, s)
|`java.math.BigDecimal` | X | X | *缺省* |
|`org.apache.flink.table.data.DecimalData` | X | X | 内部数据结构。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.DECIMAL(p, s)
+{% endhighlight %}
+
+<span class="label label-danger">注意</span> 当前,声明`DataTypes.DECIMAL(p, s)`中的所指定的精度 `p` 必须为`38`,尾数 `s` 必须为 `18`。
+</div>
+</div>
+
+此类型用 `DECIMAL(p, s)` 声明,其中 `p` 是数字的位数(*精度*),`s` 是数字中小数点右边的位数(*尾数*)。`p` 的值必须介于 `1` 和 `38` 之间(含边界值)。`s` 的值必须介于 `0` 和 `p` 之间(含边界值)。其中 `p` 的缺省值是 `10`,`s` 的缺省值是 `0`。
+
+`NUMERIC(p, s)` 和 `DEC(p, s)` 都等价于这个类型。
+
#### `TINYINT`
1 字节有符号整数的数据类型,其值从 `-128` 到 `127`。
@@ -413,9 +469,6 @@ TINYINT
{% highlight java %}
DataTypes.TINYINT()
{% endhighlight %}
-</div>
-
-</div>
**JVM 类型**
@@ -424,6 +477,15 @@ DataTypes.TINYINT()
|`java.lang.Byte` | X | X | *缺省* |
|`byte` | X | (X) | 仅当类型不可为空时才输出。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.TINYINT()
+{% endhighlight %}
+</div>
+</div>
+
#### `SMALLINT`
2 字节有符号整数的数据类型,其值从 `-32,768` 到 `32,767`。
@@ -442,9 +504,6 @@ SMALLINT
{% highlight java %}
DataTypes.SMALLINT()
{% endhighlight %}
-</div>
-
-</div>
**JVM 类型**
@@ -453,6 +512,15 @@ DataTypes.SMALLINT()
|`java.lang.Short` | X | X | *缺省* |
|`short` | X | (X) | 仅当类型不可为空时才输出。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.SMALLINT()
+{% endhighlight %}
+</div>
+</div>
+
#### `INT`
4 字节有符号整数的数据类型,其值从 `-2,147,483,648` 到 `2,147,483,647`。
@@ -473,11 +541,6 @@ INTEGER
{% highlight java %}
DataTypes.INT()
{% endhighlight %}
-</div>
-
-</div>
-
-`INTEGER` 等价于此类型。
**JVM 类型**
@@ -486,6 +549,17 @@ DataTypes.INT()
|`java.lang.Integer` | X | X | *缺省* |
|`int` | X | (X) | 仅当类型不可为空时才输出。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.INT()
+{% endhighlight %}
+</div>
+</div>
+
+`INTEGER` 等价于此类型。
+
#### `BIGINT`
8 字节有符号整数的数据类型,其值从 `-9,223,372,036,854,775,808` 到 `9,223,372,036,854,775,807`。
@@ -504,9 +578,6 @@ BIGINT
{% highlight java %}
DataTypes.BIGINT()
{% endhighlight %}
-</div>
-
-</div>
**JVM 类型**
@@ -515,6 +586,15 @@ DataTypes.BIGINT()
|`java.lang.Long` | X | X | *缺省* |
|`long` | X | (X) | 仅当类型不可为空时才输出。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.BIGINT()
+{% endhighlight %}
+</div>
+</div>
+
### 近似数值
#### `FLOAT`
@@ -537,9 +617,6 @@ FLOAT
{% highlight java %}
DataTypes.FLOAT()
{% endhighlight %}
-</div>
-
-</div>
**JVM 类型**
@@ -548,6 +625,15 @@ DataTypes.FLOAT()
|`java.lang.Float` | X | X | *缺省* |
|`float` | X | (X) | 仅当类型不可为空时才输出。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.FLOAT()
+{% endhighlight %}
+</div>
+</div>
+
#### `DOUBLE`
8 字节双精度浮点数的数据类型。
@@ -568,11 +654,6 @@ DOUBLE PRECISION
{% highlight java %}
DataTypes.DOUBLE()
{% endhighlight %}
-</div>
-
-</div>
-
-`DOUBLE PRECISION` 等价于此类型。
**JVM 类型**
@@ -581,6 +662,17 @@ DataTypes.DOUBLE()
|`java.lang.Double` | X | X | *缺省* |
|`double` | X | (X) | 仅当类型不可为空时才输出。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.DOUBLE()
+{% endhighlight %}
+</div>
+</div>
+
+`DOUBLE PRECISION` 等价于此类型。
+
### 日期和时间
#### `DATE`
@@ -603,9 +695,6 @@ DATE
{% highlight java %}
DataTypes.DATE()
{% endhighlight %}
-</div>
-
-</div>
**JVM 类型**
@@ -616,6 +705,15 @@ DataTypes.DATE()
|`java.lang.Integer` | X | X | 描述从 Epoch 算起的天数。 |
|`int` | X | (X) | 描述从 Epoch 算起的天数。<br>仅当类型不可为空时才输出。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.DATE()
+{% endhighlight %}
+</div>
+</div>
+
#### `TIME`
*不带*时区的时间数据类型,由 `hour:minute:second[.fractional]` 组成,精度达到纳秒,范围从 `00:00:00.000000000` 到 `23:59:59.999999999`。
@@ -637,11 +735,6 @@ TIME(p)
{% highlight java %}
DataTypes.TIME(p)
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `TIME(p)` 声明,其中 `p` 是秒的小数部分的位数(*精度*)。`p` 的值必须介于 `0` 和 `9` 之间(含边界值)。如果未指定精度,则 `p` 等于 `0`。
**JVM 类型**
@@ -654,6 +747,19 @@ DataTypes.TIME(p)
|`java.lang.Long` | X | X | 描述自当天以来的纳秒数。 |
|`long` | X | (X) | 描述自当天以来的纳秒数。<br>仅当类型不可为空时才输出。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.TIME(p)
+{% endhighlight %}
+
+<span class="label label-danger">注意</span> 当前,声明`DataTypes.TIME(p)`中的所指定的精度 `p` 必须为`0`。
+</div>
+</div>
+
+此类型用 `TIME(p)` 声明,其中 `p` 是秒的小数部分的位数(*精度*)。`p` 的值必须介于 `0` 和 `9` 之间(含边界值)。如果未指定精度,则 `p` 等于 `0`。
+
#### `TIMESTAMP`
*不带*时区的时间戳数据类型,由 `year-month-day hour:minute:second[.fractional]` 组成,精度达到纳秒,范围从 `0000-01-01 00:00:00.000000000` 到 `9999-12-31 23:59:59.999999999`。
@@ -680,13 +786,6 @@ TIMESTAMP(p) WITHOUT TIME ZONE
{% highlight java %}
DataTypes.TIMESTAMP(p)
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `TIMESTAMP(p)` 声明,其中 `p` 是秒的小数部分的位数(*精度*)。`p` 的值必须介于 `0` 和 `9` 之间(含边界值)。如果未指定精度,则 `p` 等于 `6`。
-
-`TIMESTAMP(p) WITHOUT TIME ZONE` 等价于此类型。
**JVM 类型**
@@ -696,6 +795,21 @@ DataTypes.TIMESTAMP(p)
|`java.sql.Timestamp` | X | X | |
|`org.apache.flink.table.data.TimestampData` | X | X | 内部数据结构。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.TIMESTAMP(p)
+{% endhighlight %}
+
+<span class="label label-danger">注意</span> 当前,声明`DataTypes.TIMESTAMP(p)`中的所指定的精度 `p` 必须为`3`。
+</div>
+</div>
+
+此类型用 `TIMESTAMP(p)` 声明,其中 `p` 是秒的小数部分的位数(*精度*)。`p` 的值必须介于 `0` 和 `9` 之间(含边界值)。如果未指定精度,则 `p` 等于 `6`。
+
+`TIMESTAMP(p) WITHOUT TIME ZONE` 等价于此类型。
+
#### `TIMESTAMP WITH TIME ZONE`
*带有*时区的时间戳数据类型,由 `year-month-day hour:minute:second[.fractional] zone` 组成,精度达到纳秒,范围从 `0000-01-01 00:00:00.000000000 +14:59` 到
@@ -720,11 +834,6 @@ TIMESTAMP(p) WITH TIME ZONE
{% highlight java %}
DataTypes.TIMESTAMP_WITH_TIME_ZONE(p)
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `TIMESTAMP(p) WITH TIME ZONE` 声明,其中 `p` 是秒的小数部分的位数(*精度*)。`p` 的值必须介于 `0` 和 `9` 之间(含边界值)。如果未指定精度,则 `p` 等于 `6`。
**JVM 类型**
@@ -733,6 +842,17 @@ DataTypes.TIMESTAMP_WITH_TIME_ZONE(p)
|`java.time.OffsetDateTime` | X | X | *缺省* |
|`java.time.ZonedDateTime` | X | | 忽略时区 ID。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+尚不支持
+{% endhighlight %}
+</div>
+</div>
+
+此类型用 `TIMESTAMP(p) WITH TIME ZONE` 声明,其中 `p` 是秒的小数部分的位数(*精度*)。`p` 的值必须介于 `0` 和 `9` 之间(含边界值)。如果未指定精度,则 `p` 等于 `6`。
+
#### `TIMESTAMP WITH LOCAL TIME ZONE`
*带有本地*时区的时间戳数据类型,由 `year-month-day hour:minute:second[.fractional] zone` 组成,精度达到纳秒,范围从 `0000-01-01 00:00:00.000000000 +14:59` 到
@@ -759,11 +879,6 @@ TIMESTAMP(p) WITH LOCAL TIME ZONE
{% highlight java %}
DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(p)
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `TIMESTAMP(p) WITH LOCAL TIME ZONE` 声明,其中 `p` 是秒的小数部分的位数(*精度*)。`p` 的值必须介于 `0` 和 `9` 之间(含边界值)。如果未指定精度,则 `p` 等于 `6`。
**JVM 类型**
@@ -776,6 +891,19 @@ DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(p)
|`long` | X | (X) | 描述从 Epoch 算起的毫秒数。<br>仅当类型不可为空时才输出 |
|`org.apache.flink.table.data.TimestampData` | X | X | 内部数据结构。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(p)
+{% endhighlight %}
+
+<span class="label label-danger">注意</span> 当前,声明`DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(p)`中的所指定的精度 `p` 必须为3。
+</div>
+</div>
+
+此类型用 `TIMESTAMP(p) WITH LOCAL TIME ZONE` 声明,其中 `p` 是秒的小数部分的位数(*精度*)。`p` 的值必须介于 `0` 和 `9` 之间(含边界值)。如果未指定精度,则 `p` 等于 `6`。
+
#### `INTERVAL YEAR TO MONTH`
一组 Year-Month Interval 数据类型。
@@ -809,11 +937,6 @@ DataTypes.INTERVAL(DataTypes.YEAR(p))
DataTypes.INTERVAL(DataTypes.YEAR(p), DataTypes.MONTH())
DataTypes.INTERVAL(DataTypes.MONTH())
{% endhighlight %}
-</div>
-
-</div>
-
-可以使用以上组合来声明类型,其中 `p` 是年数(*年精度*)的位数。`p` 的值必须介于 `1` 和 `4` 之间(含边界值)。如果未指定年精度,`p` 则等于 `2`。
**JVM 类型**
@@ -823,7 +946,21 @@ DataTypes.INTERVAL(DataTypes.MONTH())
|`java.lang.Integer` | X | X | 描述月的数量。 |
|`int` | X | (X) | 描述月的数量。<br>仅当类型不可为空时才输出。 |
-#### `INTERVAL DAY TO MONTH`
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.INTERVAL(DataTypes.YEAR())
+DataTypes.INTERVAL(DataTypes.YEAR(p))
+DataTypes.INTERVAL(DataTypes.YEAR(p), DataTypes.MONTH())
+DataTypes.INTERVAL(DataTypes.MONTH())
+{% endhighlight %}
+</div>
+</div>
+
+可以使用以上组合来声明类型,其中 `p` 是年数(*年精度*)的位数。`p` 的值必须介于 `1` 和 `4` 之间(含边界值)。如果未指定年精度,`p` 则等于 `2`。
+
+#### `INTERVAL DAY TO SECOND`
一组 Day-Time Interval 数据类型。
@@ -879,11 +1016,6 @@ DataTypes.INTERVAL(DataTypes.MINUTE(), DataTypes.SECOND(p2))
DataTypes.INTERVAL(DataTypes.SECOND())
DataTypes.INTERVAL(DataTypes.SECOND(p2))
{% endhighlight %}
-</div>
-
-</div>
-
-可以使用以上组合来声明类型,其中 `p1` 是天数(*天精度*)的位数,`p2` 是秒的小数部分的位数(*小数精度*)。`p1` 的值必须介于 `1` 和之间 `6`(含边界值),`p2` 的值必须介于 `0` 和之间 `9`(含边界值)。如果 `p1` 未指定值,则缺省等于 `2`,如果 `p2` 未指定值,则缺省等于 `6`。
**JVM 类型**
@@ -893,6 +1025,28 @@ DataTypes.INTERVAL(DataTypes.SECOND(p2))
|`java.lang.Long` | X | X | 描述毫秒数。 |
|`long` | X | (X) | 描述毫秒数。<br>仅当类型不可为空时才输出。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.INTERVAL(DataTypes.DAY())
+DataTypes.INTERVAL(DataTypes.DAY(p1))
+DataTypes.INTERVAL(DataTypes.DAY(p1), DataTypes.HOUR())
+DataTypes.INTERVAL(DataTypes.DAY(p1), DataTypes.MINUTE())
+DataTypes.INTERVAL(DataTypes.DAY(p1), DataTypes.SECOND(p2))
+DataTypes.INTERVAL(DataTypes.HOUR())
+DataTypes.INTERVAL(DataTypes.HOUR(), DataTypes.MINUTE())
+DataTypes.INTERVAL(DataTypes.HOUR(), DataTypes.SECOND(p2))
+DataTypes.INTERVAL(DataTypes.MINUTE())
+DataTypes.INTERVAL(DataTypes.MINUTE(), DataTypes.SECOND(p2))
+DataTypes.INTERVAL(DataTypes.SECOND())
+DataTypes.INTERVAL(DataTypes.SECOND(p2))
+{% endhighlight %}
+</div>
+</div>
+
+可以使用以上组合来声明类型,其中 `p1` 是天数(*天精度*)的位数,`p2` 是秒的小数部分的位数(*小数精度*)。`p1` 的值必须介于 `1` 和 `6` 之间(含边界值),`p2` 的值必须介于 `0` 和 `9` 之间(含边界值)。如果 `p1` 未指定值,则缺省等于 `2`,如果 `p2` 未指定值,则缺省等于 `6`。
+
### 结构化的数据类型
#### `ARRAY`
@@ -916,13 +1070,6 @@ t ARRAY
{% highlight java %}
DataTypes.ARRAY(t)
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `ARRAY<t>` 声明,其中 `t` 是所包含元素的数据类型。
-
-`t ARRAY` 接近等价于 SQL 标准。例如,`INT ARRAY` 等价于 `ARRAY<INT>`。
**JVM 类型**
@@ -931,6 +1078,19 @@ DataTypes.ARRAY(t)
|*t*`[]` | (X) | (X) | 依赖于子类型。 *缺省* |
|`org.apache.flink.table.data.ArrayData` | X | X | 内部数据结构。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.ARRAY(t)
+{% endhighlight %}
+</div>
+</div>
+
+此类型用 `ARRAY<t>` 声明,其中 `t` 是所包含元素的数据类型。
+
+`t ARRAY` 是为了更接近 SQL 标准而提供的同义写法。例如,`INT ARRAY` 等价于 `ARRAY<INT>`。
+
#### `MAP`
将键(包括 `NULL`)映射到值(包括 `NULL`)的关联数组的数据类型。映射不能包含重复的键;每个键最多可以映射到一个值。
@@ -953,11 +1113,6 @@ MAP<kt, vt>
{% highlight java %}
DataTypes.MAP(kt, vt)
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `MAP<kt, vt>` 声明,其中 `kt` 是键的数据类型,`vt` 是值的数据类型。
**JVM 类型**
@@ -967,6 +1122,17 @@ DataTypes.MAP(kt, vt)
| `java.util.Map<kt, vt>` 的*子类型* | X | | |
|`org.apache.flink.table.data.MapData` | X | X | 内部数据结构。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.MAP(kt, vt)
+{% endhighlight %}
+</div>
+</div>
+
+此类型用 `MAP<kt, vt>` 声明,其中 `kt` 是键的数据类型,`vt` 是值的数据类型。
+
#### `MULTISET`
多重集合的数据类型(=bag)。与集合不同的是,它允许每个具有公共子类型的元素有多个实例。每个唯一值(包括 `NULL`)都映射到某种多重性。
@@ -988,13 +1154,6 @@ t MULTISET
{% highlight java %}
DataTypes.MULTISET(t)
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `MULTISET<t>` 声明,其中 `t` 是所包含元素的数据类型。
-
-`t MULTISET` 接近等价于 SQL 标准。例如,`INT MULTISET` 等价于 `MULTISET<INT>`。
**JVM 类型**
@@ -1004,6 +1163,19 @@ DataTypes.MULTISET(t)
| `java.util.Map<t, java.lang.Integer>` 的*子类型* | X | | |
|`org.apache.flink.table.data.MapData` | X | X | 内部数据结构。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.MULTISET(t)
+{% endhighlight %}
+</div>
+</div>
+
+此类型用 `MULTISET<t>` 声明,其中 `t` 是所包含元素的数据类型。
+
+`t MULTISET` 是为了更接近 SQL 标准而提供的同义写法。例如,`INT MULTISET` 等价于 `MULTISET<INT>`。
+
#### `ROW`
字段序列的数据类型。
@@ -1033,13 +1205,6 @@ ROW(n0 t0 'd0', n1 t1 'd1', ...)
DataTypes.ROW(DataTypes.FIELD(n0, t0), DataTypes.FIELD(n1, t1), ...)
DataTypes.ROW(DataTypes.FIELD(n0, t0, d0), DataTypes.FIELD(n1, t1, d1), ...)
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `ROW<n0 t0 'd0', n1 t1 'd1', ...>` 声明,其中 `n` 是唯一的字段名称,`t` 是字段的逻辑类型,`d` 是字段的描述。
-
-`ROW(...)` 接近等价于 SQL 标准。例如,`ROW(myField INT, myOtherField BOOLEAN)` 等价于 `ROW<myField INT, myOtherField BOOLEAN>`。
**JVM 类型**
@@ -1048,6 +1213,20 @@ DataTypes.ROW(DataTypes.FIELD(n0, t0, d0), DataTypes.FIELD(n1, t1, d1), ...)
|`org.apache.flink.types.Row` | X | X | *缺省* |
|`org.apache.flink.table.data.RowData` | X | X | 内部数据结构。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.ROW([DataTypes.FIELD(n0, t0), DataTypes.FIELD(n1, t1), ...])
+DataTypes.ROW([DataTypes.FIELD(n0, t0, d0), DataTypes.FIELD(n1, t1, d1), ...])
+{% endhighlight %}
+</div>
+</div>
+
+此类型用 `ROW<n0 t0 'd0', n1 t1 'd1', ...>` 声明,其中 `n` 是唯一的字段名称,`t` 是字段的逻辑类型,`d` 是字段的描述。
+
+`ROW(...)` 是为了更接近 SQL 标准而提供的同义写法。例如,`ROW(myField INT, myOtherField BOOLEAN)` 等价于 `ROW<myField INT, myOtherField BOOLEAN>`。
+
### 用户自定义数据类型
<span class="label label-danger">注意</span> 还未完全支持用户自定义数据类型,当前(从 Flink 1.11 开始)它们仅可作为函数参数和返回值的未注册的结构化类型。
@@ -1103,6 +1282,15 @@ class User {
DataTypes.of(User.class);
{% endhighlight %}
+
+**JVM 类型**
+
+| Java 类型 | 输入 | 输出 | 备注 |
+|:-------------------------------------|:-----:|:------:|:------------------------------------------------------|
+|*类型* | X | X | 原始类或子类(用于输入)或超类(用于输出)*缺省* |
+|`org.apache.flink.types.Row` | X | X | 代表一行数据的结构化类型。 |
+|`org.apache.flink.table.data.RowData` | X | X | 内部数据结构。 |
+
</div>
<div data-lang="Scala" markdown="1">
@@ -1122,9 +1310,6 @@ case class User(
DataTypes.of(classOf[User])
{% endhighlight %}
-</div>
-
-</div>
**JVM 类型**
@@ -1134,6 +1319,10 @@ DataTypes.of(classOf[User])
|`org.apache.flink.types.Row` | X | X | 代表一行数据的结构化类型。 |
|`org.apache.flink.table.data.RowData` | X | X | 内部数据结构。 |
+</div>
+
+</div>
+
### 其他数据类型
#### `BOOLEAN`
@@ -1154,9 +1343,6 @@ BOOLEAN
{% highlight java %}
DataTypes.BOOLEAN()
{% endhighlight %}
-</div>
-
-</div>
**JVM 类型**
@@ -1165,6 +1351,15 @@ DataTypes.BOOLEAN()
|`java.lang.Boolean` | X | X | *缺省* |
|`boolean` | X | (X) | 仅当类型不可为空时才输出。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+DataTypes.BOOLEAN()
+{% endhighlight %}
+</div>
+</div>
+
#### `RAW`
任意序列化类型的数据类型。此类型对于 Flink Table 来讲是一个黑盒子,仅在跟外部交互时被反序列化。
@@ -1187,13 +1382,6 @@ DataTypes.RAW(class, serializer)
DataTypes.RAW(class)
{% endhighlight %}
-</div>
-
-</div>
-
-此类型用 `RAW('class', 'snapshot')` 声明,其中 `class` 是原始类,`snapshot` 是 Base64 编码的序列化的 `TypeSerializerSnapshot`。通常,类型字符串不是直接声明的,而是在持久化类型时生成的。
-
-在 API 中,可以通过直接提供 `Class` + `TypeSerializer` 或通过传递 `TypeInformation` 并让框架从那里提取 `Class` + `TypeSerializer` 来声明 `RAW` 类型。
**JVM 类型**
@@ -1203,6 +1391,19 @@ DataTypes.RAW(class)
|`byte[]` | | X | |
|`org.apache.flink.table.data.RawValueData` | X | X | 内部数据结构。 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+尚不支持
+{% endhighlight %}
+</div>
+</div>
+
+此类型用 `RAW('class', 'snapshot')` 声明,其中 `class` 是原始类,`snapshot` 是 Base64 编码的序列化的 `TypeSerializerSnapshot`。通常,类型字符串不是直接声明的,而是在持久化类型时生成的。
+
+在 API 中,可以通过直接提供 `Class` + `TypeSerializer` 或通过传递 `TypeInformation` 并让框架从那里提取 `Class` + `TypeSerializer` 来声明 `RAW` 类型。
+
#### `NULL`
表示空类型 `NULL` 值的数据类型。
@@ -1227,9 +1428,6 @@ NULL
{% highlight java %}
DataTypes.NULL()
{% endhighlight %}
-</div>
-
-</div>
**JVM 类型**
@@ -1238,6 +1436,15 @@ DataTypes.NULL()
|`java.lang.Object` | X | X | *缺省* |
|*任何类型* | | (X) | 任何非基本数据类型 |
+</div>
+
+<div data-lang="Python" markdown="1">
+{% highlight python %}
+尚不支持
+{% endhighlight %}
+</div>
+</div>
+
数据类型注解
---------------------