You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@seatunnel.apache.org by ki...@apache.org on 2022/05/16 09:25:48 UTC
[incubator-seatunnel] branch dev updated: [Feature][seatunnel-connector-flink-fake] Support FakeSourceStream to mock data. (#1880)
This is an automated email from the ASF dual-hosted git repository.
kirs pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/incubator-seatunnel.git
The following commit(s) were added to refs/heads/dev by this push:
new 91ddc2e8 [Feature][seatunnel-connector-flink-fake] Support FakeSourceStream to mock data. (#1880)
91ddc2e8 is described below
commit 91ddc2e83c9f80a6847a7d9de15d1073a65b82c9
Author: MoSence <jm...@163.com>
AuthorDate: Mon May 16 17:25:43 2022 +0800
[Feature][seatunnel-connector-flink-fake] Support FakeSourceStream to mock data. (#1880)
* fix: pom.xml parent.
* feat: fake mock data mode
* feat: Flink fake_source_stream connector more feature.
support to mock data in flink fake_streaming.
support to limit data size in flink fake_streaming.
support to limit data interval in flink fake_streaming.
* docs: add jmockdata's licence.
* docs: usage docs.
* Revert "fix: pom.xml parent."
This reverts commit 6b0063797b411c8027c7ef89a2b7bd7d2b9a3ab2.
* fix: mock data dependencies.
* fix: mockDataBounded disable.
* docs: mock fake mode configuration document.
* feat: FakeSource add feature mock data.
* fix: default data scheme to cover old case.
* docs: default data scheme document modify.
* docs: Jmockdata licence add.
* docs: default data scheme document modify. remove mock_data_enable config option.
* docs: revert NOTICE.
* fix: ConfigFactory.parseMap cannot config Value Array, fix with ArrayList.
* fix: Code format resolve.
* fix: use dependencyManagement to manage the dependency version.
* fix: Code optimization.
* docs: remove useless document.
* docs: fix default schema config err.
---
docs/en/connector/source/Fake.mdx | 76 +++-
pom.xml | 7 +
.../seatunnel-connector-flink-fake/pom.xml | 4 +
.../org/apache/seatunnel/flink/fake/Config.java | 158 +++++++
.../seatunnel/flink/fake/source/FakeSource.java | 37 +-
.../flink/fake/source/FakeSourceStream.java | 29 +-
.../seatunnel/flink/fake/source/MockSchema.java | 492 +++++++++++++++++++++
seatunnel-dist/release-docs/LICENSE | 1 +
.../release-docs/licenses/LICENSE-jmockdata.txt | 201 +++++++++
tools/dependencies/known-dependencies.txt | 1 +
10 files changed, 977 insertions(+), 29 deletions(-)
diff --git a/docs/en/connector/source/Fake.mdx b/docs/en/connector/source/Fake.mdx
index 9111bb4e..4487f828 100644
--- a/docs/en/connector/source/Fake.mdx
+++ b/docs/en/connector/source/Fake.mdx
@@ -13,7 +13,7 @@ Engine Supported and plugin name
* [x] Spark: Fake, FakeStream
* [x] Flink: FakeSource, FakeSourceStream
- * Flink `Fake Source` is mainly used to automatically generate data. The data has only two columns. The first column is of `String type` and the content is a random one from `["Gary", "Ricky Huo", "Kid Xiong"]` . The second column is of `Long type` , which is the current 13-digit timestamp is used as input for functional verification and testing of `seatunnel` .
+ * Flink `Fake Source` is mainly used to automatically generate data. The data has only two columns. The first column is of `String type` and the content is a random one from `["Gary", "Ricky Huo", "Kid Xiong"]` . The second column is of `Int type` , which is the current 13-digit timestamp is used as input for functional verification and testing of `seatunnel` .
:::
@@ -51,10 +51,13 @@ Number of test cases generated per second
</TabItem>
<TabItem value="flink">
-| name | type | required | default value |
-| -------------- | ------ | -------- | ------------- |
-| parallelism | `Int` | no | - |
-| common-options |`string`| no | - |
+| name | type | required | default value |
+|--------------------|----------------------|----------|---------------|
+| parallelism | `Int` | no | - |
+| common-options | `string` | no | - |
+| mock_data_schema | list [column_config] | no | see details. |
+| mock_data_size | int | no | 300 |
+| mock_data_interval | int (second) | no | 1 |
### parallelism [`Int`]
@@ -67,6 +70,66 @@ The parallelism of an individual operator, for Fake Source Stream
Source plugin common parameters, please refer to [Source Plugin](common-options.mdx) for details
+### mock_data_schema Option [list[column_config]]
+
+Config mock data's schema. Each is column_config option.
+
+When mock_data_schema is not defined. Data will generate with schema like this:
+```bash
+mock_data_schema = [
+ {
+ name = "name",
+ type = "string",
+ mock_config = {
+ string_seed = ["Gary", "Ricky Huo", "Kid Xiong"]
+ size_range = [1,1]
+ }
+ }
+ {
+ name = "age",
+ type = "int",
+ mock_config = {
+ int_range = [1, 100]
+ }
+ }
+]
+```
+
+column_config option type.
+
+| name | type | required | default value | support values |
+|-------------|-------------|----------|---------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| name | string | yes | string | - |
+| type | string | yes | string | int,integer,byte,boolean,char,<br/>character,short,long,float,double,<br/>date,timestamp,decimal,bigdecimal,<br/>bigint,biginteger,int[],byte[],<br/>boolean[],char[],character[],short[],<br/>long[],float[],double[],string[],<br/>binary,varchar |
+| mock_config | mock_config | no | - | - |
+
+mock_config Option
+
+| name | type | required | default value | sample |
+|---------------|-----------------------|----------|---------------|------------------------------------------|
+| byte_range | list[byte] [size=2] | no | - | [0,127] |
+| boolean_seed | list[boolean] | no | - | [true, true, false] |
+| char_seed | list[char] [size=2] | no | - | ['a','b','c'] |
+| date_range | list[string] [size=2] | no | - | ["1970-01-01", "2100-12-31"] |
+| decimal_scale | int | no | - | 2 |
+| double_range | list[double] [size=2] | no | - | [0.0, 10000.0] |
+| float_range | list[flout] [size=2] | no | - | [0.0, 10000.0] |
+| int_range | list[int] [size=2] | no | - | [0, 100] |
+| long_range | list[long] [size=2] | no | - | [0, 100000] |
+| number_regex | string | no | - | "[1-9]{1}\\d?" |
+| time_range | list[int] [size=6] | no | - | [0,24,0,60,0,60] |
+| size_range | list[int] [size=2] | no | - | [6,10] |
+| string_regex | string | no | - | "[a-z0-9]{5}\\@\\w{3}\\.[a-z]{3}" |
+| string_seed | list[string] | no | - | ["Gary", "Ricky Huo", "Kid Xiong"] |
+
+### mock_data_size Option [int]
+
+Config mock data size.
+
+### mock_data_interval Option [int]
+
+Config the data can mock with interval, The unit is SECOND.
+
## Examples
<Tabs
@@ -111,6 +174,8 @@ The generated data is as follows, randomly extract the string from the `content`
### FakeSourceStream
+
+
```bash
source {
FakeSourceStream {
@@ -127,6 +192,7 @@ source {
FakeSource {
result_table_name = "fake"
field_name = "name,age"
+ mock_data_size = 100 // will generate 100 rows mock data.
}
}
```
diff --git a/pom.xml b/pom.xml
index ad36d244..92d09a96 100644
--- a/pom.xml
+++ b/pom.xml
@@ -172,6 +172,7 @@
<slf4j.version>1.7.25</slf4j.version>
<guava.version>19.0</guava.version>
<auto-service.version>1.0.1</auto-service.version>
+ <jmockdata.version>4.3.0</jmockdata.version>
</properties>
<dependencyManagement>
@@ -604,6 +605,12 @@
<artifactId>guava</artifactId>
<version>${guava.version}</version>
</dependency>
+
+ <dependency>
+ <groupId>com.github.jsonzou</groupId>
+ <artifactId>jmockdata</artifactId>
+ <version>${jmockdata.version}</version>
+ </dependency>
</dependencies>
</dependencyManagement>
diff --git a/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/pom.xml b/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/pom.xml
index 8dbe114c..6f255053 100644
--- a/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/pom.xml
+++ b/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/pom.xml
@@ -49,6 +49,10 @@
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
</dependency>
+ <dependency>
+ <groupId>com.github.jsonzou</groupId>
+ <artifactId>jmockdata</artifactId>
+ </dependency>
</dependencies>
</project>
\ No newline at end of file
diff --git a/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/Config.java b/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/Config.java
new file mode 100644
index 00000000..79a55e19
--- /dev/null
+++ b/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/Config.java
@@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.seatunnel.flink.fake;
+
+import org.apache.seatunnel.flink.fake.source.FakeSource;
+import org.apache.seatunnel.flink.fake.source.FakeSourceStream;
+
+/**
+ * FakeSource {@link FakeSource} and
+ * FakeSourceStream {@link FakeSourceStream} configuration parameters
+ */
+public final class Config {
+
+ /**
+ * Configuration mock data schema in FakeSourceStream. It used when mock_data_enable = true
+ */
+ public static final String MOCK_DATA_SCHEMA = "mock_data_schema";
+
+ /**
+ * Each schema configuration 'name' param.
+ * mock_data_schema = [
+ * {
+ * name = "colName"
+ * type = "colType"
+ * }
+ * ]
+ */
+ public static final String MOCK_DATA_SCHEMA_NAME = "name";
+
+ /**
+ * Each schema configuration 'type' param.
+ * Support Value:
+ * int,integer,byte,boolean,char,character,short,long,float,double,date,timestamp,decimal,bigdecimal,bigint,biginteger,
+ * int[],byte[],boolean[],char[],character[],short[],long[],float[],double[],string[],binary,varchar
+ * mock_data_schema = [
+ * {
+ * name = "colName"
+ * type = "colType"
+ * },
+ * ...
+ * ]
+ */
+ public static final String MOCK_DATA_SCHEMA_TYPE = "type";
+
+ /**
+ * defined the rule of mock data.
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK = "mock_config";
+
+ /**
+ * sample1: 0,f
+ * sample2: a,f
+ * byte = Byte.parseByte(rangeMin).
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#byteRange(byte min, byte max)} .
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_BYTE_RANGE = "byte_range";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#booleanSeed(boolean... seeds)} .
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_BOOLEAN_SEED = "boolean_seed";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#charSeed(char... seeds)} .
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_CHAR_SEED = "char_seed";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#dateRange(String min, String max)} .
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_DATE_RANGE = "date_range";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#decimalScale(int scale)} .
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_DECIMAL_SCALE = "decimal_scale";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#doubleRange(double min, double max)}.
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_DOUBLE_RANGE = "double_range";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#floatRange(float min, float max)}.
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_FLOAT_RANGE = "float_range";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#intRange(int min, int max)} .
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_INT_RANGE = "int_range";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#longRange(long min, long max)} .
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_LONG_RANGE = "long_range";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#numberRegex(String regex)} .
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_NUMBER_REGEX = "number_regex";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#timeRange(int minHour, int maxHour, int minMinute, int maxMinute, int minSecond, int maxSecond)} .
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_TIME_RANGE = "time_range";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#sizeRange(int min, int max)} .
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_SIZE_RANGE = "size_range";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#stringRegex(String regex)} .
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_STRING_REGEX = "string_regex";
+
+ /**
+ * defined the rule of mock data config {@link com.github.jsonzou.jmockdata.MockConfig#stringSeed(String... seeds)} .
+ */
+ public static final String MOCK_DATA_SCHEMA_MOCK_STRING_SEED = "string_seed";
+
+ /**
+ * In mock_data_schema=true, limit row size. Default is 300.
+ */
+ public static final String MOCK_DATA_SIZE = "mock_data_size";
+
+ /**
+ * mock_data_size Default value.
+ */
+ public static final int MOCK_DATA_SIZE_DEFAULT_VALUE = 300;
+
+ /**
+ * Create data interval, unit is second. Default is 1 second.
+ */
+ public static final String MOCK_DATA_INTERVAL = "mock_data_interval";
+
+ /**
+ * mock_data_interval Default value.
+ */
+ public static final long MOCK_DATA_INTERVAL_DEFAULT_VALUE = 1L;
+
+}
diff --git a/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/source/FakeSource.java b/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/source/FakeSource.java
index 311b1bc7..f0609546 100644
--- a/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/source/FakeSource.java
+++ b/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/source/FakeSource.java
@@ -17,7 +17,11 @@
package org.apache.seatunnel.flink.fake.source;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SIZE;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SIZE_DEFAULT_VALUE;
+
import org.apache.seatunnel.common.config.CheckResult;
+import org.apache.seatunnel.common.config.TypesafeConfigUtils;
import org.apache.seatunnel.flink.BaseFlinkSource;
import org.apache.seatunnel.flink.FlinkEnvironment;
import org.apache.seatunnel.flink.batch.FlinkBatchSource;
@@ -26,19 +30,18 @@ import org.apache.seatunnel.shade.com.typesafe.config.Config;
import com.google.auto.service.AutoService;
import org.apache.flink.api.java.DataSet;
-import org.apache.flink.table.api.DataTypes;
import org.apache.flink.types.Row;
-import java.util.Arrays;
-import java.util.Random;
-import java.util.stream.Collectors;
+import java.util.ArrayList;
+import java.util.List;
@AutoService(BaseFlinkSource.class)
public class FakeSource implements FlinkBatchSource {
- private static final String[] NAME_ARRAY = new String[]{"Gary", "Ricky Huo", "Kid Xiong"};
private Config config;
- private static final int AGE_LIMIT = 100;
+
+ private List<MockSchema> mockDataSchema;
+ private int mockDataSize;
@Override
public void setConfig(Config config) {
@@ -50,6 +53,12 @@ public class FakeSource implements FlinkBatchSource {
return config;
}
+ @Override
+ public void prepare(FlinkEnvironment env) {
+ mockDataSchema = MockSchema.resolveConfig(config);
+ mockDataSize = TypesafeConfigUtils.getConfig(config, MOCK_DATA_SIZE, MOCK_DATA_SIZE_DEFAULT_VALUE);
+ }
+
@Override
public CheckResult checkConfig() {
return CheckResult.success();
@@ -62,13 +71,15 @@ public class FakeSource implements FlinkBatchSource {
@Override
public DataSet<Row> getData(FlinkEnvironment env) {
- Random random = new Random();
- return env.getBatchTableEnvironment().toDataSet(
- env.getBatchTableEnvironment().fromValues(
- DataTypes.ROW(DataTypes.FIELD("name", DataTypes.STRING()),
- DataTypes.FIELD("age", DataTypes.INT())),
- Arrays.stream(NAME_ARRAY).map(n -> Row.of(n, random.nextInt(AGE_LIMIT)))
- .collect(Collectors.toList())), Row.class);
+ List<Row> dataSet = new ArrayList<>(mockDataSize);
+ for (long index = 0; index < mockDataSize; index++) {
+ dataSet.add(MockSchema.mockRowData(mockDataSchema));
+ }
+ return env.getBatchEnvironment()
+ .fromCollection(
+ dataSet,
+ MockSchema.mockRowTypeInfo(mockDataSchema)
+ );
}
}
diff --git a/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/source/FakeSourceStream.java b/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/source/FakeSourceStream.java
index 2a3995d2..7fa204a7 100644
--- a/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/source/FakeSourceStream.java
+++ b/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/source/FakeSourceStream.java
@@ -17,9 +17,10 @@
package org.apache.seatunnel.flink.fake.source;
-import static org.apache.flink.api.common.typeinfo.BasicTypeInfo.LONG_TYPE_INFO;
-import static org.apache.flink.api.common.typeinfo.BasicTypeInfo.STRING_TYPE_INFO;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_INTERVAL;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_INTERVAL_DEFAULT_VALUE;
+import org.apache.seatunnel.common.config.TypesafeConfigUtils;
import org.apache.seatunnel.flink.BaseFlinkSource;
import org.apache.seatunnel.flink.FlinkEnvironment;
import org.apache.seatunnel.flink.stream.FlinkStreamSource;
@@ -27,13 +28,13 @@ import org.apache.seatunnel.flink.stream.FlinkStreamSource;
import org.apache.seatunnel.shade.com.typesafe.config.Config;
import com.google.auto.service.AutoService;
-import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import org.apache.flink.types.Row;
+import java.util.List;
import java.util.concurrent.TimeUnit;
@AutoService(BaseFlinkSource.class)
@@ -45,13 +46,16 @@ public class FakeSourceStream extends RichParallelSourceFunction<Row> implements
private Config config;
+ private List<MockSchema> mockDataSchema;
+ private long mockDataInterval;
+
@Override
public DataStream<Row> getData(FlinkEnvironment env) {
DataStreamSource<Row> source = env.getStreamExecutionEnvironment().addSource(this);
if (config.hasPath(PARALLELISM)) {
source = source.setParallelism(config.getInt(PARALLELISM));
}
- return source.returns(new RowTypeInfo(STRING_TYPE_INFO, LONG_TYPE_INFO));
+ return source.returns(MockSchema.mockRowTypeInfo(mockDataSchema));
}
@Override
@@ -64,20 +68,23 @@ public class FakeSourceStream extends RichParallelSourceFunction<Row> implements
return config;
}
+ @Override
+ public void prepare(FlinkEnvironment env) {
+ mockDataSchema = MockSchema.resolveConfig(config);
+ mockDataInterval = TypesafeConfigUtils.getConfig(config, MOCK_DATA_INTERVAL, MOCK_DATA_INTERVAL_DEFAULT_VALUE);
+ }
+
@Override
public String getPluginName() {
return "FakeSourceStream";
}
- private static final String[] NAME_ARRAY = new String[]{"Gary", "Ricky Huo", "Kid Xiong"};
-
@Override
public void run(SourceFunction.SourceContext<Row> ctx) throws Exception {
- while (running) {
- int randomNum = (int) (1 + Math.random() * NAME_ARRAY.length);
- Row row = Row.of(NAME_ARRAY[randomNum - 1], System.currentTimeMillis());
- ctx.collect(row);
- Thread.sleep(TimeUnit.SECONDS.toMillis(1));
+ while (running){
+ Row rowData = MockSchema.mockRowData(mockDataSchema);
+ ctx.collect(rowData);
+ Thread.sleep(TimeUnit.SECONDS.toMillis(mockDataInterval));
}
}
diff --git a/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/source/MockSchema.java b/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/source/MockSchema.java
new file mode 100644
index 00000000..b8f0428d
--- /dev/null
+++ b/seatunnel-connectors/seatunnel-connectors-flink/seatunnel-connector-flink-fake/src/main/java/org/apache/seatunnel/flink/fake/source/MockSchema.java
@@ -0,0 +1,492 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.seatunnel.flink.fake.source;
+
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_BOOLEAN_SEED;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_BYTE_RANGE;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_CHAR_SEED;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_DATE_RANGE;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_DECIMAL_SCALE;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_DOUBLE_RANGE;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_FLOAT_RANGE;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_INT_RANGE;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_LONG_RANGE;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_NUMBER_REGEX;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_SIZE_RANGE;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_STRING_REGEX;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_STRING_SEED;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_MOCK_TIME_RANGE;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_NAME;
+import static org.apache.seatunnel.flink.fake.Config.MOCK_DATA_SCHEMA_TYPE;
+
+import org.apache.seatunnel.shade.com.typesafe.config.Config;
+import org.apache.seatunnel.shade.com.typesafe.config.ConfigFactory;
+
+import com.github.jsonzou.jmockdata.JMockData;
+import com.github.jsonzou.jmockdata.MockConfig;
+import org.apache.flink.api.common.typeinfo.BasicArrayTypeInfo;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo;
+import org.apache.flink.api.common.typeinfo.SqlTimeTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.typeutils.GenericTypeInfo;
+import org.apache.flink.api.java.typeutils.RowTypeInfo;
+import org.apache.flink.types.Row;
+
+import java.io.ByteArrayInputStream;
+import java.io.Serializable;
+import java.math.BigDecimal;
+import java.math.BigInteger;
+import java.nio.charset.StandardCharsets;
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Date;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/**
+ * config param [mock_data_schema] json bean.
+ * @see org.apache.seatunnel.flink.fake.Config#MOCK_DATA_SCHEMA
+ */
+public class MockSchema implements Serializable {
+
+ private static final long serialVersionUID = -7018198671355055858L;
+
+ private static final int STATIC_RANGE_SIZE = 2;
+ private static final int STATIC_RANGE_MIN_INDEX = 0;
+ private static final int STATIC_RANGE_MAX_INDEX = 1;
+ private static final int TIME_RANGE_SIZE = 6;
+ private static final int TIME_RANGE_HOUR_MIN_INDEX = 0;
+ private static final int TIME_RANGE_HOUR_MAX_INDEX = 1;
+ private static final int TIME_RANGE_MINUTE_MIN_INDEX = 2;
+ private static final int TIME_RANGE_MINUTE_MAX_INDEX = 3;
+ private static final int TIME_RANGE_SECOND_MIN_INDEX = 4;
+ private static final int TIME_RANGE_SECOND_MAX_INDEX = 5;
+
+ private String name;
+ private String type;
+ private Config mockConfig;
+
+ public String getName() {
+ return name;
+ }
+
+ public void setName(String name) {
+ this.name = name;
+ }
+
+ public String getType() {
+ return type;
+ }
+
+ public void setType(String type) {
+ this.type = type;
+ }
+
+ public Config getMockConfig() {
+ return mockConfig;
+ }
+
+ public void setMockConfig(Config mockConfig) {
+ this.mockConfig = mockConfig;
+ }
+
+ public TypeInformation<?> typeInformation() {
+ TypeInformation<?> dataType;
+ switch (this.type.trim().toLowerCase()) {
+ case "int":
+ case "integer":
+ dataType = BasicTypeInfo.INT_TYPE_INFO;
+ break;
+ case "byte":
+ dataType = BasicTypeInfo.BYTE_TYPE_INFO;
+ break;
+ case "boolean":
+ dataType = BasicTypeInfo.BOOLEAN_TYPE_INFO;
+ break;
+ case "char":
+ case "character":
+ dataType = BasicTypeInfo.CHAR_TYPE_INFO;
+ break;
+ case "short":
+ dataType = BasicTypeInfo.SHORT_TYPE_INFO;
+ break;
+ case "long":
+ dataType = BasicTypeInfo.LONG_TYPE_INFO;
+ break;
+ case "float":
+ dataType = BasicTypeInfo.FLOAT_TYPE_INFO;
+ break;
+ case "double":
+ dataType = BasicTypeInfo.DOUBLE_TYPE_INFO;
+ break;
+ case "date":
+ dataType = BasicTypeInfo.DATE_TYPE_INFO;
+ break;
+ case "timestamp":
+ dataType = SqlTimeTypeInfo.TIMESTAMP;
+ break;
+ case "decimal":
+ case "bigdecimal":
+ dataType = BasicTypeInfo.BIG_DEC_TYPE_INFO;
+ break;
+ case "bigint":
+ case "biginteger":
+ dataType = BasicTypeInfo.BIG_INT_TYPE_INFO;
+ break;
+ case "int[]":
+ dataType = PrimitiveArrayTypeInfo.INT_PRIMITIVE_ARRAY_TYPE_INFO;
+ break;
+ case "byte[]":
+ dataType = PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO;
+ break;
+ case "boolean[]":
+ dataType = PrimitiveArrayTypeInfo.BOOLEAN_PRIMITIVE_ARRAY_TYPE_INFO;
+ break;
+ case "char[]":
+ case "character[]":
+ dataType = PrimitiveArrayTypeInfo.CHAR_PRIMITIVE_ARRAY_TYPE_INFO;
+ break;
+ case "short[]":
+ dataType = PrimitiveArrayTypeInfo.SHORT_PRIMITIVE_ARRAY_TYPE_INFO;
+ break;
+ case "long[]":
+ dataType = PrimitiveArrayTypeInfo.LONG_PRIMITIVE_ARRAY_TYPE_INFO;
+ break;
+ case "float[]":
+ dataType = PrimitiveArrayTypeInfo.FLOAT_PRIMITIVE_ARRAY_TYPE_INFO;
+ break;
+ case "double[]":
+ dataType = PrimitiveArrayTypeInfo.DOUBLE_PRIMITIVE_ARRAY_TYPE_INFO;
+ break;
+ case "string[]":
+ dataType = BasicArrayTypeInfo.STRING_ARRAY_TYPE_INFO;
+ break;
+ case "binary":
+ dataType = GenericTypeInfo.of(ByteArrayInputStream.class);
+ break;
+ case "varchar":
+ default:
+ dataType = BasicTypeInfo.STRING_TYPE_INFO;
+ break;
+ }
+ return dataType;
+ }
+
+ public Object mockData(){
+ Object mockData;
+ MockConfig mockConfig = new MockConfig();
+ resolve(mockConfig);
+ switch (this.type.trim().toLowerCase()){
+ case "int":
+ case "integer":
+ mockData = JMockData.mock(int.class, mockConfig);
+ break;
+ case "byte":
+ mockData = JMockData.mock(byte.class, mockConfig);
+ break;
+ case "boolean":
+ mockData = JMockData.mock(boolean.class, mockConfig);
+ break;
+ case "char":
+ case "character":
+ mockData = JMockData.mock(char.class, mockConfig);
+ break;
+ case "short":
+ mockData = JMockData.mock(short.class, mockConfig);
+ break;
+ case "long":
+ mockData = JMockData.mock(long.class, mockConfig);
+ break;
+ case "float":
+ mockData = JMockData.mock(float.class, mockConfig);
+ break;
+ case "double":
+ mockData = JMockData.mock(double.class, mockConfig);
+ break;
+ case "date":
+ mockData = JMockData.mock(Date.class, mockConfig);
+ break;
+ case "timestamp":
+ mockData = JMockData.mock(Timestamp.class, mockConfig);
+ break;
+ case "decimal":
+ case "bigdecimal":
+ mockData = JMockData.mock(BigDecimal.class, mockConfig);
+ break;
+ case "bigint":
+ case "biginteger":
+ mockData = JMockData.mock(BigInteger.class, mockConfig);
+ break;
+ case "int[]":
+ mockData = JMockData.mock(int[].class, mockConfig);
+ break;
+ case "byte[]":
+ mockData = JMockData.mock(byte[].class, mockConfig);
+ break;
+ case "boolean[]":
+ mockData = JMockData.mock(boolean[].class, mockConfig);
+ break;
+ case "char[]":
+ case "character[]":
+ mockData = JMockData.mock(char[].class, mockConfig);
+ break;
+ case "short[]":
+ mockData = JMockData.mock(short[].class, mockConfig);
+ break;
+ case "long[]":
+ mockData = JMockData.mock(long[].class, mockConfig);
+ break;
+ case "float[]":
+ mockData = JMockData.mock(float[].class, mockConfig);
+ break;
+ case "double[]":
+ mockData = JMockData.mock(double[].class, mockConfig);
+ break;
+ case "string[]":
+ mockData = JMockData.mock(String[].class, mockConfig);
+ break;
+ case "binary":
+ byte[] bytes = JMockData.mock(byte[].class, mockConfig);
+ mockData = new ByteArrayInputStream(bytes);
+ break;
+ case "varchar":
+ default:
+ mockData = JMockData.mock(String.class, mockConfig);
+ break;
+ }
+ return mockData;
+ }
+
+ private void resolve(MockConfig mockConfig) {
+ if (this.mockConfig != null) {
+ byteConfigResolve(mockConfig);
+ booleanConfigResolve(mockConfig);
+ charConfigResolve(mockConfig);
+ dateConfigResolve(mockConfig);
+ decimalConfigResolve(mockConfig);
+ doubleConfigResolve(mockConfig);
+ floatConfigResolve(mockConfig);
+ intConfigResolve(mockConfig);
+ longConfigResolve(mockConfig);
+ numberConfigResolve(mockConfig);
+ sizeConfigResolve(mockConfig);
+ stringConfigResolve(mockConfig);
+ timeConfigResolve(mockConfig);
+ }
+ }
+
+ private void byteConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_BYTE_RANGE)) {
+ List<Long> byteRange = this.mockConfig.getBytesList(MOCK_DATA_SCHEMA_MOCK_BYTE_RANGE);
+ assert byteRange.size() >= STATIC_RANGE_SIZE;
+ mockConfig.byteRange(byteRange.get(STATIC_RANGE_MIN_INDEX).byteValue(), byteRange.get(STATIC_RANGE_MAX_INDEX).byteValue());
+ }
+ }
+
+ private void booleanConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_BOOLEAN_SEED)) {
+ List<Boolean> booleanSeedList = this.mockConfig.getBooleanList(MOCK_DATA_SCHEMA_MOCK_BOOLEAN_SEED);
+ boolean[] booleanSeed = new boolean[booleanSeedList.size()];
+ for (int index = 0; index < booleanSeedList.size(); index++) {
+ booleanSeed[index] = booleanSeedList.get(index);
+ }
+ mockConfig.booleanSeed(booleanSeed);
+ }
+ }
+
+ private void charConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_CHAR_SEED)) {
+ String charSeedString = this.mockConfig.getString(MOCK_DATA_SCHEMA_MOCK_CHAR_SEED);
+ byte[] bytes = charSeedString.getBytes(StandardCharsets.UTF_8);
+ char[] charSeed = new char[bytes.length];
+ for (int index = 0; index < bytes.length; index++) {
+ charSeed[index] = (char) bytes[index];
+ }
+ mockConfig.charSeed(charSeed);
+ }
+ }
+
+ private void dateConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_DATE_RANGE)) {
+ List<String> dateRange = this.mockConfig.getStringList(MOCK_DATA_SCHEMA_MOCK_DATE_RANGE);
+ assert dateRange.size() >= STATIC_RANGE_SIZE;
+ mockConfig.dateRange(dateRange.get(STATIC_RANGE_MIN_INDEX), dateRange.get(STATIC_RANGE_MAX_INDEX));
+ }
+ }
+
+ private void decimalConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_DECIMAL_SCALE)) {
+ int scala = this.mockConfig.getInt(MOCK_DATA_SCHEMA_MOCK_DECIMAL_SCALE);
+ mockConfig.decimalScale(scala);
+ }
+ }
+
+ private void doubleConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_DOUBLE_RANGE)) {
+ List<Double> doubleRange = this.mockConfig.getDoubleList(MOCK_DATA_SCHEMA_MOCK_DOUBLE_RANGE);
+ assert doubleRange.size() >= STATIC_RANGE_SIZE;
+ mockConfig.doubleRange(doubleRange.get(STATIC_RANGE_MIN_INDEX), doubleRange.get(STATIC_RANGE_MAX_INDEX));
+ }
+ }
+
+ private void floatConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_FLOAT_RANGE)) {
+ List<Double> floatRange = this.mockConfig.getDoubleList(MOCK_DATA_SCHEMA_MOCK_FLOAT_RANGE);
+ assert floatRange.size() >= STATIC_RANGE_SIZE;
+ mockConfig.floatRange(floatRange.get(STATIC_RANGE_MIN_INDEX).floatValue(), floatRange.get(STATIC_RANGE_MAX_INDEX).floatValue());
+ }
+ }
+
+ private void intConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_INT_RANGE)) {
+ List<Integer> intRange = this.mockConfig.getIntList(MOCK_DATA_SCHEMA_MOCK_INT_RANGE);
+ assert intRange.size() >= STATIC_RANGE_SIZE;
+ mockConfig.intRange(intRange.get(STATIC_RANGE_MIN_INDEX), intRange.get(STATIC_RANGE_MAX_INDEX));
+ }
+ }
+
+ private void longConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_LONG_RANGE)) {
+ List<Long> longRange = this.mockConfig.getLongList(MOCK_DATA_SCHEMA_MOCK_LONG_RANGE);
+ assert longRange.size() >= STATIC_RANGE_SIZE;
+ mockConfig.longRange(longRange.get(STATIC_RANGE_MIN_INDEX), longRange.get(STATIC_RANGE_MAX_INDEX));
+ }
+ }
+
+ private void numberConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_NUMBER_REGEX)) {
+ String numberRegex = this.mockConfig.getString(MOCK_DATA_SCHEMA_MOCK_NUMBER_REGEX);
+ mockConfig.numberRegex(numberRegex);
+ }
+ }
+
+ private void sizeConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_SIZE_RANGE)) {
+ List<Integer> sizeRange = this.mockConfig.getIntList(MOCK_DATA_SCHEMA_MOCK_SIZE_RANGE);
+ assert sizeRange.size() >= STATIC_RANGE_SIZE;
+ mockConfig.sizeRange(sizeRange.get(STATIC_RANGE_MIN_INDEX), sizeRange.get(STATIC_RANGE_MAX_INDEX));
+ }
+ }
+
+ private void stringConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_STRING_REGEX)) {
+ String stringRegex = this.mockConfig.getString(MOCK_DATA_SCHEMA_MOCK_STRING_REGEX);
+ mockConfig.stringRegex(stringRegex);
+ }
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_STRING_SEED)) {
+ List<String> stringSeed = this.mockConfig.getStringList(MOCK_DATA_SCHEMA_MOCK_STRING_SEED);
+ mockConfig.stringSeed(stringSeed.toArray(new String[0]));
+ }
+ }
+
+ private void timeConfigResolve(MockConfig mockConfig) {
+ if (this.mockConfig.hasPath(MOCK_DATA_SCHEMA_MOCK_TIME_RANGE)) {
+ List<Integer> timeRange = this.mockConfig.getIntList(MOCK_DATA_SCHEMA_MOCK_TIME_RANGE);
+ assert timeRange.size() >= TIME_RANGE_SIZE;
+ mockConfig.timeRange(
+ timeRange.get(TIME_RANGE_HOUR_MIN_INDEX),
+ timeRange.get(TIME_RANGE_HOUR_MAX_INDEX),
+ timeRange.get(TIME_RANGE_MINUTE_MIN_INDEX),
+ timeRange.get(TIME_RANGE_MINUTE_MAX_INDEX),
+ timeRange.get(TIME_RANGE_SECOND_MIN_INDEX),
+ timeRange.get(TIME_RANGE_SECOND_MAX_INDEX)
+ );
+ }
+ }
+
+ public static RowTypeInfo mockRowTypeInfo(List<MockSchema> mockDataSchema) {
+ TypeInformation<?>[] types = new TypeInformation[mockDataSchema.size()];
+ String[] fieldNames = new String[mockDataSchema.size()];
+ for (int index = 0; index < mockDataSchema.size(); index++) {
+ MockSchema schema = mockDataSchema.get(index);
+ types[index] = schema.typeInformation();
+ fieldNames[index] = schema.getName();
+ }
+ return new RowTypeInfo(types, fieldNames);
+ }
+
+ public static Row mockRowData(List<MockSchema> mockDataSchema){
+ Object[] fieldByPosition = new Object[mockDataSchema.size()];
+ for (int index = 0; index < mockDataSchema.size(); index++) {
+ MockSchema schema = mockDataSchema.get(index);
+ Object colData = schema.mockData();
+ fieldByPosition[index] = colData;
+ }
+ return Row.of(fieldByPosition);
+ }
+
+ private static final String[] NAME_SEEDS = new String[]{"Gary", "Ricky Huo", "Kid Xiong"};
+
+ private static final Integer[] NAME_SIZE_RANGE = new Integer[]{1, 1};
+
+ private static final Integer[] AGE_RANGE = new Integer[]{1, 100};
+
+ public static List<MockSchema> DEFAULT_MOCK_SCHEMAS = new ArrayList<>(0);
+
+ static {
+ MockSchema nameSchema = new MockSchema();
+ nameSchema.setName("name");
+ nameSchema.setType("string");
+ Map<String, Object> nameSchemaConfigMap = new HashMap<>(0);
+ nameSchemaConfigMap.put(MOCK_DATA_SCHEMA_MOCK_STRING_SEED, Arrays.asList(NAME_SEEDS));
+ nameSchemaConfigMap.put(MOCK_DATA_SCHEMA_MOCK_SIZE_RANGE, Arrays.asList(NAME_SIZE_RANGE));
+ nameSchema.setMockConfig(
+ ConfigFactory.parseMap(
+ nameSchemaConfigMap
+ )
+ );
+ DEFAULT_MOCK_SCHEMAS.add(nameSchema);
+ MockSchema ageSchema = new MockSchema();
+ ageSchema.setName("age");
+ ageSchema.setType("int");
+ Map<String, Object> ageSchemaConfigMap = new HashMap<>(0);
+ ageSchemaConfigMap.put(MOCK_DATA_SCHEMA_MOCK_INT_RANGE, Arrays.asList(AGE_RANGE));
+ ageSchema.setMockConfig(
+ ConfigFactory.parseMap(
+ ageSchemaConfigMap
+ )
+ );
+ DEFAULT_MOCK_SCHEMAS.add(ageSchema);
+ }
+
+ public static List<MockSchema> resolveConfig(Config config){
+ if (config.hasPath(MOCK_DATA_SCHEMA)) {
+ return config.getConfigList(MOCK_DATA_SCHEMA)
+ .stream()
+ .map(
+ schemaConfig -> {
+ MockSchema schema = new MockSchema();
+ schema.setName(schemaConfig.getString(MOCK_DATA_SCHEMA_NAME));
+ schema.setType(schemaConfig.getString(MOCK_DATA_SCHEMA_TYPE));
+ if (schemaConfig.hasPath(MOCK_DATA_SCHEMA_MOCK)) {
+ schema.setMockConfig(schemaConfig.getConfig(MOCK_DATA_SCHEMA_MOCK));
+ }
+ return schema;
+ }
+ )
+ .collect(Collectors.toList());
+ }
+ return DEFAULT_MOCK_SCHEMAS;
+ }
+}
diff --git a/seatunnel-dist/release-docs/LICENSE b/seatunnel-dist/release-docs/LICENSE
index ee7794aa..15bb80df 100644
--- a/seatunnel-dist/release-docs/LICENSE
+++ b/seatunnel-dist/release-docs/LICENSE
@@ -579,6 +579,7 @@ The text of each license is the standard Apache 2.0 license.
(Apache License, Version 2.0) hudi-spark-bundle_2.11 (org.apache.hudi:hudi-spark-bundle_2.11:0.10.0 - https://github.com/apache/hudi/hudi-spark-bundle_2.11)
(Apache License, Version 2.0) java-xmlbuilder (com.jamesmurty.utils:java-xmlbuilder:0.4 - http://code.google.com/p/java-xmlbuilder/)
(Apache License, Version 2.0) jcommander (com.beust:jcommander:1.81 - https://jcommander.org)
+ (Apache License, Version 2.0) jmockdata (com.github.jsonzou:jmockdata:4.3.0 - https://github.com/jsonzou/jmockdata)
(Apache License, Version 2.0) stream-lib (com.clearspring.analytics:stream:2.7.0 - https://github.com/addthis/stream-lib)
(Apache License, version 2.0) JBoss Logging 3 (org.jboss.logging:jboss-logging:3.2.1.Final - http://www.jboss.org)
(Apache v2) BoneCP :: Core Library (com.jolbox:bonecp:0.8.0.RELEASE - http://jolbox.com/bonecp)
diff --git a/seatunnel-dist/release-docs/licenses/LICENSE-jmockdata.txt b/seatunnel-dist/release-docs/licenses/LICENSE-jmockdata.txt
new file mode 100644
index 00000000..9c8f3ea0
--- /dev/null
+++ b/seatunnel-dist/release-docs/licenses/LICENSE-jmockdata.txt
@@ -0,0 +1,201 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "{}"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright {yyyy} {name of copyright owner}
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
\ No newline at end of file
diff --git a/tools/dependencies/known-dependencies.txt b/tools/dependencies/known-dependencies.txt
index c0f1f84e..246adc0f 100755
--- a/tools/dependencies/known-dependencies.txt
+++ b/tools/dependencies/known-dependencies.txt
@@ -421,6 +421,7 @@ jline-0.9.94.jar
jline-2.11.jar
jline-2.12.jar
jmespath-java-1.12.37.jar
+jmockdata-4.3.0.jar
jna-4.5.1.jar
joda-time-1.6.jar
joda-time-2.10.3.jar