Posted to commits@hudi.apache.org by da...@apache.org on 2023/02/05 02:10:05 UTC
[hudi] branch asf-site updated: [HUDI-5699] Add more details about how to build flink bundle in quick start (#7851)
This is an automated email from the ASF dual-hosted git repository.
danny0405 pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new e63596f9132 [HUDI-5699] Add more details about how to build flink bundle in quick start (#7851)
e63596f9132 is described below
commit e63596f91328cf989ccf7999437559dda9182360
Author: Danny Chan <yu...@gmail.com>
AuthorDate: Sun Feb 5 10:09:59 2023 +0800
[HUDI-5699] Add more details about how to build flink bundle in quick start (#7851)
---
website/docs/flink-quick-start-guide.md | 23 ++++++++++++++++++----
website/docs/syncing_metastore.md | 4 ++--
.../version-0.12.2/flink-quick-start-guide.md | 9 +++++++--
.../version-0.12.2/syncing_metastore.md | 4 ++--
4 files changed, 30 insertions(+), 10 deletions(-)
diff --git a/website/docs/flink-quick-start-guide.md b/website/docs/flink-quick-start-guide.md
index 1a0cbee9cbc..c6098e2b712 100644
--- a/website/docs/flink-quick-start-guide.md
+++ b/website/docs/flink-quick-start-guide.md
@@ -35,13 +35,14 @@ quick start tool for SQL users.
#### Step.1 download Flink jar
-Hudi works with both Flink 1.13, Flink 1.14 and Flink 1.15. You can follow the
+Hudi works with Flink 1.13, Flink 1.14, Flink 1.15, and Flink 1.16. You can follow the
instructions [here](https://flink.apache.org/downloads) for setting up Flink. Then choose the desired Hudi-Flink bundle
jar to work with different Flink and Scala versions:
- `hudi-flink1.13-bundle`
- `hudi-flink1.14-bundle`
- `hudi-flink1.15-bundle`
+- `hudi-flink1.16-bundle`
#### Step.2 start Flink cluster
Start a standalone Flink cluster within the Hadoop environment.
@@ -62,9 +63,9 @@ export HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`
```
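For reference, the cluster start-up described above can be sketched as the following commands (the `FLINK_HOME` path and Flink version are illustrative assumptions; substitute your own installation):

```shell
# Assumes FLINK_HOME points at an unpacked Flink distribution (illustrative path).
export FLINK_HOME=/opt/flink-1.16.0
# Hadoop classes must be on the classpath before the cluster starts.
export HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`
# Start the standalone cluster (one JobManager plus a local TaskManager).
$FLINK_HOME/bin/start-cluster.sh
```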
#### Step.3 start Flink SQL client
-Hudi has a prepared bundle jar for Flink, which should be loaded in the Flink SQL Client when it starts up.
-You can build the jar manually under path `hudi-source-dir/packaging/hudi-flink-bundle`, or download it from the
-[Apache Official Repository](https://repo.maven.apache.org/maven2/org/apache/hudi/hudi-flink-bundle_2.11/).
+Hudi provides a packaged bundle jar for Flink, which should be loaded in the Flink SQL Client when it starts up.
+You can build the jar manually under the path `hudi-source-dir/packaging/hudi-flink-bundle` (see [Build Flink Bundle Jar](/docs/syncing_metastore#install)), or download it from the
+[Apache Official Repository](https://repo.maven.apache.org/maven2/org/apache/hudi/).
Now start the SQL CLI:
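As a sketch, launching the SQL client with the bundle jar on its classpath looks like the following (the jar path and version are assumptions; use the jar you built or downloaded):

```shell
# Launch the Flink SQL client in embedded mode with the Hudi bundle loaded.
# -j adds the bundle jar to the client classpath (path/version are illustrative).
$FLINK_HOME/bin/sql-client.sh embedded -j /path/to/hudi-flink1.16-bundle-0.12.2.jar shell
```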
@@ -118,6 +119,15 @@ dependency to your project:
</dependency>
```
+```xml
+<!-- Flink 1.16 -->
+<dependency>
+ <groupId>org.apache.hudi</groupId>
+ <artifactId>hudi-flink1.16-bundle</artifactId>
+ <version>0.12.2</version>
+</dependency>
+```
+
</TabItem>
</Tabs>
@@ -288,6 +298,11 @@ Hudi Flink also provides capability to obtain a stream of records that changed s
This can be achieved using Hudi's streaming querying and providing a start time from which changes need to be streamed.
We do not need to specify endTime if we want all changes after the given commit (as is the common case).
+:::note
+A bundle jar built with the **hive profile** is needed for streaming queries. By default, the officially released Flink bundle is built **without**
+the **hive profile**, so the jar needs to be built manually; see [Build Flink Bundle Jar](/docs/syncing_metastore#install) for more details.
+:::
+
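The manual build mentioned in the note can be sketched as follows (the `flink-bundle-shade-hive2` profile name follows the `hudi-flink-bundle` packaging docs and is an assumption here; pick the profile matching your Hive version, and adjust Maven flags to your environment):

```shell
# Clone the Hudi source and build the Flink bundle with hive sync support.
git clone https://github.com/apache/hudi.git && cd hudi
# -Pflink-bundle-shade-hive2 shades Hive 2.x classes into the bundle
# (hive1/hive3 variants exist for other Hive versions).
mvn clean install -DskipTests -Drat.skip=true -Pflink-bundle-shade-hive2
# The resulting jar lands under packaging/hudi-flink-bundle/target/.
```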
```sql
CREATE TABLE t1(
uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
diff --git a/website/docs/syncing_metastore.md b/website/docs/syncing_metastore.md
index f5204c15c47..fe1b62ef58b 100644
--- a/website/docs/syncing_metastore.md
+++ b/website/docs/syncing_metastore.md
@@ -82,8 +82,8 @@ To use this mode, just pass the jdbc url to the hive server (`--use-jdbc` is tru
#### Install
-Now you can git clone Hudi master branch to test Flink hive sync. The first step is to install Hudi to get `hudi-flink-bundle_2.11-0.x.jar`.
-`hudi-flink-bundle` module pom.xml sets the scope related to hive as `provided` by default. If you want to use hive sync, you need to use the
+Now you can git clone the Hudi master branch to test Flink hive sync. The first step is to install Hudi to get `hudi-flink1.1x-bundle-0.x.x.jar`.
+The `hudi-flink-bundle` module's pom.xml sets the hive-related dependency scope to `provided` by default. If you want to use hive sync, you need to use the
profile `flink-bundle-shade-hive` during packaging. Execute the command below to install:
```bash
diff --git a/website/versioned_docs/version-0.12.2/flink-quick-start-guide.md b/website/versioned_docs/version-0.12.2/flink-quick-start-guide.md
index 1a0cbee9cbc..b8b08f636e6 100644
--- a/website/versioned_docs/version-0.12.2/flink-quick-start-guide.md
+++ b/website/versioned_docs/version-0.12.2/flink-quick-start-guide.md
@@ -63,7 +63,7 @@ export HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`
#### Step.3 start Flink SQL client
Hudi has a prepared bundle jar for Flink, which should be loaded in the Flink SQL Client when it starts up.
-You can build the jar manually under path `hudi-source-dir/packaging/hudi-flink-bundle`, or download it from the
+You can build the jar manually under path `hudi-source-dir/packaging/hudi-flink-bundle` (see [Build Flink Bundle Jar](/docs/syncing_metastore#install)), or download it from the
[Apache Official Repository](https://repo.maven.apache.org/maven2/org/apache/hudi/hudi-flink-bundle_2.11/).
Now start the SQL CLI:
@@ -286,7 +286,12 @@ denoted by the timestamp. Look for changes in `_hoodie_commit_time`, `age` field
Hudi Flink also provides capability to obtain a stream of records that changed since given commit timestamp.
This can be achieved using Hudi's streaming querying and providing a start time from which changes need to be streamed.
-We do not need to specify endTime, if we want all changes after the given commit (as is the common case).
+We do not need to specify endTime if we want all changes after the given commit (as is the common case).
+
+:::note
+A bundle jar built with the **hive profile** is needed for streaming queries. By default, the officially released Flink bundle is built **without**
+the **hive profile**, so the jar needs to be built manually; see [Build Flink Bundle Jar](/docs/syncing_metastore#install) for more details.
+:::
```sql
CREATE TABLE t1(
diff --git a/website/versioned_docs/version-0.12.2/syncing_metastore.md b/website/versioned_docs/version-0.12.2/syncing_metastore.md
index f5204c15c47..fe1b62ef58b 100644
--- a/website/versioned_docs/version-0.12.2/syncing_metastore.md
+++ b/website/versioned_docs/version-0.12.2/syncing_metastore.md
@@ -82,8 +82,8 @@ To use this mode, just pass the jdbc url to the hive server (`--use-jdbc` is tru
#### Install
-Now you can git clone Hudi master branch to test Flink hive sync. The first step is to install Hudi to get `hudi-flink-bundle_2.11-0.x.jar`.
-`hudi-flink-bundle` module pom.xml sets the scope related to hive as `provided` by default. If you want to use hive sync, you need to use the
+Now you can git clone the Hudi master branch to test Flink hive sync. The first step is to install Hudi to get `hudi-flink1.1x-bundle-0.x.x.jar`.
+The `hudi-flink-bundle` module's pom.xml sets the hive-related dependency scope to `provided` by default. If you want to use hive sync, you need to use the
profile `flink-bundle-shade-hive` during packaging. Execute the command below to install:
```bash