Posted to dev@hudi.apache.org by Prabhu Joseph <pr...@gmail.com> on 2023/11/14 17:30:13 UTC

[Discussion] Avoid direct dependency on Flink Table Planner Jar

Hi!

Most of the Flink connectors (most recently Hive
<https://issues.apache.org/jira/browse/FLINK-31575>) do not depend directly
on the Flink Table Planner jar; they work with the Flink Table Planner
Loader jar instead. Hudi's Flink Table API, however, depends on the Flink
Table Planner jar
<https://github.com/apache/hudi/blob/master/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bulk/sort/SortOperatorGen.java#L27>.
This affects Flink apps that use two connectors (say, Hudi and Hive): the
Hudi connector requires the Flink Table Planner jar, whereas the Hive
connector requires the Flink Table Planner Loader jar, and the two jars
cannot coexist on the classpath.

Is anyone else facing this issue, and does anyone have a fix for it? I
thought about relocating the Flink Table Planner classes in Hudi so that
they do not conflict with other connectors. Any thoughts?

*1. Flink Hudi Table API requires Flink Table Planner Jar*

Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/flink/table/planner/codegen/sort/SortCodeGenerator at
org.apache.hudi.sink.bucket.BucketBulkInsertWriterHelper.getFileIdSorterGen(BucketBulkInsert

*2. When both the Flink Table Planner jar and the Flink Table Planner Loader
jar exist together in the classpath, the Flink app fails with the error below.*

*Caused by: org.apache.flink.table.api.ValidationException: Multiple
factories for identifier 'default' that implement
'org.apache.flink.table.delegation.ExecutorFactory' found in the classpath.*

*Ambiguous factory classes are:
org.apache.flink.table.planner.delegation.DefaultExecutorFactory
org.apache.flink.table.planner.loader.DelegateExecutorFactory*
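
To tell quickly which of the two situations above an application is in, a
small classpath probe can help. The following is only a sketch, not part of
Hudi or Flink: the class name PlannerClasspathCheck and its methods are
hypothetical, and the only Flink-specific inputs are the three class names
quoted in the errors above. It assumes that a class which cannot be loaded
from the application class loader is also invisible to Hudi's sort code
generation.

```java
public class PlannerClasspathCheck {

    // Class names copied from the stack trace and error message in this thread.
    static final String SORT_CODE_GENERATOR =
        "org.apache.flink.table.planner.codegen.sort.SortCodeGenerator";
    static final String PLANNER_FACTORY =
        "org.apache.flink.table.planner.delegation.DefaultExecutorFactory";
    static final String LOADER_FACTORY =
        "org.apache.flink.table.planner.loader.DelegateExecutorFactory";

    /** Returns true if the class can be loaded (without initializing it). */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false,
                PlannerClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Maps the visibility of the two executor factories to a short verdict. */
    static String diagnose(boolean plannerVisible, boolean loaderVisible) {
        if (plannerVisible && loaderVisible) {
            return "CONFLICT: both planner and planner-loader factories are visible";
        }
        if (plannerVisible) {
            return "PLANNER: only the planner jar is visible";
        }
        if (loaderVisible) {
            return "LOADER: only the planner-loader jar is visible";
        }
        return "NONE: neither factory is visible";
    }

    public static void main(String[] args) {
        boolean planner = isPresent(PLANNER_FACTORY);
        boolean loader = isPresent(LOADER_FACTORY);
        System.out.println(diagnose(planner, loader));
        // If this prints false, case 1 above (NoClassDefFoundError) is likely.
        System.out.println("SortCodeGenerator visible: "
            + isPresent(SORT_CODE_GENERATOR));
    }
}
```

Running the main method with the same classpath as the failing app should
report CONFLICT for case 2, and a missing SortCodeGenerator for case 1.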

Thanks,
Prabhu Joseph