Posted to user@flink.apache.org by aryan m <ma...@gmail.com> on 2022/05/08 02:55:52 UTC

PyFlink - java code packaging

Hi Users!

What is the recommended way to package custom Java code (boilerplate
source setup code and custom SQL format code) and make it available on
the classpath for local PyFlink pipeline tests?

For staging and production, I keep the Java libraries in the
FLINK_HOME/lib directory, and the following Python code works:

from py4j.java_gateway import java_import
from pyflink.java_gateway import get_gateway

gateway = get_gateway()
java_import(gateway.jvm, "java_package_base_path.*")

However, for local testing, I am unsure how to replicate this setup.


Approaches
-------------

1. [Hack] Install my custom Java library into pyflink/lib
<https://github.com/apache/flink/blob/master/flink-python/apache-flink-libraries/setup.py#L199>
by following a pattern similar to apache_flink_libraries.

2. Extend PyFlinkStreamingTestCase and configure the FLINK_HOME
<https://github.com/apache/flink/blob/master/flink-python/pyflink/find_flink_home.py#L58>
environment variable so that my Java code, packaged as a Python library,
is included in the classpath. Then perform the same java_import.


What is the best practice in such cases?


Thanks

Arya

Re: PyFlink - java code packaging

Posted by aryan m <ma...@gmail.com>.
Hi!
   Eventually, I came up with a solution by following the instructions here:
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/python/dependency_management/


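import os
from importlib.util import find_spec

from py4j.java_gateway import java_import
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.java_gateway import get_gateway

# Directory of the installed Python package that bundles the jar
module_home = os.path.dirname(find_spec("python_lib_internally_containing_java_lib").origin)
jar_file = 'file:///' + module_home + "/lib/java-util.jar"

env: StreamExecutionEnvironment = StreamExecutionEnvironment.get_execution_environment()

# Make the jar available to the job
env.add_jars(jar_file)

gateway = get_gateway()

java_import(gateway.jvm, "java_package_base_path.*")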

Here, python_lib_internally_containing_java_lib is a Python library
that contains a folder *lib*, which holds my Java jar.
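
For local tests, the jar lookup above could be wrapped in a small helper. This is only a sketch under assumptions: the function names jar_url and add_packaged_jar and the lib_dir parameter are hypothetical, and the helper relies only on the env object exposing add_jars, as StreamExecutionEnvironment does. It uses pathlib to build the file URL, which avoids hand-assembling the "file://" prefix.

```python
from importlib.util import find_spec
from pathlib import Path


def jar_url(package_name: str, jar_name: str, lib_dir: str = "lib") -> str:
    """Return a file:// URL for a jar bundled inside an installed Python package.

    Assumes the layout <package>/<lib_dir>/<jar_name>, as described above.
    """
    spec = find_spec(package_name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"cannot locate package {package_name!r}")
    # spec.origin points at the package's __init__.py; its parent is the package dir
    package_dir = Path(spec.origin).resolve().parent
    return (package_dir / lib_dir / jar_name).as_uri()


def add_packaged_jar(env, package_name: str, jar_name: str) -> str:
    """Register the bundled jar on an environment that provides add_jars()."""
    url = jar_url(package_name, jar_name)
    env.add_jars(url)
    return url
```

In a test setup this might be called as add_packaged_jar(env, "python_lib_internally_containing_java_lib", "java-util.jar") before running java_import. Note that the helper does not check that the jar file actually exists.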


Best

Arya



