Posted to issues@flink.apache.org by "sunjincheng (Jira)" <ji...@apache.org> on 2020/04/02 09:39:00 UTC

[jira] [Comment Edited] (FLINK-16943) Support adding jars in PyFlink

    [ https://issues.apache.org/jira/browse/FLINK-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073553#comment-17073553 ] 

sunjincheng edited comment on FLINK-16943 at 4/2/20, 9:38 AM:
--------------------------------------------------------------

Thanks for reporting this issue. I think it is very important for Python users: they usually know very little about Java, so merging JARs is very difficult for them. I have also received feedback about this from many Chinese users, and I wrote a [blog|https://enjoyment.cool/2020/03/31/Apache-Flink-%E6%89%AB%E9%9B%B7%E7%B3%BB%E5%88%97-PyFlink%E5%A6%82%E4%BD%95%E8%A7%A3%E5%86%B3%E5%A4%9AJAR%E5%8C%85%E4%BE%9D%E8%B5%96%E9%97%AE%E9%A2%98/] for Python users on how to merge JARs. However, solving the problem the way the [blog|https://enjoyment.cool/2020/03/31/Apache-Flink-%E6%89%AB%E9%9B%B7%E7%B3%BB%E5%88%97-PyFlink%E5%A6%82%E4%BD%95%E8%A7%A3%E5%86%B3%E5%A4%9AJAR%E5%8C%85%E4%BE%9D%E8%B5%96%E9%97%AE%E9%A2%98/] does is not as good as solving it at the API level, and on the CLI, support for multiple JARs has been added.



was (Author: sunjincheng121):
Thanks for reporting this issue. I think it is very important for Python users: they usually know very little about Java, so merging JARs is very difficult for them. I have also received feedback about this from many Chinese users, and I wrote a blog for Python users on how to merge JARs. However, solving the problem the way the [blog|https://enjoyment.cool/2020/03/31/Apache-Flink-%E6%89%AB%E9%9B%B7%E7%B3%BB%E5%88%97-PyFlink%E5%A6%82%E4%BD%95%E8%A7%A3%E5%86%B3%E5%A4%9AJAR%E5%8C%85%E4%BE%9D%E8%B5%96%E9%97%AE%E9%A2%98/] does is not as good as solving it at the API level, and on the CLI, support for multiple JARs has been added.


> Support adding jars in PyFlink
> ------------------------------
>
>                 Key: FLINK-16943
>                 URL: https://issues.apache.org/jira/browse/FLINK-16943
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / Python
>            Reporter: Wei Zhong
>            Priority: Major
>
> Since flink-1.10.0 was released, many users have complained that PyFlink is inconvenient when loading external jar packages. For local execution, users need to copy the jar files to the lib folder under the PyFlink installation directory, which is hard to locate. For job submission, users need to merge their jars into one, as `flink run` only accepts one jar file. This may be easy for Java users but is difficult for Python users who haven't touched Java.
> We intend to add an `add_jars` interface to the PyFlink TableEnvironment to solve this problem. It will add the jars to the context classloader of the Py4j gateway server and to the `PipelineOptions.JARS` option in the configuration of the StreamExecutionEnvironment/ExecutionEnvironment.
> Via this interface, users can add jars in their Python job. The jars will be loaded immediately, so users can use them even on the next line of the Python code. Submitting a job with multiple external jars won't be a problem anymore, because all the jars in `PipelineOptions.JARS` will be added to the JobGraph and uploaded to the cluster.
> As it is not a big change, I'm not sure whether it is necessary to create a FLIP to discuss this, so I created a JIRA first for flexibility. What do you think, guys?
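
To make the configuration side of the proposal concrete, here is a minimal sketch of how several local jar paths could be turned into a single value for `PipelineOptions.JARS` (configuration key `pipeline.jars`, a semicolon-separated list of URLs). The helper name `build_pipeline_jars` is hypothetical and is not the proposed `add_jars` interface; it only illustrates the value such an interface would need to produce, and it does not touch the Py4j classloader.

```python
import os
from pathlib import Path
from typing import Iterable, List

# Key string behind PipelineOptions.JARS in the Flink configuration.
PIPELINE_JARS_KEY = "pipeline.jars"


def build_pipeline_jars(paths: Iterable[str]) -> str:
    """Hypothetical helper: join local jar paths into one config value.

    Each path is made absolute and converted to a file:// URL; duplicates
    are dropped while keeping the original order.
    """
    urls: List[str] = []
    for p in paths:
        if not p.endswith(".jar"):
            raise ValueError(f"not a jar file: {p}")
        url = Path(os.path.abspath(p)).as_uri()
        if url not in urls:
            urls.append(url)
    # Flink list-typed options are written as semicolon-separated strings.
    return ";".join(urls)


if __name__ == "__main__":
    value = build_pipeline_jars(["connector.jar", "udf.jar", "connector.jar"])
    print(PIPELINE_JARS_KEY, "=", value)
```

In a real job, the resulting string would be written into the table environment's configuration under `pipeline.jars`; how that is exposed to Python users is exactly what this issue discusses, so the sketch stops at building the value.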



--
This message was sent by Atlassian Jira
(v8.3.4#803005)