Posted to issues@flink.apache.org by "Hequn Cheng (Jira)" <ji...@apache.org> on 2019/12/06 01:31:00 UTC

[jira] [Commented] (FLINK-14590) Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"

    [ https://issues.apache.org/jira/browse/FLINK-14590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989296#comment-16989296 ] 

Hequn Cheng commented on FLINK-14590:
-------------------------------------

Resolved in 1.10.0 via c7572a6297802f561da6ab492593e65a2547e618

> Unify the working directory of Java process and Python process when submitting python jobs via "flink run -py"
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-14590
>                 URL: https://issues.apache.org/jira/browse/FLINK-14590
>             Project: Flink
>          Issue Type: Bug
>          Components: API / Python
>            Reporter: Wei Zhong
>            Assignee: Wei Zhong
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Assume we are in a Flink directory with the following structure:
> {code:java}
> flink/
>       bin/
>           flink
>           pyflink-shell.sh
>           python-gateway-server.sh
>           ...
>       bad_case/
>                word_count.py
>                data.txt
>       lib/...
>       opt/...{code}
> The word_count.py file contains the following code:
> {code:python}
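> from pyflink.datastream import StreamExecutionEnvironment
> from pyflink.table import StreamTableEnvironment, TableConfig
>
> t_config = TableConfig()
> env = StreamExecutionEnvironment.get_execution_environment()
> t_env = StreamTableEnvironment.create(env, t_config)
> # API call into the Java process: the relative path resolves against the Java working directory.
> env._j_stream_execution_environment.registerCachedFile("data", "bad_case/data.txt")
> # Native file access in the Python process: resolves against the Python working directory.
> with open("bad_case/data.txt", "r") as f:
>     content = f.read()
> elements = [(word, 1) for word in content.split(" ")]
> t_env.from_elements(elements, ["word", "count"]){code}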
> Then we enter the "flink" directory and run:
> {code:java}
> bin/flink run -py bad_case/word_count.py
> {code}
> The program fails at the line "with open("bad_case/data.txt", "r") as f:".
> This happens because the working directory of the Java process is the current directory, while the working directory of the Python process is a temporary directory.
> Relative paths therefore work when they are passed to API calls handled by the Java process. When a relative path is used elsewhere, such as in native file access, it fails, because the working directory of the Python process has been changed to a temporary directory that users do not know about.
> I think this will confuse users, especially once we support dependency management. It would be great to unify the working directories of the Java process and the Python process.
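
The mismatch described above can be reproduced without Flink. The following is a minimal sketch in plain Python, not Flink's launcher code: the helper names (run_script, demo) and the script name read_data.py are hypothetical, and the two temporary directories stand in for the "flink" directory and for the temporary directory used by the Python process. The same script is started twice as a child process, once with the project directory as its working directory and once with an unrelated temporary directory.

```python
import os
import subprocess
import sys
import tempfile


def run_script(script_path, cwd):
    """Run script_path in a child Python process whose working directory is cwd."""
    return subprocess.run(
        [sys.executable, script_path],
        cwd=cwd,
        capture_output=True,
        text=True,
    )


def demo():
    """Return the child's exit code for each working directory."""
    with tempfile.TemporaryDirectory() as project, \
            tempfile.TemporaryDirectory() as temp_cwd:
        # Recreate the layout from the report: <project>/bad_case/data.txt
        case_dir = os.path.join(project, "bad_case")
        os.makedirs(case_dir)
        with open(os.path.join(case_dir, "data.txt"), "w") as f:
            f.write("hello flink hello")

        # Hypothetical stand-in for word_count.py: it only performs the
        # native file access that uses a relative path.
        script = os.path.join(case_dir, "read_data.py")
        with open(script, "w") as f:
            f.write('with open("bad_case/data.txt", "r") as f:\n')
            f.write("    print(f.read())\n")

        return {
            # Working directory is the project directory: the path resolves.
            "same_cwd": run_script(script, project).returncode,
            # Working directory is an unrelated temporary directory:
            # open() raises FileNotFoundError and the child exits non-zero.
            "temp_cwd": run_script(script, temp_cwd).returncode,
        }


if __name__ == "__main__":
    print(demo())
```

The script and the data file are identical in both runs; only the working directory of the child process differs, which is the situation the issue describes for the Python process.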



--
This message was sent by Atlassian Jira
(v8.3.4#803005)