Posted to user@flink.apache.org by Королькевич Михаил <mk...@yandex.ru> on 2021/12/03 09:22:39 UTC

PyFlink import internal packages

Hi Flink Team,

I'm trying to implement an app on PyFlink.

I would like to structure the directory as follows:

flink_app/
    data_service/
        s3.py
        filesystem.py
    validator/
        validator.py
    metrics/
        statictic.py
        quality.py
    common/
        constants.py
    main.py <- entry job

Two questions:

1) Is it possible to import constants from common in the data_service package?
In plain Python we can use an absolute import like "from flink_app.common
import constants".

All files are added to Flink via:

    env = StreamExecutionEnvironment.get_execution_environment()
    env.add_python_file('/path_to_flink_app/flink_app')

2) Can I split the pipeline from main.py across many files, e.g. import the
env in other files and return a DataStream/Table back?


Re: PyFlink import internal packages

Posted by Королькевич Михаил <mk...@yandex.ru>.

Hi, thank you!

It was very helpful!


Re: PyFlink import internal packages

Posted by Shuiqiang Chen <ac...@gmail.com>.
Hi,

Actually, you are able to develop your app in the clean Python way. It's
fine to split the code into multiple files, and there is no need to call
`env.add_python_file()` explicitly. When submitting the PyFlink job you can
specify the Python files and the entry main module with the options --pyFiles
and --pyModule [1], like:

$ ./bin/flink run --pyModule flink_app.main --pyFiles ${WORKSPACE}/flink_app

This way, all files under the directory will be added to the PYTHONPATH
of both the local client and the remote Python UDF worker.
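
A minimal sketch of how the absolute import from question 1 could then look
across files (the constant name here is an assumption for illustration, not
taken from the thread):

    # flink_app/common/constants.py (illustrative contents)
    KAFKA_TOPIC = "events"  # hypothetical constant

    # flink_app/data_service/s3.py (illustrative contents)
    # The absolute import resolves because the directory passed via
    # --pyFiles ends up on the PYTHONPATH, making flink_app importable.
    from flink_app.common import constants

    def topic_name() -> str:
        return constants.KAFKA_TOPIC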

Hope this helps!

Best,
Shuiqiang

[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/cli/#submitting-pyflink-jobs
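
And for question 2, a minimal sketch of splitting the pipeline across files,
with a helper module that receives the environment and returns a DataStream
back to the entry module (module and function names are assumptions):

    # flink_app/data_service/s3.py (illustrative)
    from pyflink.datastream import DataStream, StreamExecutionEnvironment

    def read_events(env: StreamExecutionEnvironment) -> DataStream:
        # A real job would attach a connector source here;
        # from_collection keeps the sketch self-contained.
        return env.from_collection([("a", 1), ("b", 2)])

    # flink_app/main.py (illustrative entry module)
    from pyflink.datastream import StreamExecutionEnvironment

    from flink_app.data_service.s3 import read_events

    def main():
        env = StreamExecutionEnvironment.get_execution_environment()
        stream = read_events(env)  # the helper returns a DataStream
        stream.print()
        env.execute("flink_app")

    if __name__ == "__main__":
        main()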
