Posted to user@flink.apache.org by Yik San Chan <ev...@gmail.com> on 2021/04/26 12:45:57 UTC

Contradictory docs: python.files config can include not only python files

Hi community,

In
https://ci.apache.org/projects/flink/flink-docs-stable/dev/python/python_config.html,
regarding python.files:

> Attach custom python files for job.

This makes readers think only Python files are allowed here. However, in
https://ci.apache.org/projects/flink/flink-docs-stable/deployment/cli.html#submitting-pyflink-jobs
:

./bin/flink run \
      --python examples/python/table/batch/word_count.py \
      --pyFiles file:///user.txt,hdfs:///$namenode_address/username.txt

It obviously includes .txt files, which are not Python files.

I believe the two docs contradict each other. Can anyone confirm?

Best,
Yik San

Re: Contradictory docs: python.files config can include not only python files

Posted by Yik San Chan <ev...@gmail.com>.
Hi Dian,

I created a PR to fix the docs. https://github.com/apache/flink/pull/15779

On Tue, Apr 27, 2021 at 2:08 PM Dian Fu <di...@gmail.com> wrote:

> Thanks for the suggestion. It makes sense to me~.

Re: Contradictory docs: python.files config can include not only python files

Posted by Dian Fu <di...@gmail.com>.
Thanks for the suggestion. It makes sense to me~. 

> On Apr 27, 2021, at 10:28 AM, Yik San Chan <ev...@gmail.com> wrote:
> 
> Hi Dian,
> 
> If that's the case, shall we reword "Attach custom python files for job." into "attach custom files that could be put in PYTHONPATH, e.g., .zip, .whl, etc."?
> 
> Best,
> Yik San

Re: Contradictory docs: python.files config can include not only python files

Posted by Yik San Chan <ev...@gmail.com>.
Hi Dian,

If that's the case, shall we reword "Attach custom python files for job."
into "attach custom files that could be put in PYTHONPATH, e.g., .zip,
.whl, etc."?

Best,
Yik San

On Tue, Apr 27, 2021 at 10:08 AM Dian Fu <di...@gmail.com> wrote:

> Hi Yik San,
>
> All the files which could be put in the PYTHONPATH are allowed here, e.g.
> .zip, .whl, etc.
>
> Regards,
> Dian

Re: Contradictory docs: python.files config can include not only python files

Posted by Dian Fu <di...@gmail.com>.
Hi Yik San,

All the files which could be put in the PYTHONPATH are allowed here, e.g. .zip, .whl, etc.

Regards,
Dian
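
For readers who want to see why an archive qualifies, here is a minimal plain-Python sketch (not PyFlink code) of the mechanism: Python's import system can load modules straight from a zip archive listed in sys.path, which is what PYTHONPATH populates. The archive name deps.zip, the module name pyfiles_demo_greeting, and both helper functions are made up for illustration; how Flink itself ships the file to the worker is not shown here.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def build_zip_with_module(directory: Path) -> Path:
    """Create deps.zip holding one tiny module (hypothetical names)."""
    archive = directory / "deps.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(
            "pyfiles_demo_greeting.py",
            "def hello(name):\n    return 'hello, ' + name\n",
        )
    return archive


def import_from_zip(archive: Path, module_name: str):
    """Put the archive itself on sys.path, then import a module from it."""
    sys.path.insert(0, str(archive))
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


with tempfile.TemporaryDirectory() as tmp:
    archive_path = build_zip_with_module(Path(tmp))
    greeting = import_from_zip(archive_path, "pyfiles_demo_greeting")
    # The function is loaded from inside the zip; no extraction step is needed.
    message = greeting.hello("flink")
    print(message)
```

A .whl file uses the same zip container, which is presumably why both extensions appear in the answer above.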

> On Apr 27, 2021, at 8:16 AM, Yik San Chan <ev...@gmail.com> wrote:
> 
> Hi Dian,
> 
> It is still not clear to me - does it only allow Python files (.py), or not?
> 
> Best,
> Yik San

Re: Contradictory docs: python.files config can include not only python files

Posted by Yik San Chan <ev...@gmail.com>.
Hi Dian,

It is still not clear to me - does it only allow Python files (.py), or not?

Best,
Yik San

On Mon, Apr 26, 2021 at 9:15 PM Dian Fu <di...@gmail.com> wrote:

> Hi Yik San,
>
> 1) What `--pyFiles` is used for:
> All the files specified via `--pyFiles` are put in the PYTHONPATH of the
> Python worker during execution, which makes them available to the Python
> user-defined functions.
>
> 2) Validation of the files passed to `--pyFiles`:
> Currently, the files passed to this argument are not validated. I also
> think such a check is neither necessary nor feasible. Do you have any
> suggestions on this?
>
> Regards,
> Dian

Re: Contradictory docs: python.files config can include not only python files

Posted by Dian Fu <di...@gmail.com>.
Hi Yik San,

1) What `--pyFiles` is used for:
All the files specified via `--pyFiles` are put in the PYTHONPATH of the Python worker during execution, which makes them available to the Python user-defined functions.

2) Validation of the files passed to `--pyFiles`:
Currently, the files passed to this argument are not validated. I also think such a check is neither necessary nor feasible. Do you have any suggestions on this?

Regards,
Dian
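
As a plain-Python illustration (not Flink code) of what being in the PYTHONPATH means for a user-defined function, the sketch below assumes that shipped files end up in a directory listed in sys.path. That layout is an assumption made for the demo, and the names pyfiles_demo_helper.py, user.txt, and find_on_python_path are hypothetical. It shows that a .py file becomes importable, while a non-Python file such as a .txt cannot be imported but can still be located by scanning the path.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from typing import Optional


def find_on_python_path(filename: str) -> Optional[Path]:
    """Return the path of filename in the first sys.path directory that has it."""
    for entry in sys.path:
        candidate = Path(entry or ".") / filename
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    shipped = Path(tmp)
    # Hypothetical stand-ins for files passed via --pyFiles.
    (shipped / "pyfiles_demo_helper.py").write_text(
        "def double(x):\n    return 2 * x\n"
    )
    (shipped / "user.txt").write_text("alice\n")

    # Simulate a worker whose PYTHONPATH contains the shipped directory.
    sys.path.insert(0, str(shipped))
    importlib.invalidate_caches()

    # A .py file on the path can be imported like any other module.
    helper = importlib.import_module("pyfiles_demo_helper")
    doubled = helper.double(21)

    # A .txt file cannot be imported, but it can be found and read.
    data_file = find_on_python_path("user.txt")
    user_name = data_file.read_text().strip()
    print(doubled, user_name)
```

Nothing in this sketch checks file types, which matches the point that no validation is performed on the files passed in.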

> On Apr 26, 2021, at 8:45 PM, Yik San Chan <ev...@gmail.com> wrote:
>
> Hi community,
>
> In https://ci.apache.org/projects/flink/flink-docs-stable/dev/python/python_config.html, regarding python.files:
>
> > Attach custom python files for job.
>
> This makes readers think only Python files are allowed here. However, in https://ci.apache.org/projects/flink/flink-docs-stable/deployment/cli.html#submitting-pyflink-jobs:
>
> ./bin/flink run \
>       --python examples/python/table/batch/word_count.py \
>       --pyFiles file:///user.txt,hdfs:///$namenode_address/username.txt
>
> It obviously includes .txt files, which are not Python files.
>
> I believe the two docs contradict each other. Can anyone confirm?
> 
> Best,
> Yik San