Posted to issues@beam.apache.org by "Anand Inguva (Jira)" <ji...@apache.org> on 2022/03/08 18:21:00 UTC

[jira] [Updated] (BEAM-13709) PipelineOptions() and from_dictionary parsing use_public_ips and no_use_public_ips differently

     [ https://issues.apache.org/jira/browse/BEAM-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anand Inguva updated BEAM-13709:
--------------------------------
    Fix Version/s: 2.38.0
       Resolution: Fixed
           Status: Resolved  (was: In Progress)

> PipelineOptions() and from_dictionary parsing use_public_ips and no_use_public_ips differently
> ----------------------------------------------------------------------------------------------
>
>                 Key: BEAM-13709
>                 URL: https://issues.apache.org/jira/browse/BEAM-13709
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Minbo Bae
>            Assignee: Anand Inguva
>            Priority: P2
>              Labels: starter, usability
>             Fix For: 2.38.0
>
>          Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> {{PipelineOptions}} in Python accepts a parameter dict in two ways: through the constructor, {{PipelineOptions(**params)}}, or through {{PipelineOptions.from_dictionary(params)}}.
> However, the two behave slightly differently:
>  * [PipelineOptions(**params)|https://github.com/apache/beam/blob/v2.35.0/sdks/python/apache_beam/options/pipeline_options.py#L313-L324] discards any option that is not defined as an {{argparse}} dest in an options class. For example, {{no_use_public_ips=True}} is ignored and the Dataflow job runs with public IPs. To disable public IPs, the option dictionary must use {{use_public_ips}}.
>  * [PipelineOptions.from_dictionary()|https://github.com/apache/beam/blob/v2.35.0/sdks/python/apache_beam/options/pipeline_options.py#L229] skips any option whose value is {{False}}. For example, {{use_public_ips=False}} is ignored and the Dataflow job runs with public IPs. To disable public IPs, the option dictionary must use {{no_use_public_ips}}.
> This is confusing for users, and the pipeline sometimes behaves in an unexpected way.
> The two methods should behave consistently, or at least warn about options that are ignored.
> BEAM-9093 dealt with a similar issue for {{PipelineOptions()}}. As in that issue, I think adding a warning in {{PipelineOptions.from_dictionary()}} for ignored options would help reduce the confusion if the two methods cannot be made to behave exactly the same.
>  
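The mismatch described above can be reproduced with a small stand-alone model. This is only a sketch based on the description in this issue, not Beam's actual code: {{build_parser}}, {{from_kwargs}}, and the module-level {{from_dictionary}} are hypothetical helpers that use plain {{argparse}}, on the assumption that both flags write to a single {{use_public_ips}} dest.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for an options class: two flags share one dest.
    parser = argparse.ArgumentParser()
    parser.add_argument('--use_public_ips', dest='use_public_ips',
                        default=None, action='store_true')
    parser.add_argument('--no_use_public_ips', dest='use_public_ips',
                        default=None, action='store_false')
    return parser


def from_kwargs(**params):
    # Models the constructor path: only keys that match an argparse dest
    # are kept; anything else is silently dropped.
    parsed = vars(build_parser().parse_args([]))
    for key, value in params.items():
        if key in parsed:
            parsed[key] = value
    return parsed


def from_dictionary(params):
    # Models the from_dictionary path: a True boolean becomes a bare flag,
    # a False boolean is skipped entirely.
    flags = []
    for key, value in params.items():
        if isinstance(value, bool):
            if value:
                flags.append('--%s' % key)
        else:
            flags.append('--%s=%s' % (key, value))
    return vars(build_parser().parse_args(flags))


# None means the option was never set, so the default applies.
print(from_kwargs(no_use_public_ips=True))            # key dropped
print(from_kwargs(use_public_ips=False))              # key honored
print(from_dictionary({'use_public_ips': False}))     # value skipped
print(from_dictionary({'no_use_public_ips': True}))   # flag honored
```

In this model each path honors exactly the spelling that the other path ignores, which is why a dictionary written for one method does not carry over to the other.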



--
This message was sent by Atlassian Jira
(v8.20.1#820001)