Posted to issues@beam.apache.org by "David Song (Jira)" <ji...@apache.org> on 2019/11/23 07:21:00 UTC

[jira] [Commented] (BEAM-8814) --no_auth flag is boolean type and is misleading

    [ https://issues.apache.org/jira/browse/BEAM-8814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980705#comment-16980705 ] 

David Song commented on BEAM-8814:
----------------------------------

More details on sdk_worker_main: I found this bug mainly because I could not pass the no_auth flag when running the worker. sdk_worker_main serializes the flag value and then, somewhere down the line, reparses it. Doing this with a type=bool flag and using from_dictionary to deserialize raises an error, and the program crashes.
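
A minimal sketch of that serialize-then-reparse round trip, using plain argparse rather than Beam's own classes. The helper names dict_to_flags and build_parser are hypothetical and exist only for this illustration; the assumption (taken from the issue description) is that True booleans are re-emitted as bare flags with no value. The exact error text may differ from the traceback below.

```python
import argparse


def dict_to_flags(options):
    """Hypothetical helper: turn a dict of options back into CLI flags.

    Assumption: True booleans become bare flags (e.g. --no_auth),
    False booleans are dropped, everything else becomes --key=value.
    """
    flags = []
    for key, value in options.items():
        if isinstance(value, bool):
            if value:
                flags.append("--%s" % key)
        else:
            flags.append("--%s=%s" % (key, value))
    return flags


def build_parser(no_auth_kwargs):
    """Hypothetical parser with a --no_auth flag defined two different ways."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--runner", default=None)
    parser.add_argument("--no_auth", default=False, **no_auth_kwargs)
    return parser


flags = dict_to_flags({"runner": "DirectRunner", "no_auth": True})

# type=bool requires a value, so the bare --no_auth makes argparse
# print a usage error and call sys.exit(2).
try:
    build_parser({"type": bool}).parse_args(flags)
    exit_code = None
except SystemExit as exc:
    exit_code = exc.code

# action="store_true" accepts the bare flag, so the round trip succeeds.
round_tripped = build_parser({"action": "store_true"}).parse_args(flags)

print(flags, exit_code, round_tripped.no_auth)
```

With the type=bool definition the reparse exits, while the store_true definition survives the same flags.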

I created a simple test to mimic this behavior at [https://github.com/wintermelons/beam/pull/1]

Output when running the test:
{code:java}
$ python pipeline_options_bool_test.py
usage: pipeline_options_bool_test.py [-h] [--runner RUNNER] [--streaming]
[some outputs omitted]
pipeline_options_bool_test.py: error: argument --no_auth: ignored explicit argument 'False'
E
======================================================================
ERROR: test_serialize_deserialize (__main__.PipelineOptionsTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.6/argparse.py", line 1775, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/usr/lib/python3.6/argparse.py", line 1981, in _parse_known_args
    start_index = consume_optional(start_index)
  File "/usr/lib/python3.6/argparse.py", line 1903, in consume_optional
    raise ArgumentError(action, msg % explicit_arg)
argparse.ArgumentError: argument --no_auth: ignored explicit argument 'False'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "options/pipeline_options_bool_test.py", line 48, in test_serialize_deserialize
    all_options_dict = options.get_all_options()
  File "/home/wintermelons/dev/beam/sdks/python/apache_beam/options/pipeline_options.py", line 266, in get_all_options
    known_args, unknown_args = parser.parse_known_args(self._flags)
  File "/usr/lib/python3.6/argparse.py", line 1782, in parse_known_args
    self.error(str(err))
  File "/home/wintermelons/dev/beam/sdks/python/apache_beam/options/pipeline_options.py", line 123, in error
    super(_BeamArgumentParser, self).error(message)
  File "/usr/lib/python3.6/argparse.py", line 2402, in error
    self.exit(2, _('%(prog)s: error: %(message)s\n') % args)
  File "/usr/lib/python3.6/argparse.py", line 2389, in exit
    _sys.exit(status)
SystemExit: 2{code}

> --no_auth flag is boolean type and is misleading
> ------------------------------------------------
>
>                 Key: BEAM-8814
>                 URL: https://issues.apache.org/jira/browse/BEAM-8814
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-harness
>    Affects Versions: 2.14.0, 2.15.0, 2.16.0
>         Environment: Python2, Python3
>            Reporter: David Song
>            Priority: Blocker
>             Fix For: 2.14.0, 2.15.0, 2.16.0
>
>   Original Estimate: 168h
>          Time Spent: 20m
>  Remaining Estimate: 167h 40m
>
> Pipeline options defines a [no_auth|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/options/pipeline_options.py#L468]] flag that is type=bool. This type is known to be ambiguous because it expects a value, but any non-empty value passed to it is considered True. For example, passing in "--no_auth=False" would still evaluate to True. We should instead use action="store_true", which only detects whether the flag is passed or not.
> Furthermore, [PipelineOptions.from_dictionary|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/options/pipeline_options.py#L229]] will assume that boolean flags are passed in without values (e.g. passing --no_auth, instead of --no_auth=True). This, combined with type=bool failing without a value, will ensure that it always fails. 
> sdk_worker_main is the only place that uses from_dictionary (aside from tests), and it will crash if the no_auth flag is passed. Looking at pipeline_options_test, tests that call [from_dictionary|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/options/pipeline_options_test.py#L218]] feed in the output of get_all_options, which suggests it was intended to be used only for serializing/deserializing flag options.
> So from here, to support the no_auth flag:
>  * we change no_auth so that it is action="store_true", or
>  * we change sdk_worker_main so that it does not use from_dictionary
> Or both.
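
The difference between type=bool and action="store_true" described in the issue can be checked with plain argparse. This is only a sketch with two standalone parsers, not Beam's actual option definitions; the parser variable names are made up for the example.

```python
import argparse

# type=bool: argparse calls bool() on the raw string, so any non-empty
# string, including "False", becomes True.
typed = argparse.ArgumentParser()
typed.add_argument("--no_auth", type=bool, default=False)
typed_value = typed.parse_args(["--no_auth=False"]).no_auth

# action="store_true": the flag takes no value; its presence alone sets True.
flagged = argparse.ArgumentParser()
flagged.add_argument("--no_auth", action="store_true", default=False)
flag_present = flagged.parse_args(["--no_auth"]).no_auth
flag_absent = flagged.parse_args([]).no_auth

print(typed_value, flag_present, flag_absent)  # True True False
```

With store_true there is no value to misread, so the flag is True only when it is actually passed.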


