Posted to commits@beam.apache.org by "Frank Yellin (JIRA)" <ji...@apache.org> on 2016/08/06 00:06:20 UTC
[jira] [Created] (BEAM-536) Aggregator.py. More misleading documentation. More bad documentation
Frank Yellin created BEAM-536:
---------------------------------
Summary: Aggregator.py. More misleading documentation. More bad documentation
Key: BEAM-536
URL: https://issues.apache.org/jira/browse/BEAM-536
Project: Beam
Issue Type: Bug
Reporter: Frank Yellin
Priority: Minor
The last paragraph of the documentation for Aggregator is:
"You can also query the combined value(s) of an aggregator by calling
aggregated_value() or aggregated_values() on the result object returned after
running a pipeline."
There are multiple problems in this one sentence!
#1) There is no such method aggregated_value() that I can find anywhere.
#2) DirectRunner implements aggregated_values(), but DirectPipelineRunner does not. The latter is the far more interesting case.
#3) When I use a BlockingDirectPipelineRunner and ask for its aggregated_values(), I get an error message indicating that this is not implemented in DirectPipelineRunner. Very confusing since I never asked for a DirectPipelineRunner.
It is clear that this is because BlockingDirectPipelineRunner is a method rather than a class. Is this really the right thing? Will there be other confusing error messages?
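A minimal sketch of how this kind of message can arise. The names SketchPipelineRunner and BlockingSketchPipelineRunner are hypothetical stand-ins, not the actual Beam code; the sketch only assumes that the "blocking" runner is a factory function returning an instance of another class.

```python
# Hypothetical stand-ins; these are NOT the real Beam classes.
class SketchPipelineRunner(object):
    """Stand-in for a runner class that lacks aggregated_values()."""

    def __init__(self, blocking=False):
        self.blocking = blocking

    def aggregated_values(self, aggregator_name):
        # The message is built from the class name, so it can only ever
        # mention the class, never the factory the caller actually used.
        raise NotImplementedError(
            'aggregated_values is not implemented in %s'
            % type(self).__name__)


def BlockingSketchPipelineRunner():
    """Factory function that looks like a class at the call site."""
    return SketchPipelineRunner(blocking=True)


def describe_failure():
    """Call the unimplemented method and return the error text."""
    runner = BlockingSketchPipelineRunner()
    try:
        runner.aggregated_values('my_counter')  # hypothetical aggregator name
    except NotImplementedError as e:
        return str(e)
    return None


message = describe_failure()
print(message)
```

If the blocking variant were a subclass instead, type(self).__name__ would report the name the caller actually used.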
#4) The documentation for aggregated_values() says "returns a dict of step names to values of the aggregator." I have no idea what a "step" means in this context. In practice, it seems to be a single-element dictionary whose key is 'user--' prefixed onto the aggregator name.