Posted to commits@beam.apache.org by "Kenneth Knowles (JIRA)" <ji...@apache.org> on 2018/07/02 21:57:00 UTC
[jira] [Created] (BEAM-4719) Enhanced LIMIT support
Kenneth Knowles created BEAM-4719:
-------------------------------------
Summary: Enhanced LIMIT support
Key: BEAM-4719
URL: https://issues.apache.org/jira/browse/BEAM-4719
Project: Beam
Issue Type: New Feature
Components: dsl-sql
Reporter: Kenneth Knowles
Currently, Beam SQL supports LIMIT in two ways:
1. Within a query, the results are subject to LIMIT. This works.
2. The shell knows to cancel a pipeline when the limit is reached, even if there is unfinished unbounded data.
Pipeline cancellation relies on a basic pattern match against the query execution plan: it checks a few child nodes of the BeamEnumerableConverter for a BeamSortRel without a collation. If it can determine the limit for the outermost query, it cancels the pipeline once that limit is reached.
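To make the fragility of this pattern match concrete, here is a minimal sketch in plain Java. It does not use the real Beam or Calcite classes: PlanNode, ShallowLimitMatcher, the string node kinds, and the MAX_DEPTH value are all hypothetical stand-ins chosen for illustration. It only shows how a fixed-depth search misses a limit that sits under a few extra layers.

```java
import java.util.OptionalLong;

/** Hypothetical, simplified single-input plan node; not a Beam or Calcite class. */
final class PlanNode {
  final String kind;          // e.g. "Converter", "Sort", "Calc", "Scan"
  final boolean hasCollation; // only meaningful when kind is "Sort"
  final long fetch;           // LIMIT value, or -1 when there is none
  final PlanNode input;       // the single child, or null for a leaf

  PlanNode(String kind, boolean hasCollation, long fetch, PlanNode input) {
    this.kind = kind;
    this.hasCollation = hasCollation;
    this.fetch = fetch;
    this.input = input;
  }
}

/** Looks only a fixed number of levels below the root for a collation-free sort. */
final class ShallowLimitMatcher {
  // Assumed depth for this sketch; the issue only says "a few child nodes".
  static final int MAX_DEPTH = 2;

  static OptionalLong findLimit(PlanNode root) {
    PlanNode node = root.input;
    for (int depth = 0; node != null && depth < MAX_DEPTH; depth++) {
      if (node.kind.equals("Sort") && !node.hasCollation && node.fetch >= 0) {
        return OptionalLong.of(node.fetch);
      }
      node = node.input;
    }
    // Nothing recognized within MAX_DEPTH levels: the limit is treated as unknown.
    return OptionalLong.empty();
  }
}
```

With this sketch, a plan shaped like Converter, Calc, Sort(fetch=10), Scan yields a limit of 10, while the same Sort placed under three Calc nodes is not found at all.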
A more robust approach might be to use traits (or some other thorough analysis) to see if there is a known size for the outermost query. This would, for example, be unaffected by any number of layers of non-size-changing transformations.
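One way to picture the "known size" idea is a bottom-up analysis that carries a row count through every node that cannot change it. The sketch below is plain Java over a hypothetical plan model: SizedNode, KnownSize, and the four node kinds are invented for illustration and are not Calcite traits or Beam classes. It also assumes that an unbounded scan always supplies enough rows to satisfy a limit.

```java
import java.util.OptionalLong;

/** Hypothetical single-input plan node; not a Beam or Calcite class. */
final class SizedNode {
  enum Kind { SCAN, LIMIT, PROJECT, FILTER }

  final Kind kind;
  final long fetch;      // only meaningful for LIMIT
  final SizedNode input; // null for SCAN

  private SizedNode(Kind kind, long fetch, SizedNode input) {
    this.kind = kind;
    this.fetch = fetch;
    this.input = input;
  }

  static SizedNode scan() { return new SizedNode(Kind.SCAN, -1, null); }
  static SizedNode limit(long fetch, SizedNode in) { return new SizedNode(Kind.LIMIT, fetch, in); }
  static SizedNode project(SizedNode in) { return new SizedNode(Kind.PROJECT, -1, in); }
  static SizedNode filter(SizedNode in) { return new SizedNode(Kind.FILTER, -1, in); }
}

/** Derives the exact output row count of a plan, when it can be known. */
final class KnownSize {
  static OptionalLong of(SizedNode node) {
    switch (node.kind) {
      case LIMIT: {
        // Assumption: an input of unknown size has at least `fetch` rows.
        OptionalLong in = of(node.input);
        return OptionalLong.of(in.isPresent() ? Math.min(in.getAsLong(), node.fetch) : node.fetch);
      }
      case PROJECT:
        // Non-size-changing: the input's known size passes through unchanged.
        return of(node.input);
      case FILTER:
        // May drop rows, so the exact size is no longer known.
        return OptionalLong.empty();
      case SCAN:
        // Treated as unbounded in this sketch.
        return OptionalLong.empty();
      default:
        throw new IllegalStateException("unexpected kind: " + node.kind);
    }
  }
}
```

Here, any number of project layers stacked on limit(5, scan()) still reports a known size of 5, whereas a filter above the limit makes the size unknown, so a shell relying on this analysis would not wait for rows that might never arrive.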
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)