Posted to dev@beam.apache.org by Kenneth Knowles <ke...@apache.org> on 2019/10/03 16:25:42 UTC

Re: [DISCUSS] Beam SQL filter push-down

** Bumping this thread especially if you are an IO author **

Really glad you are working on this. The basic idea in your doc seems good.

It seems that mostly Beam SQL contributors have commented on it so far.
Many more people may be interested in this and have valuable feedback,
such as authors of IO connectors. Your examples of the great difference
between BigQuery and MongoDB made me think of this. Right now very few IO
connectors have SQL adapters. But at some point almost all of them should
have SQL adapters and should do their best to support pushdown. This may
require changes to the pure Java connector to unlock some capabilities in
the underlying storage system.

Kenn

On Mon, Sep 30, 2019 at 11:04 AM Kirill Kozlov <ki...@google.com>
wrote:

> The objective is to create a universal way for Beam SQL IO APIs to support
> filter/project push-down.
> The proposed way to achieve that is to introduce an interface
> responsible for identifying what portion(s) of a Calc can be moved down to
> the IO layer, and to add the following methods to the BeamSqlTable
> interface to pass the necessary parameters to IO APIs:
> - BeamSqlTableFilter supportsFilter(RexNode program, RexNode filter)
> - Boolean supportsProjects()
> - PCollection<Row> buildIOReader(PBegin begin, BeamSqlTableFilter
> filters, List<String> fieldNames)
>
> Please feel free to provide feedback and suggestions on this proposal.
> Thank you!
>
> Here is a more complete design doc:
> https://docs.google.com/document/d/1-ysD7U7qF3MAmSfkbXZO_5PLJBevAL9bktlLCerd_jE/edit?usp=sharing
>
> --
> Kirill Kozlov
>
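
For IO authors skimming the thread, here is a minimal sketch of the idea
behind supportsFilter: split a filter into the part the IO can evaluate
itself and the part that has to stay in the Calc. It is not Beam code and
not the API from the design doc. PushDownSketch, FilterSplit and split are
hypothetical names, and plain strings stand in for the RexNode conjuncts
that a real implementation would inspect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these types exist in Beam.
class PushDownSketch {

    /** Result of splitting a filter, loosely analogous to the proposed BeamSqlTableFilter. */
    public static final class FilterSplit {
        public final List<String> supported;   // conjuncts handed to the IO
        public final List<String> unsupported; // conjuncts that remain in the Calc

        FilterSplit(List<String> supported, List<String> unsupported) {
            this.supported = supported;
            this.unsupported = unsupported;
        }
    }

    /**
     * Splits the AND-ed conjuncts of a filter. Strings stand in for RexNode;
     * ioCanEvaluate is an assumed per-connector capability check.
     */
    public static FilterSplit split(List<String> conjuncts, Predicate<String> ioCanEvaluate) {
        List<String> supported = new ArrayList<>();
        List<String> unsupported = new ArrayList<>();
        for (String conjunct : conjuncts) {
            if (ioCanEvaluate.test(conjunct)) {
                supported.add(conjunct);
            } else {
                unsupported.add(conjunct);
            }
        }
        return new FilterSplit(supported, unsupported);
    }

    public static void main(String[] args) {
        // Toy connector that can only evaluate equality comparisons.
        Predicate<String> equalityOnly = c -> c.contains(" = ");
        FilterSplit result =
            split(Arrays.asList("id = 5", "name LIKE 'a%'"), equalityOnly);
        System.out.println("pushed down to IO: " + result.supported);
        System.out.println("kept in Calc:      " + result.unsupported);
    }
}
```

Different connectors would plug in very different capability checks, which
is why feedback from IO authors matters here: a connector that supports no
predicates simply returns everything as unsupported and the Calc stays as
it is.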