Posted to dev@impala.apache.org by Sailesh Mukil <sa...@cloudera.com> on 2017/03/27 18:42:19 UTC

Min/Max runtime filtering on Impala-Kudu

I will be working on a patch to add min/max filter support in Impala and,
as a first step, will specifically target the KuduScanNode, since the Kudu
client is already able to accept a Min and a Max that it will use internally
to filter during its scans. Below is a brief design proposal.

*Goal:*

To leverage runtime min/max filter support in Kudu to potentially speed up
queries over Kudu tables. Kudu does this by taking a min and a max that
Impala provides and returning only the values in the range Impala is
interested in.

*[min <= range we're interested in <= max]*
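
For concreteness, a minimal sketch (not part of the proposal itself) of how
such a range could be handed to the Kudu C++ client as two comparison
predicates, assuming the client's NewComparisonPredicate /
AddConjunctPredicate API is the hook and using an illustrative INT64 join
column named "id":

    #include <cstdint>

    #include <kudu/client/client.h>
    #include <kudu/client/value.h>

    using kudu::Status;
    using kudu::client::KuduPredicate;
    using kudu::client::KuduScanner;
    using kudu::client::KuduTable;
    using kudu::client::KuduValue;

    // Restrict a Kudu scan to [min_val, max_val] on the "id" column.
    // Column name and values are illustrative; error handling is minimal.
    Status ApplyMinMaxRange(KuduTable* table, KuduScanner* scanner,
                            int64_t min_val, int64_t max_val) {
      // min_val <= id  (the scanner takes ownership of the predicate)
      Status s = scanner->AddConjunctPredicate(table->NewComparisonPredicate(
          "id", KuduPredicate::GREATER_EQUAL, KuduValue::FromInt(min_val)));
      if (!s.ok()) return s;
      // id <= max_val
      return scanner->AddConjunctPredicate(table->NewComparisonPredicate(
          "id", KuduPredicate::LESS_EQUAL, KuduValue::FromInt(max_val)));
    }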

*Proposal:*


   - As a first step, plumb the runtime filter code from
   *exec/hdfs-scan-node-base.cc/h* to *exec/scan-node.cc/h*, so that it can
   be applied to *KuduScanNode* cleanly as well, since *KuduScanNode* and
   *HdfsScanNodeBase* both inherit from *ScanNode.*
   - Reuse the *ColumnStats* class (exec/parquet-column-stats.h) or
   implement a lighter-weight version of it to process and store the Min and
   the Max on the build side of the join (a rough sketch of such a filter
   follows this list).
   - Once the Min and Max values are added to the existing runtime filter
   structures, as a first step, we will ignore the Min and Max values for
   non-Kudu tables. Using them for non-Kudu tables can come in a follow-up
   patch (or patches).
   - Similarly, the bloom filter will be ignored for Kudu tables, and only
   the Min and Max values will be used, since Kudu does not accept bloom
   filters yet. (https://issues.apache.org/jira/browse/IMPALA-3741)
   - Applying the bloom filter on the Impala side of the Kudu scan (i.e. in
   KuduScanNode) is not in the scope of this patch.
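
As referenced in the ColumnStats bullet above, a rough sketch of what a
lighter-weight, type-specific filter could look like on the build side (the
class and method names are illustrative, not existing Impala code):

    #include <limits>

    // Minimal min/max filter the join's build side could update once per
    // non-NULL build-side join key. Illustrative only.
    template <typename T>
    class MinMaxFilter {
     public:
      MinMaxFilter()
        : min_(std::numeric_limits<T>::max()),
          max_(std::numeric_limits<T>::lowest()),
          has_values_(false) {}

      void Insert(T value) {
        if (value < min_) min_ = value;
        if (value > max_) max_ = value;
        has_values_ = true;
      }

      bool has_values() const { return has_values_; }
      T min() const { return min_; }
      T max() const { return max_; }

     private:
      T min_;
      T max_;
      bool has_values_;
    };

An instantiation such as MinMaxFilter<int64_t> would travel through the
existing runtime filter structures to the KuduScanNode, which could then turn
it into the two range predicates sketched under *Goal:* above.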


*Complications:*

   - We have to make sure that finding the Min and Max values on the build
   side doesn't regress certain workloads, since the difference between
   generating a bloom filter and generating a Min and a Max is that a bloom
   filter can be type-agnostic (we just take a raw hash over the data) whereas
   a Min and a Max have to be type-specific (a short sketch of the two paths
   follows).
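
A short sketch of those two paths, assuming the bloom filter hashes the raw
key bytes while the min/max filter needs a comparison instantiated per column
type (std::hash stands in for the real hash function; names are illustrative):

    #include <cstddef>
    #include <cstdint>
    #include <functional>
    #include <string_view>

    // Type-agnostic path (bloom filter): one hash over the raw key bytes,
    // independent of the column type.
    uint64_t HashRawKey(const void* key, size_t len) {
      return std::hash<std::string_view>{}(
          std::string_view(static_cast<const char*>(key), len));
    }

    // Type-specific path (min/max): the comparison has to be specialized
    // per column type, which is the extra build-side cost to watch for.
    template <typename T>
    void UpdateMinMax(T value, T* min_val, T* max_val) {
      if (value < *min_val) *min_val = value;
      if (value > *max_val) *max_val = value;
    }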

Re: Min/Max runtime filtering on Impala-Kudu

Posted by Sailesh Mukil <sa...@cloudera.com>.
On Tue, Mar 28, 2017 at 4:41 PM, Todd Lipcon <to...@cloudera.com> wrote:

> On Tue, Mar 28, 2017 at 4:39 PM, Sailesh Mukil <sa...@cloudera.com>
> wrote:
>
> > On Tue, Mar 28, 2017 at 3:33 AM, Lars Volker <lv...@cloudera.com> wrote:
> >
> > > Thanks for sending this around, Sailesh.
> > >
> > > parquet-column-stats.h is somewhat tied to parquet statistics and will
> > soon
> > > need more parquet-specific fields (null_count and distinct_count). It
> > will
> > > also need extension to copy strings and handle their memory when
> tracking
> > > min/max values for variable length data across row batches. I'm not
> sure
> > > how these changes will affect its suitability for your purposes, but I
> > > thought I'd give a quick heads up.
> > >
> > Thanks, Lars. In that case, it looks like the right path to take is to
> > implement a simpler version of this for this patch. Thanks for weighing
> > in.
> >
> > On Mon, Mar 27, 2017 at 11:35 PM, Todd Lipcon <to...@cloudera.com> wrote:
> > >
> > > > Sounds reasonable to me as well.
> > > >
> > > > The only thing I'd add is that, if it's easy to design the code to be
> > > > extended to pushing small 'IN (...)' predicate for low-cardinality
> > > filters,
> > > > that would be great. eg if the filter can start as an IN(...) and
> then
> > if
> > > > it exceeds 32K (or whatever arbitrary threshold), "collapse" it to
> the
> > > > min/max range predicate?
> > > >
> > > > This should have a big advantage for partition pruning in
> > low-cardinality
> > > > joins against hash-partitioned tables.
> > > >
> > > > -Todd
> > > >
> > >
> >
> > Thanks, Todd. You're right, the IN predicate filters would be a great
> > addition. We can add that as a follow on patch, just to keep this patch
> > simple for now.
> >
>
> Sure, just wanted to suggest that whatever path's taken now, implementation
> wise, is reasonably easy to extend to the IN() case without too much
> backtracking.
>

Yes, once we refactor the filter distribution mechanism to accommodate the
KuduScanNode, extending it to support the IN() filters shouldn't be too
complicated.


> -Todd
>
>
> >
> > >
> > > >
> > > > On Mon, Mar 27, 2017 at 2:29 PM, Matthew Jacobs <mj...@cloudera.com>
> > wrote:
> > > >
> > > > > Thanks for writing this up, Sailesh. It sounds reasonable.
> > > > >
> > > > > On Mon, Mar 27, 2017 at 2:24 PM, Sailesh Mukil <
> sailesh@cloudera.com
> > >
> > > > > wrote:
> > > > > > On Mon, Mar 27, 2017 at 11:49 AM, Marcel Kornacker <
> > > > marcel@cloudera.com>
> > > > > > wrote:
> > > > > >
> > > > > >> On Mon, Mar 27, 2017 at 11:42 AM, Sailesh Mukil <
> > > sailesh@cloudera.com
> > > > >
> > > > > >> wrote:
> > > > > >> > I will be working on a patch to add min/max filter support in
> > > > Impala,
> > > > > and
> > > > > >> > as a first step, specifically target the KuduScanNode, since
> the
> > > > Kudu
> > > > > >> > client is already able to accept a Min and a Max that it would
> > > > > internally
> > > > > >> > use to filter during its scans. Below is a brief design
> > proposal.
> > > > > >> >
> > > > > >> > *Goal:*
> > > > > >> >
> > > > > >> > To leverage runtime min/max filter support in Kudu for the
> > > potential
> > > > > >> speed
> > > > > >> > up of queries over Kudu tables. Kudu does this by taking a min
> > > and a
> > > > > max
> > > > > >> > that Impala will provide and only return values in the range
> > > Impala
> > > > is
> > > > > >> > interested in.
> > > > > >> >
> > > > > >> > *[min <= range we're interested in <= max]*
> > > > > >> >
> > > > > >> > *Proposal:*
> > > > > >> >
> > > > > >> >
> > > > > >> >    - As a first step, plumb the runtime filter code from
> > > > > >> > *exec/hdfs-scan-node-base.cc/h
> > > > > >> >    <http://hdfs-scan-node-base.cc/h>* to *exec/scan-node.cc/h
> > > > > >> >    <http://scan-node.cc/h>*, so that it can be applied to
> > > > > *KuduScanNode*
> > > > > >> >    cleanly as well, since *KuduScanNode* and
> *HdfsScanNodeBase*
> > > both
> > > > > >> >    inherit from *ScanNode.*
> > > > > >>
> > > > > >> Quick comment: please make sure your solution also applies to
> > > > > >> KuduScanNodeMt.
> > > > > >>
> > > > > >
> > > > > > Thanks for the input, I'll make sure to do that.
> > > > > >
> > > > > >
> > > > > >>
> > > > > >> >    - Reuse the *ColumnStats* class
> (exec/parquet-column-stats.h)
> > > or
> > > > > >> >    implement a lighter weight version of it to process and
> store
> > > the
> > > > > Min
> > > > > >> and
> > > > > >> >    the Max on the build side of the join.
> > > > > >> >    - Once the Min and Max values are added to the existing
> > runtime
> > > > > filter
> > > > > >> >    structures, as a first step, we will ignore the Min and Max
> > > > values
> > > > > for
> > > > > >> >    non-Kudu tables. Using them for non-Kudu tables can come in
> > as
> > > a
> > > > > >> following
> > > > > >> >    patch(es).
> > > > > >> >    - Similarly, the bloom filter will be ignored for Kudu
> > tables,
> > > > and
> > > > > >> only
> > > > > >> >    the Min and Max values will be used, since Kudu does not
> > accept
> > > > > bloom
> > > > > >> >    filters yet. (https://issues.apache.org/
> > > jira/browse/IMPALA-3741)
> > > > > >> >    - Applying the bloom filter on the Impala side of the Kudu
> > scan
> > > > > (i.e.
> > > > > >> in
> > > > > >> >    KuduScanNode) is not in the scope of this patch.
> > > > > >> >
> > > > > >> >
> > > > > >> > *Complications:*
> > > > > >> >
> > > > > >> >    - We have to make sure that finding the Min and Max values
> on
> > > the
> > > > > >> build
> > > > > >> >    side doesn't regress certain workloads, since the
> difference
> > > > > between
> > > > > >> >    generating a bloom filter and generating a Min and a Max,
> is
> > > > that a
> > > > > >> bloom
> > > > > >> >    filter can be type agnostic (we just take a raw hash over
> the
> > > > data)
> > > > > >> whereas
> > > > > >> >    a Min and a Max have to be type specific.
> > > > > >>
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Todd Lipcon
> > > > Software Engineer, Cloudera
> > > >
> > >
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

Re: Min/Max runtime filtering on Impala-Kudu

Posted by Todd Lipcon <to...@cloudera.com>.
On Tue, Mar 28, 2017 at 4:39 PM, Sailesh Mukil <sa...@cloudera.com> wrote:

> On Tue, Mar 28, 2017 at 3:33 AM, Lars Volker <lv...@cloudera.com> wrote:
>
> > Thanks for sending this around, Sailesh.
> >
> > parquet-column-stats.h is somewhat tied to parquet statistics and will
> soon
> > need more parquet-specific fields (null_count and distinct_count). It
> will
> > also need extension to copy strings and handle their memory when tracking
> > min/max values for variable length data across row batches. I'm not sure
> > how these changes will affect its suitability for your purposes, but I
> > thought I'd give a quick heads up.
> >
> Thanks, Lars. In that case, it looks like the right path to take is to
> implement a simpler version of this for this patch. Thanks for weighing in.
>
> On Mon, Mar 27, 2017 at 11:35 PM, Todd Lipcon <to...@cloudera.com> wrote:
> >
> > > Sounds reasonable to me as well.
> > >
> > > The only thing I'd add is that, if it's easy to design the code to be
> > > extended to pushing small 'IN (...)' predicate for low-cardinality
> > filters,
> > > that would be great. eg if the filter can start as an IN(...) and then
> if
> > > it exceeds 32K (or whatever arbitrary threshold), "collapse" it to the
> > > min/max range predicate?
> > >
> > > This should have a big advantage for partition pruning in
> low-cardinality
> > > joins against hash-partitioned tables.
> > >
> > > -Todd
> > >
> >
>
> Thanks, Todd. You're right, the IN predicate filters would be a great
> addition. We can add that as a follow on patch, just to keep this patch
> simple for now.
>

Sure, just wanted to suggest that whatever path's taken now,
implementation-wise, is reasonably easy to extend to the IN() case without
too much backtracking.

-Todd


>
> >
> > >
> > > On Mon, Mar 27, 2017 at 2:29 PM, Matthew Jacobs <mj...@cloudera.com>
> wrote:
> > >
> > > > Thanks for writing this up, Sailesh. It sounds reasonable.
> > > >
> > > > On Mon, Mar 27, 2017 at 2:24 PM, Sailesh Mukil <sailesh@cloudera.com
> >
> > > > wrote:
> > > > > On Mon, Mar 27, 2017 at 11:49 AM, Marcel Kornacker <
> > > marcel@cloudera.com>
> > > > > wrote:
> > > > >
> > > > >> On Mon, Mar 27, 2017 at 11:42 AM, Sailesh Mukil <
> > sailesh@cloudera.com
> > > >
> > > > >> wrote:
> > > > >> > I will be working on a patch to add min/max filter support in
> > > Impala,
> > > > and
> > > > >> > as a first step, specifically target the KuduScanNode, since the
> > > Kudu
> > > > >> > client is already able to accept a Min and a Max that it would
> > > > internally
> > > > >> > use to filter during its scans. Below is a brief design
> proposal.
> > > > >> >
> > > > >> > *Goal:*
> > > > >> >
> > > > >> > To leverage runtime min/max filter support in Kudu for the
> > potential
> > > > >> speed
> > > > >> > up of queries over Kudu tables. Kudu does this by taking a min
> > and a
> > > > max
> > > > >> > that Impala will provide and only return values in the range
> > Impala
> > > is
> > > > >> > interested in.
> > > > >> >
> > > > >> > *[min <= range we're interested in <= max]*
> > > > >> >
> > > > >> > *Proposal:*
> > > > >> >
> > > > >> >
> > > > >> >    - As a first step, plumb the runtime filter code from
> > > > >> > *exec/hdfs-scan-node-base.cc/h
> > > > >> >    <http://hdfs-scan-node-base.cc/h>* to *exec/scan-node.cc/h
> > > > >> >    <http://scan-node.cc/h>*, so that it can be applied to
> > > > *KuduScanNode*
> > > > >> >    cleanly as well, since *KuduScanNode* and *HdfsScanNodeBase*
> > both
> > > > >> >    inherit from *ScanNode.*
> > > > >>
> > > > >> Quick comment: please make sure your solution also applies to
> > > > >> KuduScanNodeMt.
> > > > >>
> > > > >
> > > > > Thanks for the input, I'll make sure to do that.
> > > > >
> > > > >
> > > > >>
> > > > >> >    - Reuse the *ColumnStats* class (exec/parquet-column-stats.h)
> > or
> > > > >> >    implement a lighter weight version of it to process and store
> > the
> > > > Min
> > > > >> and
> > > > >> >    the Max on the build side of the join.
> > > > >> >    - Once the Min and Max values are added to the existing
> runtime
> > > > filter
> > > > >> >    structures, as a first step, we will ignore the Min and Max
> > > values
> > > > for
> > > > >> >    non-Kudu tables. Using them for non-Kudu tables can come in
> as
> > a
> > > > >> following
> > > > >> >    patch(es).
> > > > >> >    - Similarly, the bloom filter will be ignored for Kudu
> tables,
> > > and
> > > > >> only
> > > > >> >    the Min and Max values will be used, since Kudu does not
> accept
> > > > bloom
> > > > >> >    filters yet. (https://issues.apache.org/
> > jira/browse/IMPALA-3741)
> > > > >> >    - Applying the bloom filter on the Impala side of the Kudu
> scan
> > > > (i.e.
> > > > >> in
> > > > >> >    KuduScanNode) is not in the scope of this patch.
> > > > >> >
> > > > >> >
> > > > >> > *Complications:*
> > > > >> >
> > > > >> >    - We have to make sure that finding the Min and Max values on
> > the
> > > > >> build
> > > > >> >    side doesn't regress certain workloads, since the difference
> > > > between
> > > > >> >    generating a bloom filter and generating a Min and a Max, is
> > > that a
> > > > >> bloom
> > > > >> >    filter can be type agnostic (we just take a raw hash over the
> > > data)
> > > > >> whereas
> > > > >> >    a Min and a Max have to be type specific.
> > > > >>
> > > >
> > >
> > >
> > >
> > > --
> > > Todd Lipcon
> > > Software Engineer, Cloudera
> > >
> >
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Min/Max runtime filtering on Impala-Kudu

Posted by Sailesh Mukil <sa...@cloudera.com>.
On Tue, Mar 28, 2017 at 3:33 AM, Lars Volker <lv...@cloudera.com> wrote:

> Thanks for sending this around, Sailesh.
>
> parquet-column-stats.h is somewhat tied to parquet statistics and will soon
> need more parquet-specific fields (null_count and distinct_count). It will
> also need extension to copy strings and handle their memory when tracking
> min/max values for variable length data across row batches. I'm not sure
> how these changes will affect its suitability for your purposes, but I
> thought I'd give a quick heads up.
>
Thanks, Lars. In that case, it looks like the right path to take is to
implement a simpler version of this for this patch. Thanks for weighing in.

On Mon, Mar 27, 2017 at 11:35 PM, Todd Lipcon <to...@cloudera.com> wrote:
>
> > Sounds reasonable to me as well.
> >
> > The only thing I'd add is that, if it's easy to design the code to be
> > extended to pushing small 'IN (...)' predicate for low-cardinality
> filters,
> > that would be great. eg if the filter can start as an IN(...) and then if
> > it exceeds 32K (or whatever arbitrary threshold), "collapse" it to the
> > min/max range predicate?
> >
> > This should have a big advantage for partition pruning in low-cardinality
> > joins against hash-partitioned tables.
> >
> > -Todd
> >
>

Thanks, Todd. You're right, the IN predicate filters would be a great
addition. We can add that as a follow-on patch, just to keep this patch
simple for now.

>
> >
> > On Mon, Mar 27, 2017 at 2:29 PM, Matthew Jacobs <mj...@cloudera.com> wrote:
> >
> > > Thanks for writing this up, Sailesh. It sounds reasonable.
> > >
> > > On Mon, Mar 27, 2017 at 2:24 PM, Sailesh Mukil <sa...@cloudera.com>
> > > wrote:
> > > > On Mon, Mar 27, 2017 at 11:49 AM, Marcel Kornacker <
> > marcel@cloudera.com>
> > > > wrote:
> > > >
> > > >> On Mon, Mar 27, 2017 at 11:42 AM, Sailesh Mukil <
> sailesh@cloudera.com
> > >
> > > >> wrote:
> > > >> > I will be working on a patch to add min/max filter support in
> > Impala,
> > > and
> > > >> > as a first step, specifically target the KuduScanNode, since the
> > Kudu
> > > >> > client is already able to accept a Min and a Max that it would
> > > internally
> > > >> > use to filter during its scans. Below is a brief design proposal.
> > > >> >
> > > >> > *Goal:*
> > > >> >
> > > >> > To leverage runtime min/max filter support in Kudu for the
> potential
> > > >> speed
> > > >> > up of queries over Kudu tables. Kudu does this by taking a min
> and a
> > > max
> > > >> > that Impala will provide and only return values in the range
> Impala
> > is
> > > >> > interested in.
> > > >> >
> > > >> > *[min <= range we're interested in <= max]*
> > > >> >
> > > >> > *Proposal:*
> > > >> >
> > > >> >
> > > >> >    - As a first step, plumb the runtime filter code from
> > > >> > *exec/hdfs-scan-node-base.cc/h
> > > >> >    <http://hdfs-scan-node-base.cc/h>* to *exec/scan-node.cc/h
> > > >> >    <http://scan-node.cc/h>*, so that it can be applied to
> > > *KuduScanNode*
> > > >> >    cleanly as well, since *KuduScanNode* and *HdfsScanNodeBase*
> both
> > > >> >    inherit from *ScanNode.*
> > > >>
> > > >> Quick comment: please make sure your solution also applies to
> > > >> KuduScanNodeMt.
> > > >>
> > > >
> > > > Thanks for the input, I'll make sure to do that.
> > > >
> > > >
> > > >>
> > > >> >    - Reuse the *ColumnStats* class (exec/parquet-column-stats.h)
> or
> > > >> >    implement a lighter weight version of it to process and store
> the
> > > Min
> > > >> and
> > > >> >    the Max on the build side of the join.
> > > >> >    - Once the Min and Max values are added to the existing runtime
> > > filter
> > > >> >    structures, as a first step, we will ignore the Min and Max
> > values
> > > for
> > > >> >    non-Kudu tables. Using them for non-Kudu tables can come in as
> a
> > > >> following
> > > >> >    patch(es).
> > > >> >    - Similarly, the bloom filter will be ignored for Kudu tables,
> > and
> > > >> only
> > > >> >    the Min and Max values will be used, since Kudu does not accept
> > > bloom
> > > >> >    filters yet. (https://issues.apache.org/
> jira/browse/IMPALA-3741)
> > > >> >    - Applying the bloom filter on the Impala side of the Kudu scan
> > > (i.e.
> > > >> in
> > > >> >    KuduScanNode) is not in the scope of this patch.
> > > >> >
> > > >> >
> > > >> > *Complications:*
> > > >> >
> > > >> >    - We have to make sure that finding the Min and Max values on
> the
> > > >> build
> > > >> >    side doesn't regress certain workloads, since the difference
> > > between
> > > >> >    generating a bloom filter and generating a Min and a Max, is
> > that a
> > > >> bloom
> > > >> >    filter can be type agnostic (we just take a raw hash over the
> > data)
> > > >> whereas
> > > >> >    a Min and a Max have to be type specific.
> > > >>
> > >
> >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
>

Re: Min/Max runtime filtering on Impala-Kudu

Posted by Lars Volker <lv...@cloudera.com>.
Thanks for sending this around, Sailesh.

parquet-column-stats.h is somewhat tied to parquet statistics and will soon
need more parquet-specific fields (null_count and distinct_count). It will
also need extension to copy strings and handle their memory when tracking
min/max values for variable-length data across row batches. I'm not sure
how these changes will affect its suitability for your purposes, but I
thought I'd give a quick heads up.
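
To make the copying concrete, a small sketch of string min/max tracking where
the tracker owns copies of the current extremes (std::string storage assumed;
this is not the ColumnStats implementation, just an illustration of the
ownership issue):

    #include <string>
    #include <string_view>

    // Min/max tracking for variable-length values. Candidates seen during a
    // row batch point into batch memory, so the current extremes are copied
    // into strings owned by the tracker. A real implementation might defer
    // the copy to once per row batch.
    class StringMinMaxTracker {
     public:
      void Insert(std::string_view value) {
        if (!has_values_ || value < std::string_view(min_)) {
          min_.assign(value.data(), value.size());
        }
        if (!has_values_ || value > std::string_view(max_)) {
          max_.assign(value.data(), value.size());
        }
        has_values_ = true;
      }

      bool has_values() const { return has_values_; }
      const std::string& min() const { return min_; }
      const std::string& max() const { return max_; }

     private:
      std::string min_;
      std::string max_;
      bool has_values_ = false;
    };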

On Mon, Mar 27, 2017 at 11:35 PM, Todd Lipcon <to...@cloudera.com> wrote:

> Sounds reasonable to me as well.
>
> The only thing I'd add is that, if it's easy to design the code to be
> extended to pushing small 'IN (...)' predicate for low-cardinality filters,
> that would be great. eg if the filter can start as an IN(...) and then if
> it exceeds 32K (or whatever arbitrary threshold), "collapse" it to the
> min/max range predicate?
>
> This should have a big advantage for partition pruning in low-cardinality
> joins against hash-partitioned tables.
>
> -Todd
>
>
>
> On Mon, Mar 27, 2017 at 2:29 PM, Matthew Jacobs <mj...@cloudera.com> wrote:
>
> > Thanks for writing this up, Sailesh. It sounds reasonable.
> >
> > On Mon, Mar 27, 2017 at 2:24 PM, Sailesh Mukil <sa...@cloudera.com>
> > wrote:
> > > On Mon, Mar 27, 2017 at 11:49 AM, Marcel Kornacker <
> marcel@cloudera.com>
> > > wrote:
> > >
> > >> On Mon, Mar 27, 2017 at 11:42 AM, Sailesh Mukil <sailesh@cloudera.com
> >
> > >> wrote:
> > >> > I will be working on a patch to add min/max filter support in
> Impala,
> > and
> > >> > as a first step, specifically target the KuduScanNode, since the
> Kudu
> > >> > client is already able to accept a Min and a Max that it would
> > internally
> > >> > use to filter during its scans. Below is a brief design proposal.
> > >> >
> > >> > *Goal:*
> > >> >
> > >> > To leverage runtime min/max filter support in Kudu for the potential
> > >> speed
> > >> > up of queries over Kudu tables. Kudu does this by taking a min and a
> > max
> > >> > that Impala will provide and only return values in the range Impala
> is
> > >> > interested in.
> > >> >
> > >> > *[min <= range we're interested in <= max]*
> > >> >
> > >> > *Proposal:*
> > >> >
> > >> >
> > >> >    - As a first step, plumb the runtime filter code from
> > >> > *exec/hdfs-scan-node-base.cc/h
> > >> >    <http://hdfs-scan-node-base.cc/h>* to *exec/scan-node.cc/h
> > >> >    <http://scan-node.cc/h>*, so that it can be applied to
> > *KuduScanNode*
> > >> >    cleanly as well, since *KuduScanNode* and *HdfsScanNodeBase* both
> > >> >    inherit from *ScanNode.*
> > >>
> > >> Quick comment: please make sure your solution also applies to
> > >> KuduScanNodeMt.
> > >>
> > >
> > > Thanks for the input, I'll make sure to do that.
> > >
> > >
> > >>
> > >> >    - Reuse the *ColumnStats* class (exec/parquet-column-stats.h) or
> > >> >    implement a lighter weight version of it to process and store the
> > Min
> > >> and
> > >> >    the Max on the build side of the join.
> > >> >    - Once the Min and Max values are added to the existing runtime
> > filter
> > >> >    structures, as a first step, we will ignore the Min and Max
> values
> > for
> > >> >    non-Kudu tables. Using them for non-Kudu tables can come in as a
> > >> following
> > >> >    patch(es).
> > >> >    - Similarly, the bloom filter will be ignored for Kudu tables,
> and
> > >> only
> > >> >    the Min and Max values will be used, since Kudu does not accept
> > bloom
> > >> >    filters yet. (https://issues.apache.org/jira/browse/IMPALA-3741)
> > >> >    - Applying the bloom filter on the Impala side of the Kudu scan
> > (i.e.
> > >> in
> > >> >    KuduScanNode) is not in the scope of this patch.
> > >> >
> > >> >
> > >> > *Complications:*
> > >> >
> > >> >    - We have to make sure that finding the Min and Max values on the
> > >> build
> > >> >    side doesn't regress certain workloads, since the difference
> > between
> > >> >    generating a bloom filter and generating a Min and a Max, is
> that a
> > >> bloom
> > >> >    filter can be type agnostic (we just take a raw hash over the
> data)
> > >> whereas
> > >> >    a Min and a Max have to be type specific.
> > >>
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

Re: Min/Max runtime filtering on Impala-Kudu

Posted by Todd Lipcon <to...@cloudera.com>.
Sounds reasonable to me as well.

The only thing I'd add is that, if it's easy to design the code to be
extended to pushing a small 'IN (...)' predicate for low-cardinality filters,
that would be great, e.g. if the filter can start as an IN(...) and then if
it exceeds 32K (or whatever arbitrary threshold), "collapse" it to the
min/max range predicate?

This should have a big advantage for partition pruning in low-cardinality
joins against hash-partitioned tables.

-Todd
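
A rough sketch of the filter shape this suggests, assuming the IN(...) set is
tracked alongside the min/max and simply dropped once it crosses the
threshold (the names and the default threshold are placeholders, not part of
the proposal):

    #include <cstddef>
    #include <cstdint>
    #include <limits>
    #include <set>

    // Starts life as an IN(...) set of distinct build-side keys and
    // "collapses" to a plain [min, max] range once the set grows too large.
    class CollapsibleInListFilter {
     public:
      explicit CollapsibleInListFilter(size_t max_in_list_size = 32 * 1024)
        : max_in_list_size_(max_in_list_size) {}

      void Insert(int64_t value) {
        if (value < min_) min_ = value;
        if (value > max_) max_ = value;
        if (collapsed_) return;
        values_.insert(value);
        if (values_.size() > max_in_list_size_) {
          values_.clear();  // too many distinct keys: keep only the range
          collapsed_ = true;
        }
      }

      // True while the filter can still be pushed down as an IN(...) list.
      bool is_in_list() const { return !collapsed_; }
      const std::set<int64_t>& values() const { return values_; }
      int64_t min() const { return min_; }
      int64_t max() const { return max_; }

     private:
      size_t max_in_list_size_;
      bool collapsed_ = false;
      std::set<int64_t> values_;
      int64_t min_ = std::numeric_limits<int64_t>::max();
      int64_t max_ = std::numeric_limits<int64_t>::min();
    };

On the scan side, the non-collapsed form could map onto the Kudu client's
in-list predicate (KuduTable::NewInListPredicate), and the collapsed form
onto the same two comparison predicates sketched in the opening message.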



On Mon, Mar 27, 2017 at 2:29 PM, Matthew Jacobs <mj...@cloudera.com> wrote:

> Thanks for writing this up, Sailesh. It sounds reasonable.
>
> On Mon, Mar 27, 2017 at 2:24 PM, Sailesh Mukil <sa...@cloudera.com>
> wrote:
> > On Mon, Mar 27, 2017 at 11:49 AM, Marcel Kornacker <ma...@cloudera.com>
> > wrote:
> >
> >> On Mon, Mar 27, 2017 at 11:42 AM, Sailesh Mukil <sa...@cloudera.com>
> >> wrote:
> >> > I will be working on a patch to add min/max filter support in Impala,
> and
> >> > as a first step, specifically target the KuduScanNode, since the Kudu
> >> > client is already able to accept a Min and a Max that it would
> internally
> >> > use to filter during its scans. Below is a brief design proposal.
> >> >
> >> > *Goal:*
> >> >
> >> > To leverage runtime min/max filter support in Kudu for the potential
> >> speed
> >> > up of queries over Kudu tables. Kudu does this by taking a min and a
> max
> >> > that Impala will provide and only return values in the range Impala is
> >> > interested in.
> >> >
> >> > *[min <= range we're interested in <= max]*
> >> >
> >> > *Proposal:*
> >> >
> >> >
> >> >    - As a first step, plumb the runtime filter code from
> >> > *exec/hdfs-scan-node-base.cc/h
> >> >    <http://hdfs-scan-node-base.cc/h>* to *exec/scan-node.cc/h
> >> >    <http://scan-node.cc/h>*, so that it can be applied to
> *KuduScanNode*
> >> >    cleanly as well, since *KuduScanNode* and *HdfsScanNodeBase* both
> >> >    inherit from *ScanNode.*
> >>
> >> Quick comment: please make sure your solution also applies to
> >> KuduScanNodeMt.
> >>
> >
> > Thanks for the input, I'll make sure to do that.
> >
> >
> >>
> >> >    - Reuse the *ColumnStats* class (exec/parquet-column-stats.h) or
> >> >    implement a lighter weight version of it to process and store the
> Min
> >> and
> >> >    the Max on the build side of the join.
> >> >    - Once the Min and Max values are added to the existing runtime
> filter
> >> >    structures, as a first step, we will ignore the Min and Max values
> for
> >> >    non-Kudu tables. Using them for non-Kudu tables can come in as a
> >> following
> >> >    patch(es).
> >> >    - Similarly, the bloom filter will be ignored for Kudu tables, and
> >> only
> >> >    the Min and Max values will be used, since Kudu does not accept
> bloom
> >> >    filters yet. (https://issues.apache.org/jira/browse/IMPALA-3741)
> >> >    - Applying the bloom filter on the Impala side of the Kudu scan
> (i.e.
> >> in
> >> >    KuduScanNode) is not in the scope of this patch.
> >> >
> >> >
> >> > *Complications:*
> >> >
> >> >    - We have to make sure that finding the Min and Max values on the
> >> build
> >> >    side doesn't regress certain workloads, since the difference
> between
> >> >    generating a bloom filter and generating a Min and a Max, is that a
> >> bloom
> >> >    filter can be type agnostic (we just take a raw hash over the data)
> >> whereas
> >> >    a Min and a Max have to be type specific.
> >>
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Min/Max runtime filtering on Impala-Kudu

Posted by Matthew Jacobs <mj...@cloudera.com>.
Thanks for writing this up, Sailesh. It sounds reasonable.

On Mon, Mar 27, 2017 at 2:24 PM, Sailesh Mukil <sa...@cloudera.com> wrote:
> On Mon, Mar 27, 2017 at 11:49 AM, Marcel Kornacker <ma...@cloudera.com>
> wrote:
>
>> On Mon, Mar 27, 2017 at 11:42 AM, Sailesh Mukil <sa...@cloudera.com>
>> wrote:
>> > I will be working on a patch to add min/max filter support in Impala, and
>> > as a first step, specifically target the KuduScanNode, since the Kudu
>> > client is already able to accept a Min and a Max that it would internally
>> > use to filter during its scans. Below is a brief design proposal.
>> >
>> > *Goal:*
>> >
>> > To leverage runtime min/max filter support in Kudu for the potential
>> speed
>> > up of queries over Kudu tables. Kudu does this by taking a min and a max
>> > that Impala will provide and only return values in the range Impala is
>> > interested in.
>> >
>> > *[min <= range we're interested in <= max]*
>> >
>> > *Proposal:*
>> >
>> >
>> >    - As a first step, plumb the runtime filter code from
>> > *exec/hdfs-scan-node-base.cc/h
>> >    <http://hdfs-scan-node-base.cc/h>* to *exec/scan-node.cc/h
>> >    <http://scan-node.cc/h>*, so that it can be applied to *KuduScanNode*
>> >    cleanly as well, since *KuduScanNode* and *HdfsScanNodeBase* both
>> >    inherit from *ScanNode.*
>>
>> Quick comment: please make sure your solution also applies to
>> KuduScanNodeMt.
>>
>
> Thanks for the input, I'll make sure to do that.
>
>
>>
>> >    - Reuse the *ColumnStats* class (exec/parquet-column-stats.h) or
>> >    implement a lighter weight version of it to process and store the Min
>> and
>> >    the Max on the build side of the join.
>> >    - Once the Min and Max values are added to the existing runtime filter
>> >    structures, as a first step, we will ignore the Min and Max values for
>> >    non-Kudu tables. Using them for non-Kudu tables can come in as a
>> following
>> >    patch(es).
>> >    - Similarly, the bloom filter will be ignored for Kudu tables, and
>> only
>> >    the Min and Max values will be used, since Kudu does not accept bloom
>> >    filters yet. (https://issues.apache.org/jira/browse/IMPALA-3741)
>> >    - Applying the bloom filter on the Impala side of the Kudu scan (i.e.
>> in
>> >    KuduScanNode) is not in the scope of this patch.
>> >
>> >
>> > *Complications:*
>> >
>> >    - We have to make sure that finding the Min and Max values on the
>> build
>> >    side doesn't regress certain workloads, since the difference between
>> >    generating a bloom filter and generating a Min and a Max, is that a
>> bloom
>> >    filter can be type agnostic (we just take a raw hash over the data)
>> whereas
>> >    a Min and a Max have to be type specific.
>>

Re: Min/Max runtime filtering on Impala-Kudu

Posted by Sailesh Mukil <sa...@cloudera.com>.
On Mon, Mar 27, 2017 at 11:49 AM, Marcel Kornacker <ma...@cloudera.com>
wrote:

> On Mon, Mar 27, 2017 at 11:42 AM, Sailesh Mukil <sa...@cloudera.com>
> wrote:
> > I will be working on a patch to add min/max filter support in Impala, and
> > as a first step, specifically target the KuduScanNode, since the Kudu
> > client is already able to accept a Min and a Max that it would internally
> > use to filter during its scans. Below is a brief design proposal.
> >
> > *Goal:*
> >
> > To leverage runtime min/max filter support in Kudu for the potential
> speed
> > up of queries over Kudu tables. Kudu does this by taking a min and a max
> > that Impala will provide and only return values in the range Impala is
> > interested in.
> >
> > *[min <= range we're interested in <= max]*
> >
> > *Proposal:*
> >
> >
> >    - As a first step, plumb the runtime filter code from
> > *exec/hdfs-scan-node-base.cc/h
> >    <http://hdfs-scan-node-base.cc/h>* to *exec/scan-node.cc/h
> >    <http://scan-node.cc/h>*, so that it can be applied to *KuduScanNode*
> >    cleanly as well, since *KuduScanNode* and *HdfsScanNodeBase* both
> >    inherit from *ScanNode.*
>
> Quick comment: please make sure your solution also applies to
> KuduScanNodeMt.
>

Thanks for the input, I'll make sure to do that.


>
> >    - Reuse the *ColumnStats* class (exec/parquet-column-stats.h) or
> >    implement a lighter weight version of it to process and store the Min
> and
> >    the Max on the build side of the join.
> >    - Once the Min and Max values are added to the existing runtime filter
> >    structures, as a first step, we will ignore the Min and Max values for
> >    non-Kudu tables. Using them for non-Kudu tables can come in as a
> following
> >    patch(es).
> >    - Similarly, the bloom filter will be ignored for Kudu tables, and
> only
> >    the Min and Max values will be used, since Kudu does not accept bloom
> >    filters yet. (https://issues.apache.org/jira/browse/IMPALA-3741)
> >    - Applying the bloom filter on the Impala side of the Kudu scan (i.e.
> in
> >    KuduScanNode) is not in the scope of this patch.
> >
> >
> > *Complications:*
> >
> >    - We have to make sure that finding the Min and Max values on the
> build
> >    side doesn't regress certain workloads, since the difference between
> >    generating a bloom filter and generating a Min and a Max, is that a
> bloom
> >    filter can be type agnostic (we just take a raw hash over the data)
> whereas
> >    a Min and a Max have to be type specific.
>

Re: Min/Max runtime filtering on Impala-Kudu

Posted by Marcel Kornacker <ma...@cloudera.com>.
On Mon, Mar 27, 2017 at 11:42 AM, Sailesh Mukil <sa...@cloudera.com> wrote:
> I will be working on a patch to add min/max filter support in Impala, and
> as a first step, specifically target the KuduScanNode, since the Kudu
> client is already able to accept a Min and a Max that it would internally
> use to filter during its scans. Below is a brief design proposal.
>
> *Goal:*
>
> To leverage runtime min/max filter support in Kudu for the potential speed
> up of queries over Kudu tables. Kudu does this by taking a min and a max
> that Impala will provide and only return values in the range Impala is
> interested in.
>
> *[min <= range we're interested in <= max]*
>
> *Proposal:*
>
>
>    - As a first step, plumb the runtime filter code from
> *exec/hdfs-scan-node-base.cc/h
>    <http://hdfs-scan-node-base.cc/h>* to *exec/scan-node.cc/h
>    <http://scan-node.cc/h>*, so that it can be applied to *KuduScanNode*
>    cleanly as well, since *KuduScanNode* and *HdfsScanNodeBase* both
>    inherit from *ScanNode.*

Quick comment: please make sure your solution also applies to KuduScanNodeMt.

>    - Reuse the *ColumnStats* class (exec/parquet-column-stats.h) or
>    implement a lighter weight version of it to process and store the Min and
>    the Max on the build side of the join.
>    - Once the Min and Max values are added to the existing runtime filter
>    structures, as a first step, we will ignore the Min and Max values for
>    non-Kudu tables. Using them for non-Kudu tables can come in as a following
>    patch(es).
>    - Similarly, the bloom filter will be ignored for Kudu tables, and only
>    the Min and Max values will be used, since Kudu does not accept bloom
>    filters yet. (https://issues.apache.org/jira/browse/IMPALA-3741)
>    - Applying the bloom filter on the Impala side of the Kudu scan (i.e. in
>    KuduScanNode) is not in the scope of this patch.
>
>
> *Complications:*
>
>    - We have to make sure that finding the Min and Max values on the build
>    side doesn't regress certain workloads, since the difference between
>    generating a bloom filter and generating a Min and a Max, is that a bloom
>    filter can be type agnostic (we just take a raw hash over the data) whereas
>    a Min and a Max have to be type specific.