Posted to dev@parquet.apache.org by 俊杰陈 <cj...@gmail.com> on 2019/04/11 13:40:17 UTC

Plan to merge bloom filter patch set

Hi dev

The Bloom filter patch set was merged into the bloom-filter branch of
parquet-mr a while ago. It contains the reading, writing and filtering logic.
Can we start merging the patch set back to master?

-- 
Thanks & Best Regards

Re: Plan to merge bloom filter patch set

Posted by 俊杰陈 <cj...@gmail.com>.
Thanks Wes for caring about this.

I think the plan is to store Bloom filter binaries in the parquet-testing repo
for cross-compatibility tests.

Generally, compatibility may be broken by a new algorithm or hash function. So
when we add a new algorithm or hash implementation on either the Java side or
the C++ side, the change must also add a new Bloom filter binary to the
parquet-testing repo. Then, when the corresponding algorithm or hash is
implemented on the other side, we should add compatibility tests that take
that binary as input.
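
To make the idea concrete, here is a rough sketch of what such a compatibility
check could look like. It is a toy only: the bitset layout, the SHA-256 based
hashing and all function names are assumptions made up for illustration, not
the Parquet Bloom filter format or any parquet-mr / parquet-cpp API. One side
produces a reference bitset, and the other side checks that every value known
to have been inserted is still reported as present.

```python
import hashlib

NUM_BITS = 256  # toy size chosen for the sketch, not a Parquet constant


def _bit_positions(value: bytes, num_hashes: int = 3):
    """Yield the bit positions a value maps to (hypothetical hashing scheme)."""
    for seed in range(num_hashes):
        digest = hashlib.sha256(bytes([seed]) + value).digest()
        yield int.from_bytes(digest[:8], "little") % NUM_BITS


def build_bitset(values) -> bytes:
    """Writer side: build the reference binary that would be checked in."""
    bitset = bytearray(NUM_BITS // 8)
    for value in values:
        for pos in _bit_positions(value):
            bitset[pos // 8] |= 1 << (pos % 8)
    return bytes(bitset)


def might_contain(bitset: bytes, value: bytes) -> bool:
    """Reader side: True if the value may be present, False if it is not."""
    return all(bitset[pos // 8] & (1 << (pos % 8)) for pos in _bit_positions(value))


def check_reference_binary(bitset: bytes, inserted_values) -> list:
    """Return the inserted values the reader fails to find.

    An empty list means the reader agrees with the writer on this binary.
    """
    return [v for v in inserted_values if not might_contain(bitset, v)]
```

A Bloom filter never gives false negatives, so any value in the returned list
means the two implementations disagree on the algorithm or hash.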

Also, there is no need to worry about compatibility when reading a Parquet
file whose Bloom filter bitset cannot be parsed: the reader just skips the
filter logic, as if there were no Bloom filter.
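
A minimal sketch of that fallback, under stated assumptions: the header
fields, the "TOY_ALGO" / "TOY_HASH" identifiers, the single-hash bitset and the
function names below are all hypothetical and are not the parquet-mr reader
API. The point is only that an unparseable filter leads to "do not skip", the
same result as having no filter at all.

```python
import hashlib
from typing import Callable, Optional

# Hypothetical (algorithm, hash) pairs this toy reader understands.
SUPPORTED = {("TOY_ALGO", "TOY_HASH")}


def _bit_position(value: bytes, num_bits: int) -> int:
    """Map a value to one bit position (toy single-hash scheme)."""
    return int.from_bytes(hashlib.sha256(value).digest()[:8], "little") % num_bits


def try_parse_filter(header: dict, bitset: bytes) -> Optional[Callable[[bytes], bool]]:
    """Return a might_contain function, or None if the filter is not usable."""
    key = (header.get("algorithm"), header.get("hash"))
    if key not in SUPPORTED or not bitset:
        return None
    num_bits = len(bitset) * 8

    def might_contain(value: bytes) -> bool:
        pos = _bit_position(value, num_bits)
        return bool(bitset[pos // 8] & (1 << (pos % 8)))

    return might_contain


def can_skip_row_group(header: dict, bitset: bytes, value: bytes) -> bool:
    """Decide whether a row group can be skipped for an equality predicate."""
    might_contain = try_parse_filter(header, bitset)
    if might_contain is None:
        # Unknown algorithm or hash: behave as if there were no Bloom filter.
        return False
    return not might_contain(value)
```

With this shape, an older reader that meets a newer algorithm loses the
filtering speedup but still returns correct results.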

On Thu, Apr 25, 2019 at 6:16 AM Wes McKinney <we...@gmail.com> wrote:

> hi Junjie -- I can't comment on the Java branch merge, but I am
> curious what is the path from here to integration / compatibility
> testing of the Bloom filters between Java and C++ (complicated by the
> fact that we aren't doing much systematic integration testing in
> general)?
>
> - Wes
>
> On Thu, Apr 11, 2019 at 8:40 AM 俊杰陈 <cj...@gmail.com> wrote:
> >
> > Hi dev
> >
> > The Bloom filter patch set was merged into the bloom-filter branch of
> > parquet-mr a while ago. It contains the reading, writing and filtering logic.
> > Can we start merging the patch set back to master?
> >
> > --
> > Thanks & Best Regards
>


-- 
Thanks & Best Regards

Re: Plan to merge bloom filter patch set

Posted by Wes McKinney <we...@gmail.com>.
hi Junjie -- I can't comment on the Java branch merge, but I am
curious what is the path from here to integration / compatibility
testing of the Bloom filters between Java and C++ (complicated by the
fact that we aren't doing much systematic integration testing in
general)?

- Wes

On Thu, Apr 11, 2019 at 8:40 AM 俊杰陈 <cj...@gmail.com> wrote:
>
> Hi dev
>
> The Bloom filter patch set was merged into the bloom-filter branch of
> parquet-mr a while ago. It contains the reading, writing and filtering logic.
> Can we start merging the patch set back to master?
>
> --
> Thanks & Best Regards