Posted to dev@parquet.apache.org by Julien Le Dem <ju...@wework.com.INVALID> on 2018/09/25 16:47:54 UTC

parquet sync notes

Lars (Cloudera Impala): listen in.
Zoltan, Gabor and Nandor (Cloudera):

   - feature branch reviewed and merged
   - Parquet-format release
      - Define scope

Ryan (Netflix)
Junjie (Tencent): bloom filter
Jim Apple (cloud service): bloom filter in parquet-mr, since it already got
into parquet-cpp?
Gidon (IBM): encryption
Sahil (Cloudera Impala, Hive): listen in
Julien (WeWork)

Status update from Gabor:

   - Waiting for reviews.
      - Plan to merge this Friday.
      - Please review in the next few days.

Parquet format release:

   - Nanosecond precision
   - Deprecation of Java-related code
   - Encryption metadata
      - One more PR to merge
      - Plan:
         - Revert the encryption patches and put them in a feature branch in
         parquet-format
         - Apply the same process to bloom filters
         - The owner of a PR can update it to target the feature branch
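
As a rough illustration of the plan above: the notes do not say which git
commands are to be used, so the following is only a sketch that assumes the
patches are reverted on master and then cherry-picked onto a new feature
branch. The function name and the commit ids are hypothetical placeholders.
The function only builds the command list; it does not run anything.

```python
from typing import List


def plan_feature_branch_move(branch: str, commits: List[str]) -> List[List[str]]:
    """Return the git commands (as argv lists) for moving commits off master.

    `commits` lists the commit ids in the order they were originally applied.
    Nothing is executed here; the caller decides whether to run the commands.
    """
    commands: List[List[str]] = [["git", "checkout", "master"]]
    # Revert newest-first so each revert applies on top of the previous one.
    for sha in reversed(commits):
        commands.append(["git", "revert", "--no-edit", sha])
    # Create the feature branch from the now-reverted master.
    commands.append(["git", "checkout", "-b", branch])
    # Re-apply the original commits, oldest-first, on the feature branch.
    for sha in commits:
        commands.append(["git", "cherry-pick", sha])
    return commands


if __name__ == "__main__":
    # "aaa111" and "bbb222" are placeholder commit ids, not real ones.
    for argv in plan_feature_branch_move("encryption", ["aaa111", "bbb222"]):
        print(" ".join(argv))
```

The same function could be called with "bloom-filter" as the branch name to
apply the same process to the bloom filter patches.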


Encryption:

   - Old readers can read non-encrypted columns
      - Changes to metadata
      - One last PR on parquet-format
      - We should have a vote before merging it.
   - Make sure parquet-cpp depends on the source-of-truth Thrift definition in
   parquet-format.


Bloom filter:

   - parquet-format/62 and parquet-format/99
   - parquet-format/28: should be closed as it is outdated. We should port the
   doc to the more recent PR.

Re: parquet sync notes

Posted by Zoltan Ivanfi <zi...@cloudera.com.INVALID>.
Hi,

I have created the feature branches:

- https://github.com/apache/parquet-mr/tree/bloom-filter
- https://github.com/apache/parquet-format/tree/bloom-filter

- https://github.com/apache/parquet-mr/tree/encryption
- https://github.com/apache/parquet-format/tree/encryption

I have also cherry-picked the encryption commits to the latter one.

Br,

Zoltan

Re: parquet sync notes

Posted by 俊杰陈 <cj...@gmail.com>.
Hi Zoltan

PR #62 contains some rebase info which is not related to the change itself, so
I created PR #99. It currently contains only one changed file; I will add
another documentation file later.

-- 
Thanks & Best Regards

Re: parquet sync notes

Posted by Zoltan Ivanfi <zi...@cloudera.com.INVALID>.
Hi,

It seems to me that PR #99 does not supersede PR #62, as the latter affects
16 files but the former only modifies a single one. Or has the rest of the
changes already been merged into the codebase from another PR? I checked the
history and I don't see anything related.

Thanks,

Zoltan

Re: parquet sync notes

Posted by 俊杰陈 <cj...@gmail.com>.
Hi

PR #28 and PR #62 of parquet-format were closed. Will we create a feature
branch for the bloom filter in parquet-mr as well?

-- 
Thanks & Best Regards