Posted to dev@arrow.apache.org by Wes McKinney <we...@gmail.com> on 2019/11/08 01:56:40 UTC

Merged C++ Parquet Encryption implementation PARQUET-1300

hi folks,

I recently merged https://github.com/apache/arrow/pull/4826 containing
the bulk of the Parquet C++ encrypted file implementation:

https://github.com/apache/arrow/commit/41753ace481a82dea651c54639ec4adbae169187

This patch has been in progress for over a year with numerous rounds
of review, so I wanted to thank everyone for their hard work on this
project.

I'm copying dev@arrow because I would guess this patch has various
implications for packaging and build scripts, and some JIRA issues may
need to be opened.

Note: I'm concerned about cross-implementation compatibility, so
developing some automated compatibility tests that exercise the
different modes of encryption (encrypted metadata, plaintext metadata,
and so forth) seems like a good idea to me.

Thanks,
Wes

Re: Merged C++ Parquet Encryption implementation PARQUET-1300

Posted by Gidon Gershinsky <gg...@gmail.com>.
Wes,

Thank you for reviewing and merging this project.
Regarding the note - we'll have interop testers in parquet-mr, so that
C++-written files, encrypted in various modes, can be tested by Java
readers - and vice versa.
These manual tests could be run during development and ahead of releases.
For automation, we've added the encrypted files to the parquet-testing
repository, and we fetch and parse them during the CI tests in
arrow/parquet-cpp (we will do the same in parquet-mr) - in parallel with
parsing the freshly written files created by the code being tested.
Other suggestions for automating cross-implementation compatibility are
also welcome.

Cheers, Gidon

On Fri, Nov 8, 2019 at 3:57 AM Wes McKinney <we...@gmail.com> wrote:

> hi folks,
>
> I recently merged https://github.com/apache/arrow/pull/4826 containing
> the bulk of the Parquet C++ encrypted file implementation:
>
>
> https://github.com/apache/arrow/commit/41753ace481a82dea651c54639ec4adbae169187
>
> This patch has been in progress for over a year with numerous rounds
> of review, so I wanted to thank everyone for their hard work on this
> project.
>
> I'm copying dev@arrow because I would guess this patch has various
> implications for packaging and build scripts, and some JIRA issues may
> need to be opened.
>
> Note: I'm concerned about cross-implementation compatibility, so
> developing some automated compatibility tests that exercise the
> different modes of encryption (encrypted metadata, plaintext metadata,
> and so forth) seems like a good idea to me.
>
> Thanks,
> Wes
>
