Posted to dev@arrow.apache.org by Wes McKinney <we...@gmail.com> on 2019/11/08 01:56:40 UTC
Merged C++ Parquet Encryption implementation PARQUET-1300
hi folks,
I recently merged https://github.com/apache/arrow/pull/4826 containing
the bulk of the Parquet C++ encrypted file implementation:
https://github.com/apache/arrow/commit/41753ace481a82dea651c54639ec4adbae169187
This patch has been in progress for over a year with numerous rounds
of review, so I wanted to thank everyone for their hard work on this
project.
I'm copying dev@arrow because I would guess this patch has various
implications for packaging and build scripts, and some JIRA issues may
need to be opened.
Note: I'm concerned about cross-implementation compatibility, so
developing some automated compatibility tests that exercise the
different modes of encryption (encrypted metadata, plaintext metadata,
and so forth) seems like a good idea to me.
Thanks,
Wes
Re: Merged C++ Parquet Encryption implementation PARQUET-1300
Posted by Gidon Gershinsky <gg...@gmail.com>.
Wes,
Thank you for reviewing and merging this project.
Regarding the note: we'll have interop testers in parquet-mr, so that
files written by the C++ implementation in the various encryption modes
are read by the Java readers, and vice versa.
These manual tests could be run during development and ahead of releases.
For automation, we've added these files to the parquet-testing repository,
and the CI tests in arrow/parquet-cpp fetch and parse them (we will do the
same in parquet-mr), alongside the freshly written files created by the
code under test. Other suggestions for automating cross-implementation
compatibility are also welcome.
Cheers, Gidon
On Fri, Nov 8, 2019 at 3:57 AM Wes McKinney <we...@gmail.com> wrote:
> hi folks,
>
> I recently merged https://github.com/apache/arrow/pull/4826 containing
> the bulk of the Parquet C++ encrypted file implementation:
>
>
> https://github.com/apache/arrow/commit/41753ace481a82dea651c54639ec4adbae169187
>
> This patch has been in progress for over a year with numerous rounds
> of review, so I wanted to thank everyone for their hard work on this
> project.
>
> I'm copying dev@arrow because I would guess this patch has various
> implications for packaging and build scripts, and some JIRA issues may
> need to be opened.
>
> Note: I'm concerned about cross-implementation compatibility, so
> developing some automated compatibility tests that exercise the
> different modes of encryption (encrypted metadata, plaintext metadata,
> and so forth) seems like a good idea to me.
>
> Thanks,
> Wes
>
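As a rough illustration of the encryption modes mentioned in this thread (encrypted metadata versus plaintext metadata), the sketch below shows one cheap sanity check an automated compatibility test could run before attempting to decrypt anything. It relies on the Parquet format's magic bytes: files with an encrypted footer start and end with "PARE", while files with a plaintext footer (including unencrypted files) use "PAR1". The function names `classify_magic` and `footer_mode` are hypothetical, the check uses only the Python standard library, and it is not taken from the merged patch or from the parquet-testing CI.

```python
import os
import tempfile

# Magic bytes defined by the Parquet format.
PLAINTEXT_FOOTER_MAGIC = b"PAR1"  # unencrypted files and plaintext-footer mode
ENCRYPTED_FOOTER_MAGIC = b"PARE"  # encrypted-footer mode


def classify_magic(head: bytes, tail: bytes) -> str:
    """Classify a file from its first and last four bytes (hypothetical helper)."""
    if head != tail:
        return "not-parquet"
    if head == ENCRYPTED_FOOTER_MAGIC:
        return "encrypted-footer"
    if head == PLAINTEXT_FOOTER_MAGIC:
        # Cannot tell plaintext-footer encryption from no encryption here;
        # that requires parsing the footer metadata.
        return "plaintext-footer-or-unencrypted"
    return "not-parquet"


def footer_mode(path: str) -> str:
    """Read only the leading and trailing magic bytes of a file on disk."""
    # Smallest plausible layout: magic (4) + footer length (4) + magic (4).
    if os.path.getsize(path) < 12:
        return "not-parquet"
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return classify_magic(head, tail)


if __name__ == "__main__":
    # Stand-in bytes only; a real test would point at files fetched from
    # the parquet-testing repository.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"PARE" + b"\x00" * 16 + b"PARE")
    print(footer_mode(tmp.name))
    os.remove(tmp.name)
```

A check like this only confirms which footer mode a writer produced; verifying that a Java reader can actually decrypt a C++-written file (and vice versa) still needs the interop tests described above.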