Posted to dev@parquet.apache.org by Julien Le Dem <ju...@dremio.com> on 2017/04/26 18:02:38 UTC
Parquet sync minutes
Attendance/Agenda:
Deepak (Vertica):
- indexing discussion
Wes (twosigma):
- indexing discussion
- parquet-cpp 1.1
Marcel (Cloudera Impala):
- Index proposal
- sort order clarification went in
Julien (Dremio):
- indexing
- protos
Lukas (parquet-proto):
- parquet-proto
Notes:
- parquet-proto:
- 3 changes on the way:
- issue with repeated proto fields, which are often not read by other
integrations
- add support for protos generic types (may break compatibility?)
- schema evolution using ids in proto fields.
- Lukas to send JIRAs
- would want to merge them soon and have a release
- Index proposal for improving point queries and range queries.
https://docs.google.com/document/d/1sBACp8Lbutuj1Zxdowvsrlm8ku4BFxf8U_Do5K2wSO4/edit#
- todo (Marcel): clarify mechanism to store OffsetIndex and ColumnIndex
outside the footer (probably just before).
- todo (Marcel): add other optional fields from statistics in
ColumnIndex (min, max, null_count, distinct_count)
- todo (everyone): iterate on the feedback
- impala prototype planned for June
- Logical types pull request:
https://github.com/apache/parquet-format/pull/51/files
- todo: give more feedback
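
To illustrate why the index proposal above helps point and range queries, here is a minimal Python sketch. It is not taken from the proposal document: PageStats, PageLocation and pages_for_range are hypothetical names, and the field layout is an assumption that only borrows the ColumnIndex/OffsetIndex idea and the min, max and null_count fields named in the notes. The idea is that per-page min/max values let a reader skip pages that cannot contain matching rows, and the offsets say where the remaining pages live.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical, simplified stand-ins for the structures named in the notes.
# The real field layout is defined by the proposal, not by this sketch.
@dataclass
class PageStats:          # one ColumnIndex-like entry per data page
    min_value: int
    max_value: int
    null_count: int = 0


@dataclass
class PageLocation:       # one OffsetIndex-like entry per data page
    offset: int           # byte offset of the page in the file
    first_row: int        # index of the first row stored in the page


def pages_for_range(column_index: List[PageStats],
                    offset_index: List[PageLocation],
                    lo: int, hi: int) -> List[PageLocation]:
    """Return locations of pages whose [min, max] overlaps [lo, hi]."""
    return [loc for stats, loc in zip(column_index, offset_index)
            if stats.max_value >= lo and stats.min_value <= hi]


column_index = [PageStats(0, 99), PageStats(100, 199), PageStats(200, 299)]
offset_index = [PageLocation(4, 0), PageLocation(4096, 1000),
                PageLocation(8192, 2000)]

# Range query 150 <= x <= 220: the first page is skipped entirely.
hits = pages_for_range(column_index, offset_index, 150, 220)

# Point query x == 42 is the degenerate range [42, 42].
point = pages_for_range(column_index, offset_index, 42, 42)
```

A page that survives this filter may still contain no matching rows; the statistics only rule pages out, so the reader still has to decode and filter the pages that remain.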
--
Julien
Re: Parquet sync minutes
Posted by Marcel Kornacker <ma...@gmail.com>.
On Wed, Apr 26, 2017 at 11:02 AM, Julien Le Dem <ju...@dremio.com> wrote:
> Attendance/Agenda:
> Deepak (Vertica):
> - indexing discussion
> Wes (twosigma):
> - indexing discussion
> - parquet-cpp 1.1
> Marcel (Cloudera Impala):
> - Index proposal
> - sort order clarification went in
> Julien (Dremio):
> - indexing
> - protos
> Lukas (parquet-proto):
> - parquet-proto
>
> Notes:
> - parquet-proto:
> - 3 changes on the way:
> - issue with repeated proto fields, which are often not read by other
> integrations
> - add support for protos generic types (may break compatibility?)
> - schema evolution using ids in proto fields.
> - Lukas to send JIRAs
> - would want to merge them soon and have a release
>
> - Index proposal for improving point queries and range queries.
> https://docs.google.com/document/d/1sBACp8Lbutuj1Zxdowvsrlm8ku4BFxf8U_Do5K2wSO4/edit#
> - todo (Marcel): clarify mechanism to store OffsetIndex and ColumnIndex
> outside the footer (probably just before).
> - todo (Marcel): add other optional fields from statistics in
> ColumnIndex (min, max, null_count, distinct_count)
I made the requested edits.
> - todo (everyone): iterate on the feedback
> - impala prototype planned for June
>
> - Logical types pull request:
> https://github.com/apache/parquet-format/pull/51/files
> - todo: give more feedback
>
>
>
>
> --
> Julien