Posted to dev@parquet.apache.org by Julien Le Dem <ju...@dremio.com> on 2017/04/26 18:02:38 UTC

Parquet sync minutes

 Attendance/Agenda:
Deepak (Vertica):
 - indexing discussion
Wes (twosigma):
 - indexing discussion
 - parquet-cpp 1.1
Marcel (Cloudera Impala):
 - Index proposal
 - sort order clarification went in
Julien (Dremio):
 - indexing
 - protos
Lukas (parquet-proto):
 - parquet-proto

Notes:
 - parquet-proto:
   - 3 changes on the way:
     - issue with proto repeated fields, which are often not read by other
integrations
     - add support for proto generic types (may break compatibility?)
     - schema evolution using ids in proto fields.
   - Lukas to send JIRAs
   - would want to merge them soon and have a release

 - Index proposal for improving point queries and range queries.
https://docs.google.com/document/d/1sBACp8Lbutuj1Zxdowvsrlm8ku4BFxf8U_Do5K2wSO4/edit#
   - todo (Marcel): clarify mechanism to store OffsetIndex and ColumnIndex
outside the footer (probably just before).
   - todo (Marcel): add other optional fields from statistics in
ColumnIndex (min, max, null_count, distinct_count)
   - todo (everyone): iterate on the feedback
   - Impala prototype planned for June
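
For illustration only: a minimal Python sketch of how per-page min/max
statistics could let a reader skip pages for point and range queries. The
class names PageStats and PageLocation, their fields, and the function
pages_for_range are hypothetical and are not taken from the proposal
document; only the statistic names (min, max, null_count, distinct_count)
come from the notes above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageStats:
    """Hypothetical per-page entry of a column index (names are assumptions)."""
    min_value: int
    max_value: int
    null_count: int = 0
    distinct_count: Optional[int] = None


@dataclass
class PageLocation:
    """Hypothetical per-page entry of an offset index (names are assumptions)."""
    offset: int           # byte offset of the page in the file
    first_row_index: int  # index of the first row stored in the page


def pages_for_range(column_index: List[PageStats], lo: int, hi: int) -> List[int]:
    """Return the indices of pages whose [min, max] interval overlaps [lo, hi]."""
    return [
        i
        for i, page in enumerate(column_index)
        if page.max_value >= lo and page.min_value <= hi
    ]


# Three pages of a sorted column; offsets and row counts are made-up values.
column_index = [PageStats(0, 9), PageStats(10, 19), PageStats(20, 29)]
offset_index = [PageLocation(4, 0), PageLocation(1028, 100), PageLocation(2052, 200)]

# Range query: only pages that can contain values in [12, 21] need to be read.
range_hits = pages_for_range(column_index, 12, 21)

# Point query: the special case lo == hi.
point_hits = pages_for_range(column_index, 15, 15)

# The offset index then tells the reader where the selected pages start.
offsets_to_read = [offset_index[i].offset for i in range_hits]
```

In this sketch the statistics only decide which pages to read; the offset
entries are what allow seeking directly to those pages without scanning
the pages in between.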

 - Logical types pull request:
https://github.com/apache/parquet-format/pull/51/files
   - todo: give more feedback




-- 
Julien

Re: Parquet sync minutes

Posted by Marcel Kornacker <ma...@gmail.com>.
On Wed, Apr 26, 2017 at 11:02 AM, Julien Le Dem <ju...@dremio.com> wrote:
>  - Index proposal for improving point queries and range queries.
> https://docs.google.com/document/d/1sBACp8Lbutuj1Zxdowvsrlm8ku4BFxf8U_Do5K2wSO4/edit#
>    - todo (Marcel): clarify mechanism to store OffsetIndex and ColumnIndex
> outside the footer (probably just before).
>    - todo (Marcel): add other optional fields from statistics in
> ColumnIndex (min, max, null_count, distinct_count)

I made the requested edits.
