Posted to dev@parquet.apache.org by Julien Le Dem <ju...@dremio.com> on 2017/05/10 17:02:52 UTC

Parquet sync starting now

https://hangouts.google.com/hangouts/_/dremio.com/parquet-sync-up

-- 
Julien

Re: Parquet sync starting now

Posted by Julien Le Dem <ju...@dremio.com>.
Notes:

Attendees and agenda building:

Ryan (Netflix):
  - new logical types representation
  - index proposal
Deepak (Vertica):
  - logical types for timestamps
Lars (Impala):
  - dummy ordering to test unknown ordering
  - implement new ordering in parquet-mr
Marcel (Impala):
  - index proposal
Uwe (Blue Yonder):
  - parquet-cpp 1.1
Wes (Two Sigma):
  - parquet-cpp 1.1
  - indexing proposal
Zoltan (Cloudera - fileformats):
Julien (Dremio):
  - parquet-mr
  - indexing proposal: placing indexes near the footer
  - new logical types

Discussion:
 - logical types: PARQUET-906, https://github.com/apache/parquet-format/pull/51
   - action: Marcel and Lars to give feedback
   - action: give feedback by next week
 - testing unknown ordering: https://github.com/apache/parquet-format/pull/53/files
   - discussed pros and cons of the approaches; Lars will follow up on the JIRA/PR
 - parquet-cpp 1.1 release:
   - will include:
     - support for reading structs to Arrow (simple reader for one-level structs)
     - support for Windows
     - reading and writing of lists of lists (handles empty lists)
     - move Arrow dependency from 0.2 to 0.3
   - RC coming soon.
   - todo: make summary/release notes
 - index proposal: PARQUET-922
    - action Julien: open JIRA to implement footer reading optimization in parquet-mr
    - The new index metadata is placed before the footer so that regular scan reads are not affected.
    - We will make pages stop on row boundaries when the index is present
      - add row_count to page v1
    - discussion: do we need compression?
      - to be addressed later. We should prototype something first
    - Deepak: open Jira for limiting stats size in parquet-cpp
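
For context on the "before the footer" placement: a Parquet file ends with the
serialized footer metadata, then a 4-byte little-endian footer length, then the
magic bytes PAR1. Below is a minimal sketch of how a reader can locate the
footer from the file tail. The function name locate_footer is hypothetical, and
this is not the parquet-mr implementation or the optimization mentioned in the
action item, only an illustration of the file layout.

```python
import io
import struct

MAGIC = b"PAR1"  # magic bytes at the start and end of a Parquet file


def locate_footer(f):
    """Return (footer_offset, footer_length) for a seekable binary file object.

    Hypothetical helper: it only inspects the last 8 bytes of the file and
    does not parse the Thrift-encoded footer itself.
    """
    f.seek(0, io.SEEK_END)
    file_size = f.tell()
    # Smallest possible file: leading magic + footer length + trailing magic.
    if file_size < len(MAGIC) + 4 + len(MAGIC):
        raise ValueError("too small to be a Parquet file")

    f.seek(file_size - 8)
    tail = f.read(8)
    if tail[4:] != MAGIC:
        raise ValueError("missing trailing magic bytes")

    # The 4 bytes before the trailing magic hold the footer length.
    footer_length = struct.unpack("<I", tail[:4])[0]
    footer_offset = file_size - 8 - footer_length
    if footer_offset < len(MAGIC):
        raise ValueError("footer length is larger than the file")
    return footer_offset, footer_length
```

Under the proposal as noted above, the index metadata would end just before
footer_offset, so a reader that only scans column data would not need to read
it.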










-- 
Julien