Posted to dev@drill.apache.org by Jason Altekruse <al...@gmail.com> on 2016/01/05 18:56:27 UTC

Hangout starting in 5 minutes

Come join the Drill community in our weekly hangout meeting to find out
what is going on with Drill right now.

https://plus.google.com/hangouts/_/dremio.com/drillhangout

Some items I would like to discuss this week:
- 1.5 release, issues left to fix, when would we like to target for a vote
- Drill parquet date bug: https://issues.apache.org/jira/browse/DRILL-4203

Feel free to respond with other items you would like to have discussed, or
just jump on the call.

Re: Hangout starting in 5 minutes

Posted by Jason Altekruse <al...@gmail.com>.
Notes: Drill hangout - 1/5/2016

Vicky, Andries, Hakim, Aman, Julien, Jason, Charles


Drill 1.5 release thread, number of outstanding issues to solve


Parquet dates

    - metadata migration may be needed for old files

    - check migration tool to make sure it doesn't update already known
      versions to a newer one

    - can use some combination of a whitelist of known good writers as well
      as checking the year on some of the values and that the flag to
      auto-correct bad dates is set
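
A rough sketch of how the three checks above could be combined, for illustration only. The class, method, writer names and the year threshold are all hypothetical and are not taken from Drill or from DRILL-4203; date values are assumed to be days since 1970-01-01.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; not Drill's actual detection logic.
final class ParquetDateCheckSketch {

    // Hypothetical whitelist of writer strings known to produce good dates.
    static final Set<String> KNOWN_GOOD_WRITERS =
        new HashSet<>(Arrays.asList("writer-a 1.0", "writer-b 2.3"));

    // Assumed cut-off: a sampled date beyond this year is treated as corrupt.
    static final int MAX_PLAUSIBLE_YEAR = 5000;

    // Returns true when the file's dates look corrupt and should be corrected.
    static boolean needsCorrection(String createdBy,
                                   long[] sampleEpochDays,
                                   boolean autoCorrectFlag) {
        if (!autoCorrectFlag) {
            return false; // auto-correction is switched off
        }
        if (KNOWN_GOOD_WRITERS.contains(createdBy)) {
            return false; // trusted writer, leave the values alone
        }
        for (long days : sampleEpochDays) {
            if (LocalDate.ofEpochDay(days).getYear() > MAX_PLAUSIBLE_YEAR) {
                return true; // implausible year found in the sample
            }
        }
        return false;
    }
}
```

Checking the whitelist before sampling values means files from a trusted writer are never rewritten, which matches the concern above about not updating already known versions.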


Allocator bugs

    - CTAS with partitioning was failing, sort running out of memory

    - Flatten was having issues

        - Functional suite

        - intermittent

    - Unit test, Jacques did not file a bug

        - sent an e-mail to Parth and Hakim

    - advanced suite, out of memory failures

        - See if Dremio infrastructure is running advanced tests

        - checked with Jacques, Dremio is not currently running the
          advanced suite


Amit's branch - testing by Vicky

    - blocked on the sort bug

        - his fix is on top of 1.5 which fails with out of memory in sort
          before it reaches the merge join operator


Aman - hash skew - DRILL-4237

    - strings of length 32 or more chars

        - bad skew due to use of signed long where the C implementation
          uses unsigned long

        - revert to old hash functions

        - performance of the hash is a little slower, need to measure how
          much

    - already tried using Guava UnsignedLong

        - does not have required operations for the xxhash algorithm

            - bit shifts
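
The signed-versus-unsigned point above comes down to how Java shifts a `long`. The sketch below is illustrative only and is not Drill's hash code: the class and method names are hypothetical, and the shift distance 33 is just an example value. It shows that `>>` copies the sign bit into the high bits, while `>>>` fills with zeros the way a shift on a C unsigned 64-bit integer would.

```java
// Illustrative sketch only; not Drill's actual hash implementation.
final class UnsignedShiftSketch {

    // Xor-shift step using the arithmetic shift (>>), which sign-extends.
    static long xorShiftArithmetic(long h, int distance) {
        return h ^ (h >> distance);
    }

    // Same step using the logical shift (>>>), which zero-fills like a
    // shift on a C unsigned 64-bit value.
    static long xorShiftLogical(long h, int distance) {
        return h ^ (h >>> distance);
    }

    public static void main(String[] args) {
        long h = Long.MIN_VALUE; // only the top bit is set
        // prints 7fffffffc0000000
        System.out.println(Long.toHexString(xorShiftArithmetic(h, 33)));
        // prints 8000000040000000
        System.out.println(Long.toHexString(xorShiftLogical(h, 33)));
    }
}
```

Addition, subtraction and multiplication produce the same bits for signed and unsigned 64-bit values in two's complement, so right shifts (along with division and comparison) are where a port from C has to take care.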


Hakim

    - warning from parquet library

        - parquet "corrupt statistics" message is showing with Drill 1.5

        - issue with partitioned files

            - partition by on a date column seems to be the issue

