Posted to dev@drill.apache.org by Jason Altekruse <al...@gmail.com> on 2016/02/24 22:52:00 UTC

Hangout notes from the last few meetings

Hey guys,

Sorry I haven't been sending these out. I keep meaning to go back and clean
them up before sending, but I never get around to it. I will just post the
raw notes after each meeting going forward and provide clarification on the
thread if anyone has questions.


Drill Hangout - 2/9/2016

- Attendees: Yulia, Sean, Vicky, Sudheesh, Neeraja, Karol Potocki, Arina,
Aman, Hakim, Jinfeng

- New community members, Welcome!

    - Arina

        - working with MapR team

    - Karol

        - tiny contribution to allow spatial queries in Drill

        - interested in sparking interest in geo locations

        - PR outstanding for shapefile format

        - Neeraja - would be nice for simple doc for users to start

            - examples in PR and Karol's github repo

            - he could write a blog post for the apache repo

- Discussion topics

    - Sudheesh

        - 4281 - client impersonation

        - post a design doc soon

        - some drill deployments

            - tableau desktop is presentation layer on top of Drill

            - users only use tableau desktop, talking through tableau server

            - want to pass user from tableau desktop through the tableau server so that impersonation works correctly

            - requires a change in Tableau as well, working with the team there

    - Yulia

        - 4132 - simple queries in parallel

        - design doc on JIRA

        - 2 goals

            - separate planning from execution

            - separate fragment plans so that they can be run independently

        - those available please review design doc and PR

    - 1.5.0

        - new vote out soon

    - Jinfeng

        - 2517 - directory pruning in calcite logical

        - vicky seems to have found a bug

        - follow up work

            - need to separate the rules and run them individually to improve planning performance

    - Drill user survey

        - other projects list who is using them

        - just a google survey

        - simple questions, I assume all will be considered optional

            - current drill version in use

            - cluster size

            - datasources used

            - clients: sqlline, REST, Applications, JDBC, ODBC, BI Tools

            - what is your use case?

            - why Drill?

            - data formats, data types

            - are you using any of the security features of Drill to restrict access of some data to users?

                - view chaining, impersonation, Web UI security

            - SQL features you would like to see as enhancements soon?

            - how many users are querying your Drill cluster

            - have you written a storage plugin, UDF or format plugin?

    - issues with the build

        - jdbc-all jar size enforcement

        - Jacques made changes to remove proguard and generally fix up jdbc-all JAR

        - 1.4.0 has a large jdbc-all JAR that wasn't excluding what it was supposed to

    - Aman

        - Dechang - perf regressions on rc2 metadata cache


Drill Hangout - 2/16/2016

- Attendees: Parth, Andries, Arina, Jason, Vitalii

- Topics for discussion

    - Release

        - issues with publishing the web site

        - announcement should be up shortly

    - Jacques had mentioned Metadata caching

        - follow up if he wants to post thoughts

    - Discussion was short today


Drill Hangout - 2/23/2016

- Attendees: Jason, Minji, Laurent, Arina, Parth, Sudheesh, Zelaine


arina -- modify calcite, timestamp related function --> contact calcite
folks/julien


improve c++ client, better distribution of queries across cluster,
randomization routine not distributing uniformly.

session options not allowed since can't maintain sessions if uniformly
distributed

--> c++ client std c library rand() function not always good

--> different random number generator

--> new connection in the pool, then need to keep track of all the
ALTER SESSION changes (temporary tables, new schema, etc.)

--> small number of clients, need foreman workload distributed more
(planning and so on)

--> ping jacques
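
For anyone following the rand() point above, here is a minimal sketch of what
"different random number generator" could look like with the C++11 <random>
facilities. EndpointPicker and its pick() method are hypothetical names made
up for illustration and are not part of the Drill C++ client; the sketch only
assumes the client holds a list of drillbit endpoints as strings.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper, NOT part of the Drill C++ client API.
// Chooses one endpoint uniformly at random using a Mersenne Twister
// engine and uniform_int_distribution, instead of rand() % n.
class EndpointPicker {
public:
    // Seed from the OS entropy source by default; a fixed seed can be
    // passed in to make tests reproducible.
    explicit EndpointPicker(unsigned seed = std::random_device{}())
        : engine_(seed) {}

    // Returns a reference to one of the given endpoints.
    // The list must not be empty.
    const std::string& pick(const std::vector<std::string>& endpoints) {
        assert(!endpoints.empty());
        // The distribution covers the closed range [0, size - 1].
        std::uniform_int_distribution<std::size_t> dist(0, endpoints.size() - 1);
        return endpoints[dist(engine_)];
    }

private:
    std::mt19937 engine_;
};
```

Keeping one engine per picker, rather than reseeding on every call, is what
keeps the sequence well distributed across connections. This does nothing for
the session option problem noted above; it only addresses the choice of
drillbit.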


impersonation--> client to impersonate other clients (Delegation?)

--> odbc/jdbc:  provide an api (c++/java) and how they will use it

--> waiting on comments


better testing for operator:  better tests for independent components

--> mock internal parts of systems

--> run operators in isolation (posting soon)

--> exchanges needs a bit more discussion (vector container) - separate way
to mock data coming in


Julien's test changes to run tests on multiple drillbits (?)

--> This actually wasn't Julien's contribution as was said in the meeting,
Sudheesh was actually referring to Andrew's PR here:
https://github.com/apache/drill/pull/135