Posted to dev@quickstep.apache.org by Jignesh Patel <jm...@gmail.com> on 2016/09/19 17:11:44 UTC

Re: Missing report + some thoughts

Will do Julian – Sorry I was traveling and missed this. 

PS: Everyone – please feel free to take the lead in drafting a report if you want to do that. 

As a general update, I’d like to welcome Tarun Bansal as a committer. He is helping make the data ingest part of Quickstep more robust (it is quite fragile right now and a corrupt record crashes the system). He has an initial PR: https://github.com/apache/incubator-quickstep/pull/99. Welcome Tarun! 
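To illustrate the kind of robustness meant here, below is a minimal sketch of an ingest loop that rejects a malformed record and keeps going rather than aborting. It is not taken from the PR above; `IngestResult`, `ParseInt64Record`, and `IngestLines` are hypothetical names, and the one-integer-per-line record format is an assumption made only to keep the example small.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type: the parsed values plus a count of rejected records.
struct IngestResult {
  std::vector<std::int64_t> values;
  std::size_t rejected = 0;
};

// Parses one text record as a single base-10 64-bit integer.
// Returns false (and leaves *out untouched) if the record is empty,
// has trailing garbage, or is out of range.
bool ParseInt64Record(const std::string &record, std::int64_t *out) {
  if (record.empty()) return false;
  errno = 0;
  char *end = nullptr;
  const long long parsed = std::strtoll(record.c_str(), &end, 10);
  if (errno == ERANGE || end == record.c_str() || *end != '\0') return false;
  *out = parsed;
  return true;
}

// Ingests newline-separated records. A corrupt record is counted and
// skipped; it never terminates the load.
IngestResult IngestLines(const std::string &text) {
  IngestResult result;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    std::int64_t value = 0;
    if (ParseInt64Record(line, &value)) {
      result.values.push_back(value);
    } else {
      ++result.rejected;
    }
  }
  return result;
}
```

Returning the rejected count alongside the data lets the caller decide whether to fail the load, warn, or continue.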

In other updates, Harshad, with help from others, is working on fixing the big performance issues that we have with aggregation. The key change is combining the different aggregate handles into one. Rathijit started work on this (thanks Rathijit!) and has an open PR (https://github.com/apache/incubator-quickstep/pull/90). There is a fair amount of cleanup needed to make this work, which Harshad is doing with Rathijit.
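As a rough illustration of why combining aggregate handles helps, here is a sketch in which SUM, COUNT, and MIN share one per-group payload, so each input row costs a single hash table probe instead of one probe per aggregate. This is not the code in the PR; `FusedAggState`, `Row`, and `AggregateFused` are hypothetical names, and `std::unordered_map` stands in for Quickstep's own hash tables.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical fused state: every aggregate the query needs lives in one
// payload per group, instead of one hash table per aggregate.
struct FusedAggState {
  std::int64_t sum = 0;
  std::int64_t count = 0;
  std::int64_t min = std::numeric_limits<std::int64_t>::max();
};

// Hypothetical input row: a grouping key and the value being aggregated.
struct Row {
  std::string group_key;
  std::int64_t value;
};

// One pass over the input, one hash table probe per row; all aggregates
// for the group are updated together while the payload is in cache.
std::unordered_map<std::string, FusedAggState> AggregateFused(
    const std::vector<Row> &rows) {
  std::unordered_map<std::string, FusedAggState> groups;
  for (const Row &row : rows) {
    FusedAggState &state = groups[row.group_key];  // inserts a default state if absent
    state.sum += row.value;
    ++state.count;
    state.min = std::min(state.min, row.value);
  }
  return groups;
}
```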

One issue that keeps coming up is TypedValue (this shows up in the aggregate work too). It is too heavy-weight and a huge performance bottleneck. Marc is looking at how it could be removed from another part where it is a bottleneck, namely the ValueAccessors (which are themselves showing their age and have design issues of their own). 

Essentially, between TypedValue, the ValueAccessors, and the HashTables (for aggregation and joins), there are crucial design issues that need to be addressed to remove performance bottlenecks from the core inner loops. The approach being considered is to remove TypedValues as much as possible, refactor the code in the ValueAccessors and over time move to more transparent iterators that allow compilers to unroll loops, and build more efficient hash tables that are more flexible in the payload they can take. Any other ideas are welcome. In short, we want to spend a few months removing grunge code, making performance issues transparent, and improving performance. This will also help new developers approach the code. 
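To make the inner-loop concern concrete, here is a small sketch contrasting a loop over dynamically typed values with a loop over a plain typed column. `BoxedValue` is a hypothetical stand-in and is not Quickstep's TypedValue; `SumBoxed` and `SumColumn` are likewise illustrative names only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a heavyweight, dynamically typed value.
// It only mimics the idea of a runtime type tag carried by every value.
struct BoxedValue {
  enum class Tag { kInt64, kDouble } tag;
  union {
    std::int64_t i64;
    double f64;
  };
};

// The type tag is inspected for every element inside the inner loop,
// which makes the loop hard for a compiler to unroll or vectorize.
double SumBoxed(const std::vector<BoxedValue> &values) {
  double total = 0.0;
  for (const BoxedValue &v : values) {
    switch (v.tag) {
      case BoxedValue::Tag::kInt64:
        total += static_cast<double>(v.i64);
        break;
      case BoxedValue::Tag::kDouble:
        total += v.f64;
        break;
    }
  }
  return total;
}

// The element type is resolved once, at compile time, outside the loop.
// The body is a branch-free pass over contiguous memory, the kind of
// "transparent" iteration a compiler can unroll.
template <typename T>
T SumColumn(const T *data, std::size_t n) {
  T total = T();
  for (std::size_t i = 0; i < n; ++i) {
    total += data[i];
  }
  return total;
}
```

A caller would dispatch on the column type once per block and then run `SumColumn` over the raw column data, so the per-value cost no longer includes a type check.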

Tentatively, we should plan for a first release of Quickstep before the winter break. The first release can target the single-node case, focusing on high-performance SQL on large-memory, multi-core hardware. 

If anyone has ideas about early adopters that we should approach, please share them with the group. 

Cheers,
Jignesh 

On 9/13/16, 12:35 AM, "Julian Hyde" <jh...@apache.org> wrote:

    Hi everyone,
    
    Quickstep didn’t file a report for the Board meeting this month. Can you please be sure to file one next month?
    
    Julian