Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/07/07 23:56:09 UTC

[GitHub] [incubator-pinot] siddharthteotia opened a new issue #5664: Join support

siddharthteotia opened a new issue #5664:
URL: https://github.com/apache/incubator-pinot/issues/5664

There are two standard data-modeling approaches in analytical databases.

**Star Schema**

This is the de facto standard for modeling data in data warehouses to efficiently run OLAP (analytical / BI) style queries. We have a single fact table (containing measures / numeric data, e.g. sales) surrounded by one or more dimension tables (e.g. product). There is no relationship between the dimension tables themselves; the only join path is between the fact table and a dimension table. In other words, you get a [1 : many] join, since a single record in a dimension table can be associated with multiple records in the fact table.

One key thing to note about modeling data this way is that the dimension tables are denormalized (which is why there is no relationship between two dimension tables).
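To make the [1 : many] relationship concrete, here is a minimal Python sketch of a star schema join. The table contents, column names (`product_id`, `category`, `amount`) and the helper `sales_by_category` are hypothetical and only for illustration; they are not part of Pinot.

```python
from collections import defaultdict

# Hypothetical dimension table: one denormalized row per product.
product_dim = {
    1: {"name": "laptop", "category": "electronics"},
    2: {"name": "desk", "category": "furniture"},
}

# Hypothetical fact table: many rows can reference the same product_id.
sales_fact = [
    {"product_id": 1, "amount": 1200.0},
    {"product_id": 2, "amount": 300.0},
    {"product_id": 1, "amount": 800.0},
]


def sales_by_category(fact_rows, dim):
    """Join each fact row to its dimension row and sum amount per category."""
    totals = defaultdict(float)
    for row in fact_rows:
        dim_row = dim.get(row["product_id"])
        if dim_row is None:
            continue  # inner-join semantics: skip fact rows with no match
        totals[dim_row["category"]] += row["amount"]
    return dict(totals)


star_totals = sales_by_category(sales_fact, product_dim)
```

A single fact-to-dimension lookup is enough to answer the query because the category lives directly on the denormalized product row.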

**Snowflake Schema**

It is similar to the star schema in that there is a fact table and dimension tables. The key difference is that at least some of the dimension tables are normalized (thus leading to more dimension tables). This also establishes relationships between dimension tables.

The potential problem with this schema is the normalization itself: even simple queries often have to resort to joins. On top of that, writing a join query is considerably more complex in snowflake than in star. This is the main reason why the star schema is generally the preferred choice in the OLAP world. Managing the schema (I mean the collection of tables) is also more complicated in this case.
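As a rough illustration of the extra join, the sketch below answers a "sales per category" query against a snowflake layout in which the product dimension has been normalized into a product table and a category table. All table contents and names (`snow_product_dim`, `snow_category_dim`, `sales_by_category_snowflake`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical normalized dimensions: product rows point at category rows.
snow_product_dim = {
    1: {"name": "laptop", "category_id": 10},
    2: {"name": "desk", "category_id": 20},
}
snow_category_dim = {
    10: {"category": "electronics"},
    20: {"category": "furniture"},
}

snow_sales_fact = [
    {"product_id": 1, "amount": 1200.0},
    {"product_id": 2, "amount": 300.0},
    {"product_id": 1, "amount": 800.0},
]


def sales_by_category_snowflake(fact_rows, products, categories):
    """Two hops per fact row: fact -> product, then product -> category."""
    totals = defaultdict(float)
    for row in fact_rows:
        product = products.get(row["product_id"])
        if product is None:
            continue  # no matching product row
        category = categories.get(product["category_id"])
        if category is None:
            continue  # no matching category row
        totals[category["category"]] += row["amount"]
    return dict(totals)


snowflake_totals = sales_by_category_snowflake(
    snow_sales_fact, snow_product_dim, snow_category_dim
)
```

The result is the same kind of per-category total a star schema would give, but the query needs a dimension-to-dimension join in addition to the fact-to-dimension join.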

There are a few other ways to model data as well, but the above two are the standard ones. The modeling step is going to be critical since it will dictate (to some extent) what kinds of joins we support, the complexity of such queries, and the complexity at the user end of writing such queries.

**Standard Distributed Join Techniques**

Now, regardless of what we do above, there are a couple of ways to implement joins in distributed query engines like Pinot/Presto/Kusto/Spark etc., and broadcast join is one of them.

**Broadcast join** is a common way to execute a standard star schema join, where we join a large fact table with smaller dimension table(s). The smaller table is broadcast to each server so that the server can execute a local in-memory join (potentially a hash join using the dimension table as the build side and the fact table as the probe side). The reducer/aggregator layer can then do the final processing. In the Spark community, this is commonly referred to as a map-side join.
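The flow described above can be sketched in a few lines of Python. This is only a single-process simulation under assumed names: `server_local_join` stands in for the work done on one server, `reduce_partials` for the reducer/aggregator layer, and the sample rows are made up. It is not how Pinot implements anything.

```python
from collections import Counter


def server_local_join(fact_segment, broadcast_dim):
    """Simulate one server: build a hash table on the small dimension table,
    probe it with the local fact rows, and return a partial aggregate."""
    # Build side: the broadcast dimension table.
    build = {d["product_id"]: d["category"] for d in broadcast_dim}
    partial = Counter()
    # Probe side: the fact rows that live on this server.
    for row in fact_segment:
        category = build.get(row["product_id"])
        if category is not None:
            partial[category] += row["amount"]
    return partial


def reduce_partials(partials):
    """Simulate the reducer: merge the per-server partial aggregates."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values per key
    return dict(total)


# Hypothetical data: the fact table is split across two servers,
# and the whole dimension table is copied to both.
dim_table = [
    {"product_id": 1, "category": "electronics"},
    {"product_id": 2, "category": "furniture"},
]
fact_segments = [
    [{"product_id": 1, "amount": 100}, {"product_id": 2, "amount": 50}],
    [{"product_id": 1, "amount": 25}, {"product_id": 3, "amount": 10}],
]

partials = [server_local_join(seg, dim_table) for seg in fact_segments]
broadcast_result = reduce_partials(partials)
```

The only data that moves is the small dimension table; the fact rows never leave the server that holds them, which is why this approach fits a large fact table joined with small dimension tables.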

**Co-located join for partitioned tables**

Another way is to do a co-located join for partitioned tables. Let's say we are joining tables T1 and T2 on the join key column K. If both tables are partitioned on the join key column with the same uniform partition function, then each node has all the data it needs locally to execute its side of the join, without any data movement (unlike the broadcast join).
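A small sketch may help show why no data movement is needed. It assumes integer join keys and a simple modulo partition function; the helpers `partition` and `local_hash_join` and the sample rows are hypothetical and only simulate the nodes in one process.

```python
def partition(rows, key, num_nodes):
    """Assign each row to a node with the same function for every table.
    Assumes integer keys; modulo stands in for the partition function."""
    parts = [[] for _ in range(num_nodes)]
    for row in rows:
        parts[row[key] % num_nodes].append(row)
    return parts


def local_hash_join(left, right, key):
    """Plain in-memory hash join of two row lists on `key`."""
    build = {}
    for r in right:
        build.setdefault(r[key], []).append(r)
    joined = []
    for l in left:
        for r in build.get(l[key], []):
            joined.append({**l, **r})
    return joined


# Hypothetical tables T1 and T2, both keyed on column "k".
t1 = [{"k": 1, "a": "x"}, {"k": 2, "a": "y"}, {"k": 3, "a": "z"}, {"k": 4, "a": "w"}]
t2 = [{"k": 1, "b": 10}, {"k": 2, "b": 20}, {"k": 3, "b": 30}, {"k": 3, "b": 31}, {"k": 5, "b": 50}]

num_nodes = 2
t1_parts = partition(t1, "k", num_nodes)
t2_parts = partition(t2, "k", num_nodes)

# Each node joins only the partitions it already holds.
colocated_rows = [
    row
    for node in range(num_nodes)
    for row in local_hash_join(t1_parts[node], t2_parts[node], "k")
]

# Reference result: join the unpartitioned tables in one place.
global_rows = local_hash_join(t1, t2, "k")
```

Because matching keys always land on the same node, the union of the per-node results equals the result of joining the full tables, as the comparison between `colocated_rows` and `global_rows` shows.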

There are other known ways (e.g. shuffle) to do a distributed join, with varying degrees of data movement, complexity etc. However, I feel the modeling approach that we adopt, along with any restrictions, should be the first thing to get clarity on, and this will also be determined (to some extent) by what our users currently expect from Pinot in terms of joins.

I feel that limited support for star schema joins (where we try to restrict the number of dimension tables) is a reasonable starting point for joins in Pinot. We can also look at dimension-to-dimension joins, but we need to be more careful when modeling the data for such a scenario.

I will start creating a proposal-cum-design document and share it with the community. Meanwhile, we can use this issue to have some discussion on requirements etc.
