You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2018/07/30 18:15:33 UTC

[GitHub] ilooner opened a new pull request #1408: DRILL-6453: Resolve deadlock when reading from build and probe sides simultaneously in HashJoin

ilooner opened a new pull request #1408: DRILL-6453: Resolve deadlock when reading from build and probe sides simultaneously in HashJoin
URL: https://github.com/apache/drill/pull/1408

# The Problem

Originally hash join sniffed the first data holding batch in the probe and build side. Using the size statistics from both sides, memory calculations were performed in order to determine when to spill data.

The issue with this is that fetching the first data holding batch from both sides can cause a deadlock in the exchange operators. Details of how this can happen have been included by others on the Jira.

# Theory of Operation

## Sniffing Batches

Batch sniffing is done in three phases.

1. Schema sniffing is done in buildSchema()
2. Before executing the build phase we sniff the first data holding build side batch and use the stats to decide the number of partitions and do memory calculations.
3. Before executing the probe phase we sniff the first data holding probe side batch, and use the size statistics to do memory calculations that decide when to spill.

## Memory Estimation

When sniffing the schema for the build and probe side, we may get lucky and get data for the probe side. If this is the case then we cause use the probe side data to estimate the optimal number of partitions to use in the join operator. If we don't have probe side data when computing the number of partitions to use we assume that the incoming probe batches will be less than or equal to the configured batch size.

Since the number of partitions must be configured upfront before we may have probe data, we may get stuck in a situation where we have too many partitions to effectively process the probe side. In order to avoid this scenario we also adjust the number of records in probe side partition batches after we receive data from the probe side.

## Corner cases

While implementing this many corner cases had to be handled.

- Empty build side
- Empty probe side
- Empty probe and build sides
- Getting probe side data when retrieving the probe schema
- Not getting probe side data when retrieving the probe schema

## Testing

I added unit tests for all the corner cases, and have extracted logic for predicting incoming and partition batch sizes into BatchSizePredictorImpl. In unit tests various corner cases are tested by providing mock implementation of BatchSizePredictorImpl.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

With regards,
Apache Git Services