You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@kudu.apache.org by Boris Tyukin <bo...@boristyukin.com> on 2018/10/11 19:13:38 UTC

Multi-level partitions question

Hi guys,
Read this doc
https://kudu.apache.org/docs/schema_design.html#multilevel-partitioning
and I have a question on this particular statement
"Scans on multilevel partitioned tables can take advantage of partition
pruning on any of the levels independently"

Does it mean, that both strategies below would be equivalent in terms of
performance (i.e. minimum scans)

partition by hash(shop_id), hash(customer_id)
vs.
partition by hash(customer_id), hash(shop_id)

60% of the queries are using both shop_id and customer_id but 40% of
queries need to pull all customers for a specific shop_id. And almost never
by customer_id alone (customer_id is not unique across shops and is
assigned per shop).

At the same time, if I partition by customer_id first,  partitions will be
distributed more evenly.

Thanks!
Boris

Re: Multi-level partitions question

Posted by Boris <bo...@gmail.com>.

Got it, appreciate your help, Dan

On Thu, Oct 11, 2018, 18:25 Dan Burkert <da...@cloudera.com> wrote:

> > Just to clarify, are you saying that partition by hash(shop_id),
> hash(customer_id) and partition by hash(customer_id), hash(shop_id) are
> totally equivalent in terms of having tablets being skewed and there is no
> difference at all?
>
> Yes, that's correct, there is no difference at all in any metric,
> including skew, read, or write performance.
>
> > I thought it would be more like with Hive/Impala HDFS, then nested
> partitions are grouped under different parent partitions. Does Kudu store
> them as a bunch of independent files instead and each file will have data
> for the specific hash of shop_id/customer_id?
>
> That's correct, there is no parent/child relationship.  The only minor
> clarification is that the individual hash pairs are a tablet, not a file.
>
> - Dan
>
> On Thu, Oct 11, 2018 at 1:31 PM Boris Tyukin <bo...@boristyukin.com>
> wrote:
>
>> Hi Dan,
>>
>> thanks for replying so quickly. Unfortunately, customer_id per shop is
>> indeed very skewed (some shops have way more customers than others) and I
>> was tempted to do  hash(shop_id, customer_id) but because 40% of queries
>> won't use customer_id, it would cause a full scan. Actually to be accurate,
>> then we join tables in Impala, first join might be limited on shop_id,
>> while other tables will be joined by shop_id and customer_id. Of course,
>> that first table will be very large (billions of rows).
>>
>> Just to clarify, are you saying that partition by hash(shop_id),
>> hash(customer_id) and partition by hash(customer_id), hash(shop_id) are
>> totally equivalent in terms of having tablets being skewed and there is no
>> difference at all?
>>
>> I thought it would be more like with Hive/Impala HDFS, then nested
>> partitions are grouped under different parent partitions. Does Kudu store
>> them as a bunch of independent files instead and each file will have data
>> for the specific hash of shop_id/customer_id?
>>
>> Boris
>>
>> On Thu, Oct 11, 2018 at 4:05 PM Dan Burkert <da...@apache.org>
>> wrote:
>>
>>> Hi Boris,
>>>
>>> The two examples you gave are exactly equivalent; the relative ordering
>>> of hash levels has no effect on query performance, hotspotting, or anything
>>> else.  Given that 60% of your queries don't specify a specific customer_id,
>>> it does make sense to use hash(shop_id), hash(customer_id) instead of
>>> combining them in a single hash level as hash(shop_id, customer_id),
>>> however the trade-off is that the hotspotting resistance isn't as good.  If
>>> the shop_id and customer_id columns aren't skewed to begin with that's not
>>> a concern, though.
>>>
>>> - Dan
>>>
>>> On Thu, Oct 11, 2018 at 12:14 PM Boris Tyukin <bo...@boristyukin.com>
>>> wrote:
>>>
>>>> Hi guys,
>>>> Read this doc
>>>> https://kudu.apache.org/docs/schema_design.html#multilevel-partitioning
>>>> and I have a question on this particular statement
>>>> "Scans on multilevel partitioned tables can take advantage of partition
>>>> pruning on any of the levels independently"
>>>>
>>>> Does it mean, that both strategies below would be equivalent in terms
>>>> of performance (i.e. minimum scans)
>>>>
>>>> partition by hash(shop_id), hash(customer_id)
>>>> vs.
>>>> partition by hash(customer_id), hash(shop_id)
>>>>
>>>> 60% of the queries are using both shop_id and customer_id but 40% of
>>>> queries need to pull all customers for a specific shop_id. And almost never
>>>> by customer_id alone (customer_id is not unique across shops and is
>>>> assigned per shop).
>>>>
>>>> At the same time, if I partition by customer_id first,  partitions will
>>>> be distributed more evenly.
>>>>
>>>> Thanks!
>>>> Boris
>>>>
>>>>
>>>>
>>>>

Re: Multi-level partitions question

Posted by Dan Burkert <da...@apache.org>.

One thing I'll add - you should consider trying out the following schema:

CREATE TABLE orders (
shop_id bigint,
customer_id bigint,
...,
PRIMARY KEY (shop_id, customer_id))
PARTITION BY HASH (shop_id, customer_id) PARTITIONS <n>;

The thinking there is you will get great skew resistance, the only way
there could be skew among hash buckets is if a particular customer at a
particular shop had an outsized number of orders.  In terms of scan
performance, when the query has equality on both shop_id and customer_id
all tablets can be pruned except the one containing the data.  When the
query has just equality on the shop_id there will be no partition pruning,
however the scans on every tablet should be quite fast because they can
take advantage of the PK to only scan the data related to the specific
shop.  Depending on your circumstances this could be ideal; if individual
shops have a lot of data you would actually want it to be parallelized
across many tablets, and therefore be able to take advantage of many
tservers to perform the scan.

- Dan

On Thu, Oct 11, 2018 at 3:25 PM Dan Burkert <da...@cloudera.com> wrote:

> > Just to clarify, are you saying that partition by hash(shop_id),
> hash(customer_id) and partition by hash(customer_id), hash(shop_id) are
> totally equivalent in terms of having tablets being skewed and there is no
> difference at all?
>
> Yes, that's correct, there is no difference at all in any metric,
> including skew, read, or write performance.
>
> > I thought it would be more like with Hive/Impala HDFS, then nested
> partitions are grouped under different parent partitions. Does Kudu store
> them as a bunch of independent files instead and each file will have data
> for the specific hash of shop_id/customer_id?
>
> That's correct, there is no parent/child relationship.  The only minor
> clarification is that the individual hash pairs are a tablet, not a file.
>
> - Dan
>
> On Thu, Oct 11, 2018 at 1:31 PM Boris Tyukin <bo...@boristyukin.com>
> wrote:
>
>> Hi Dan,
>>
>> thanks for replying so quickly. Unfortunately, customer_id per shop is
>> indeed very skewed (some shops have way more customers than others) and I
>> was tempted to do  hash(shop_id, customer_id) but because 40% of queries
>> won't use customer_id, it would cause a full scan. Actually to be accurate,
>> then we join tables in Impala, first join might be limited on shop_id,
>> while other tables will be joined by shop_id and customer_id. Of course,
>> that first table will be very large (billions of rows).
>>
>> Just to clarify, are you saying that partition by hash(shop_id),
>> hash(customer_id) and partition by hash(customer_id), hash(shop_id) are
>> totally equivalent in terms of having tablets being skewed and there is no
>> difference at all?
>>
>> I thought it would be more like with Hive/Impala HDFS, then nested
>> partitions are grouped under different parent partitions. Does Kudu store
>> them as a bunch of independent files instead and each file will have data
>> for the specific hash of shop_id/customer_id?
>>
>> Boris
>>
>> On Thu, Oct 11, 2018 at 4:05 PM Dan Burkert <da...@apache.org>
>> wrote:
>>
>>> Hi Boris,
>>>
>>> The two examples you gave are exactly equivalent; the relative ordering
>>> of hash levels has no effect on query performance, hotspotting, or anything
>>> else.  Given that 60% of your queries don't specify a specific customer_id,
>>> it does make sense to use hash(shop_id), hash(customer_id) instead of
>>> combining them in a single hash level as hash(shop_id, customer_id),
>>> however the trade-off is that the hotspotting resistance isn't as good.  If
>>> the shop_id and customer_id columns aren't skewed to begin with that's not
>>> a concern, though.
>>>
>>> - Dan
>>>
>>> On Thu, Oct 11, 2018 at 12:14 PM Boris Tyukin <bo...@boristyukin.com>
>>> wrote:
>>>
>>>> Hi guys,
>>>> Read this doc
>>>> https://kudu.apache.org/docs/schema_design.html#multilevel-partitioning
>>>> and I have a question on this particular statement
>>>> "Scans on multilevel partitioned tables can take advantage of partition
>>>> pruning on any of the levels independently"
>>>>
>>>> Does it mean, that both strategies below would be equivalent in terms
>>>> of performance (i.e. minimum scans)
>>>>
>>>> partition by hash(shop_id), hash(customer_id)
>>>> vs.
>>>> partition by hash(customer_id), hash(shop_id)
>>>>
>>>> 60% of the queries are using both shop_id and customer_id but 40% of
>>>> queries need to pull all customers for a specific shop_id. And almost never
>>>> by customer_id alone (customer_id is not unique across shops and is
>>>> assigned per shop).
>>>>
>>>> At the same time, if I partition by customer_id first,  partitions will
>>>> be distributed more evenly.
>>>>
>>>> Thanks!
>>>> Boris
>>>>
>>>>
>>>>
>>>>

Re: Multi-level partitions question

Posted by Dan Burkert <da...@cloudera.com>.

> Just to clarify, are you saying that partition by hash(shop_id),
hash(customer_id) and partition by hash(customer_id), hash(shop_id) are
totally equivalent in terms of having tablets being skewed and there is no
difference at all?

Yes, that's correct, there is no difference at all in any metric, including
skew, read, or write performance.

> I thought it would be more like with Hive/Impala HDFS, then nested
partitions are grouped under different parent partitions. Does Kudu store
them as a bunch of independent files instead and each file will have data
for the specific hash of shop_id/customer_id?

That's correct, there is no parent/child relationship.  The only minor
clarification is that the individual hash pairs are a tablet, not a file.

- Dan

On Thu, Oct 11, 2018 at 1:31 PM Boris Tyukin <bo...@boristyukin.com> wrote:

> Hi Dan,
>
> thanks for replying so quickly. Unfortunately, customer_id per shop is
> indeed very skewed (some shops have way more customers than others) and I
> was tempted to do  hash(shop_id, customer_id) but because 40% of queries
> won't use customer_id, it would cause a full scan. Actually to be accurate,
> then we join tables in Impala, first join might be limited on shop_id,
> while other tables will be joined by shop_id and customer_id. Of course,
> that first table will be very large (billions of rows).
>
> Just to clarify, are you saying that partition by hash(shop_id),
> hash(customer_id) and partition by hash(customer_id), hash(shop_id) are
> totally equivalent in terms of having tablets being skewed and there is no
> difference at all?
>
> I thought it would be more like with Hive/Impala HDFS, then nested
> partitions are grouped under different parent partitions. Does Kudu store
> them as a bunch of independent files instead and each file will have data
> for the specific hash of shop_id/customer_id?
>
> Boris
>
> On Thu, Oct 11, 2018 at 4:05 PM Dan Burkert <da...@apache.org> wrote:
>
>> Hi Boris,
>>
>> The two examples you gave are exactly equivalent; the relative ordering
>> of hash levels has no effect on query performance, hotspotting, or anything
>> else.  Given that 60% of your queries don't specify a specific customer_id,
>> it does make sense to use hash(shop_id), hash(customer_id) instead of
>> combining them in a single hash level as hash(shop_id, customer_id),
>> however the trade-off is that the hotspotting resistance isn't as good.  If
>> the shop_id and customer_id columns aren't skewed to begin with that's not
>> a concern, though.
>>
>> - Dan
>>
>> On Thu, Oct 11, 2018 at 12:14 PM Boris Tyukin <bo...@boristyukin.com>
>> wrote:
>>
>>> Hi guys,
>>> Read this doc
>>> https://kudu.apache.org/docs/schema_design.html#multilevel-partitioning
>>> and I have a question on this particular statement
>>> "Scans on multilevel partitioned tables can take advantage of partition
>>> pruning on any of the levels independently"
>>>
>>> Does it mean, that both strategies below would be equivalent in terms of
>>> performance (i.e. minimum scans)
>>>
>>> partition by hash(shop_id), hash(customer_id)
>>> vs.
>>> partition by hash(customer_id), hash(shop_id)
>>>
>>> 60% of the queries are using both shop_id and customer_id but 40% of
>>> queries need to pull all customers for a specific shop_id. And almost never
>>> by customer_id alone (customer_id is not unique across shops and is
>>> assigned per shop).
>>>
>>> At the same time, if I partition by customer_id first,  partitions will
>>> be distributed more evenly.
>>>
>>> Thanks!
>>> Boris
>>>
>>>
>>>
>>>

Re: Multi-level partitions question

Posted by Boris Tyukin <bo...@boristyukin.com>.

Hi Dan,

thanks for replying so quickly. Unfortunately, customer_id per shop is
indeed very skewed (some shops have way more customers than others) and I
was tempted to do  hash(shop_id, customer_id) but because 40% of queries
won't use customer_id, it would cause a full scan. Actually to be accurate,
then we join tables in Impala, first join might be limited on shop_id,
while other tables will be joined by shop_id and customer_id. Of course,
that first table will be very large (billions of rows).

Just to clarify, are you saying that partition by hash(shop_id),
hash(customer_id) and partition by hash(customer_id), hash(shop_id) are
totally equivalent in terms of having tablets being skewed and there is no
difference at all?

I thought it would be more like with Hive/Impala HDFS, then nested
partitions are grouped under different parent partitions. Does Kudu store
them as a bunch of independent files instead and each file will have data
for the specific hash of shop_id/customer_id?

Boris

On Thu, Oct 11, 2018 at 4:05 PM Dan Burkert <da...@apache.org> wrote:

> Hi Boris,
>
> The two examples you gave are exactly equivalent; the relative ordering of
> hash levels has no effect on query performance, hotspotting, or anything
> else.  Given that 60% of your queries don't specify a specific customer_id,
> it does make sense to use hash(shop_id), hash(customer_id) instead of
> combining them in a single hash level as hash(shop_id, customer_id),
> however the trade-off is that the hotspotting resistance isn't as good.  If
> the shop_id and customer_id columns aren't skewed to begin with that's not
> a concern, though.
>
> - Dan
>
> On Thu, Oct 11, 2018 at 12:14 PM Boris Tyukin <bo...@boristyukin.com>
> wrote:
>
>> Hi guys,
>> Read this doc
>> https://kudu.apache.org/docs/schema_design.html#multilevel-partitioning
>> and I have a question on this particular statement
>> "Scans on multilevel partitioned tables can take advantage of partition
>> pruning on any of the levels independently"
>>
>> Does it mean, that both strategies below would be equivalent in terms of
>> performance (i.e. minimum scans)
>>
>> partition by hash(shop_id), hash(customer_id)
>> vs.
>> partition by hash(customer_id), hash(shop_id)
>>
>> 60% of the queries are using both shop_id and customer_id but 40% of
>> queries need to pull all customers for a specific shop_id. And almost never
>> by customer_id alone (customer_id is not unique across shops and is
>> assigned per shop).
>>
>> At the same time, if I partition by customer_id first,  partitions will
>> be distributed more evenly.
>>
>> Thanks!
>> Boris
>>
>>
>>
>>

Re: Multi-level partitions question

Posted by Dan Burkert <da...@apache.org>.

Hi Boris,

The two examples you gave are exactly equivalent; the relative ordering of
hash levels has no effect on query performance, hotspotting, or anything
else.  Given that 60% of your queries don't specify a specific customer_id,
it does make sense to use hash(shop_id), hash(customer_id) instead of
combining them in a single hash level as hash(shop_id, customer_id),
however the trade-off is that the hotspotting resistance isn't as good.  If
the shop_id and customer_id columns aren't skewed to begin with that's not
a concern, though.

- Dan

On Thu, Oct 11, 2018 at 12:14 PM Boris Tyukin <bo...@boristyukin.com> wrote:

> Hi guys,
> Read this doc
> https://kudu.apache.org/docs/schema_design.html#multilevel-partitioning
> and I have a question on this particular statement
> "Scans on multilevel partitioned tables can take advantage of partition
> pruning on any of the levels independently"
>
> Does it mean, that both strategies below would be equivalent in terms of
> performance (i.e. minimum scans)
>
> partition by hash(shop_id), hash(customer_id)
> vs.
> partition by hash(customer_id), hash(shop_id)
>
> 60% of the queries are using both shop_id and customer_id but 40% of
> queries need to pull all customers for a specific shop_id. And almost never
> by customer_id alone (customer_id is not unique across shops and is
> assigned per shop).
>
> At the same time, if I partition by customer_id first,  partitions will be
> distributed more evenly.
>
> Thanks!
> Boris
>
>
>
>