Posted to dev@druid.apache.org by Julian Jaffe <jj...@pinterest.com.INVALID> on 2019/01/07 22:56:44 UTC

Pointers on implementing a new ShardSpec

Hey all,

Are there any major caveats or gotchas I should be aware of when
implementing a new ShardSpec? The context here is that we have a datasource
that is the combined result of multiple input jobs. We're trying to do
write-side joining by having all of the jobs write segments for the same
intervals (e.g. partitioning on both partition number and source pipeline).
For now, I've modified the Spark-Druid batch ingestor (
https://github.com/metamx/druid-spark-batch) to run in our various
pipelines and to write out segments with identifier form
`dataSource_startInterval_endInterval_version_sourceName_partitionNum`. This
is working without issue for loading, querying, and deleting data, but the
metadata API reports the incorrect segment identifier, since it
reconstructs the identifier instead of reading from metadata (e.g. it
reports segment identifiers of the form
`dataSource_startInterval_endInterval_version_partitionNum`). Both because
we'd like this to be fully supported, and because we imagine that this
feature may be useful to others, I'd like to implement this via a ShardSpec.
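As a concrete illustration of the identifier scheme described above, the extra component could be assembled like this (the class and field names are hypothetical, not Druid's actual `SegmentId` API; intervals are kept as plain ISO-8601 strings for brevity):

```java
// Hypothetical sketch of a segment identifier that adds a "sourceName"
// component between the version and the partition number. Not Druid's
// actual SegmentId class; names and layout are illustrative only.
final class SourceAwareSegmentId {
    private final String dataSource;
    private final String startInterval; // ISO-8601 instant, as a plain string
    private final String endInterval;
    private final String version;
    private final String sourceName;    // identifies the producing pipeline
    private final int partitionNum;

    SourceAwareSegmentId(String dataSource, String startInterval, String endInterval,
                         String version, String sourceName, int partitionNum) {
        this.dataSource = dataSource;
        this.startInterval = startInterval;
        this.endInterval = endInterval;
        this.version = version;
        this.sourceName = sourceName;
        this.partitionNum = partitionNum;
    }

    // dataSource_startInterval_endInterval_version_sourceName_partitionNum
    String toIdentifier() {
        return String.join("_", dataSource, startInterval, endInterval,
                version, sourceName, String.valueOf(partitionNum));
    }
}
```

The metadata API problem described above is exactly a mismatch between an identifier built this way and one rebuilt without the `sourceName` component.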

Julian

Re: Pointers on implementing a new ShardSpec

Posted by Gian Merlino <gi...@apache.org>.
IMO, if you think it makes sense to contribute an interface for it too (some
option that triggers it), then I'd say write up a short proposal (motivation /
proposed change, as a GitHub issue) as the first step towards contributing. I
don't think I understand it well enough from your description, so an issue
would help with that. I also don't feel that we've totally decided on what the
process should be there, but that's the direction the conversation seems to be
going in the other thread about how new changes should start.

On Fri, Jan 11, 2019 at 1:26 PM Julian Jaffe <jj...@pinterest.com.invalid>
wrote:

> The new shard spec

Re: Pointers on implementing a new ShardSpec

Posted by Julian Jaffe <jj...@pinterest.com.INVALID>.
The new shard spec

On Fri, Jan 11, 2019 at 8:37 AM Gian Merlino <gi...@apache.org> wrote:

> Do you mean the modifications to the metamx/druid-spark-batch project or
> the new ShardSpec you're working on?

Re: Pointers on implementing a new ShardSpec

Posted by Gian Merlino <gi...@apache.org>.
Do you mean the modifications to the metamx/druid-spark-batch project or
the new ShardSpec you're working on?

On Thu, Jan 10, 2019 at 3:09 PM Julian Jaffe <jj...@pinterest.com.invalid>
wrote:

> Thanks for the detailed pointers, Gian! In light of the ongoing discussion
> around on-list development, does this seem like something that's worthwhile
> to anyone else in the community?

Re: Pointers on implementing a new ShardSpec

Posted by Julian Jaffe <jj...@pinterest.com.INVALID>.
Thanks for the detailed pointers, Gian! In light of the ongoing discussion
around on-list development, does this seem like something that's worthwhile
to anyone else in the community?

On Tue, Jan 8, 2019 at 10:32 AM Gian Merlino <gi...@apache.org> wrote:


Re: Pointers on implementing a new ShardSpec

Posted by Gian Merlino <gi...@apache.org>.
Hey Julian,

There aren't any gotchas that I can think of other than the fact that they
are not super well documented, and you might miss some features if you're
just skimming the code. A couple of points that might matter:

1) PartitionChunk is what allows a shard spec to contribute to the
definition of whether a set of segments for a time chunk is "complete".
It's an important concept since the broker will not query segment sets
unless the chunk is complete. The way the completeness check works is
basically that the broker will get all the ShardSpecs for all the segments
in a time chunk, order them by partitionNum, generate the partition chunks,
and check that (a) the first one is a starter, based on "isStart", and (b)
each subsequent one abuts the previous one [based on "abuts"], with the last
one terminating the set [based on "isEnd"]. Some ShardSpecs use
nonsensical-at-first-glance logic in these methods to short-circuit the
completeness check: time chunks with LinearShardSpecs are _always_ considered
complete, and time chunks with NumberedShardSpecs can have "partitionNum" go
beyond "partitions" and are considered complete once the first "partitions"
segments are present.
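The check described above can be sketched as follows. This is a simplified model, assuming a minimal `Chunk` interface: the real `PartitionChunk` is generic over the held object, and the broker's actual completeness logic lives elsewhere, so treat this as an illustration of the contract rather than Druid's code:

```java
import java.util.List;

// Minimal stand-in for Druid's PartitionChunk completeness contract.
interface Chunk {
    boolean isStart();          // may this chunk begin a complete set?
    boolean isEnd();            // may this chunk terminate a complete set?
    boolean abuts(Chunk prev);  // does this chunk directly follow prev?
}

// A NumberedShardSpec-style chunk: chunks 0..partitions-1 form a complete set.
// (The real NumberedShardSpec also handles partitionNum beyond "partitions",
// which this simplified version does not model.)
final class NumberedChunk implements Chunk {
    final int partitionNum;
    final int partitions;

    NumberedChunk(int partitionNum, int partitions) {
        this.partitionNum = partitionNum;
        this.partitions = partitions;
    }

    public boolean isStart() { return partitionNum == 0; }
    public boolean isEnd()   { return partitionNum == partitions - 1; }
    public boolean abuts(Chunk prev) {
        return prev instanceof NumberedChunk
                && ((NumberedChunk) prev).partitionNum == partitionNum - 1;
    }
}

final class Completeness {
    // Chunks are assumed sorted by partitionNum, as the broker would sort them.
    static boolean isComplete(List<? extends Chunk> chunks) {
        if (chunks.isEmpty() || !chunks.get(0).isStart()) {
            return false;
        }
        for (int i = 1; i < chunks.size(); i++) {
            if (!chunks.get(i).abuts(chunks.get(i - 1))) {
                return false;
            }
        }
        return chunks.get(chunks.size() - 1).isEnd();
    }
}
```

A LinearShardSpec-style chunk would simply return true from all three methods, which is what makes every such time chunk complete by construction.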

2) "getDomainDimensions" and "possibleInDomain" are optional, but useful
for powering broker-side segment pruning.
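For point 2, a source-aware spec might look roughly like this. One hedge up front: Druid's real `possibleInDomain` takes a `Map<String, RangeSet<String>>` (Guava), but a plain `Set` of allowed values stands in here to keep the sketch dependency-free, and the `sourceName` dimension is the hypothetical addition under discussion, not an existing Druid field:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical source-aware shard spec that lets the broker prune segments
// whose sourceName cannot match the query's filter domain. Simplified: the
// real ShardSpec#possibleInDomain takes Guava RangeSets, not plain Sets.
final class SourceNameShardSpec {
    private final String sourceName;

    SourceNameShardSpec(String sourceName) {
        this.sourceName = sourceName;
    }

    // The dimensions this spec knows how to prune on.
    List<String> getDomainDimensions() {
        return List.of("sourceName");
    }

    // Returning false tells the broker it can skip this segment entirely.
    boolean possibleInDomain(Map<String, Set<String>> domain) {
        Set<String> allowed = domain.get("sourceName");
        // No constraint on sourceName means the segment might still match.
        return allowed == null || allowed.contains(sourceName);
    }
}
```

A query filtered to a single source pipeline would then fan out only to segments written by that pipeline.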

3) All segments in the same time chunk must have the same kind of
ShardSpec. However, it can vary from time chunk to time chunk within a
datasource.

On Mon, Jan 7, 2019 at 2:56 PM Julian Jaffe <jj...@pinterest.com.invalid>
wrote:
