Posted to dev@druid.apache.org by Gian Merlino <gi...@imply.io> on 2018/03/30 21:12:03 UTC

Re: [druid-dev] Load rule doesn't honor intervals properly

Hi Pala,

That sounds like a bug to me - a patch would be welcome!

Btw, since we are trying to migrate the dev mailing list to Apache, please
cross-post this sort of thing to dev@druid.incubator.apache.org, or even
post only to that list.

Gian

On Thu, Mar 29, 2018 at 5:43 PM, 'Pala Muthiah' via Druid Development <
druid-development@googlegroups.com> wrote:

> Hello folks,
>
> Anybody have insight on the below? Curious to know if there would be
> unforeseen side effects if we count even partial overlap as valid.
>
> On Mon, Mar 19, 2018 at 10:54 AM, pala.muthiah via Druid Development <
> druid-development@googlegroups.com> wrote:
>
>> Hi,
>>
>> In our deployment, we enabled background segment merging and found that
>> some of the data within the load period was actually getting dropped.
>>
>> My suspicion was that when a merged segment only partially overlaps the
>> load period (e.g., the rule says keep data from Jan 1st onwards, and I have
>> a segment that spans Dec 25th - Jan 2nd), that segment should be kept for
>> correctness, but the current implementation seems to drop it.
>>
>> I checked the code and found that Rules.eligibleForLoad() indeed only keeps
>> segments that overlap fully.
>>
>> Is this a bug, or is there another reason behind this? In our case, we do
>> have data sources that are highly aggregated, so a single segment could
>> span a month, for example.
>>
>> I can submit a patch but wanted to get proper context.
>>
>>
>> Thanks,
>> pala
>>
>> --
>> You received this message because you are subscribed to a topic in the
>> Google Groups "Druid Development" group.
>> To unsubscribe from this topic, visit
>> https://groups.google.com/d/topic/druid-development/QYMhjGup2RI/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to
>> druid-development+unsubscribe@googlegroups.com.
>> To post to this group, send email to druid-development@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/druid-development/d8e3df1d-699e-4747-a681-ebba91327388%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
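
The difference between "overlaps fully" and "overlaps partially" described in the quoted report can be illustrated with a small sketch. This is not Druid's code: the Interval class, its method names, the year 2018, and the rule's end date are all assumptions made for illustration, and the internals of Rules.eligibleForLoad() are not reproduced here. Intervals are treated as half-open, [start, end).

```java
import java.time.LocalDate;

public class LoadRuleOverlapSketch {

    /** Hypothetical half-open interval [start, end); a stand-in, not Druid's interval type. */
    static final class Interval {
        final LocalDate start;
        final LocalDate end;

        Interval(LocalDate start, LocalDate end) {
            this.start = start;
            this.end = end;
        }

        /** True only when {@code other} lies entirely inside this interval. */
        boolean contains(Interval other) {
            return !other.start.isBefore(start) && !other.end.isAfter(end);
        }

        /** True when the two intervals share at least one day. */
        boolean overlaps(Interval other) {
            return start.isBefore(other.end) && other.start.isBefore(end);
        }
    }

    public static void main(String[] args) {
        // Assumed rule: keep data from Jan 1st onwards (end date chosen arbitrarily).
        Interval rule = new Interval(LocalDate.of(2018, 1, 1), LocalDate.of(2019, 1, 1));
        // Assumed merged segment spanning Dec 25th - Jan 2nd.
        Interval segment = new Interval(LocalDate.of(2017, 12, 25), LocalDate.of(2018, 1, 2));

        // A containment check rejects the segment, so its Jan 1st data would be dropped.
        System.out.println("contains: " + rule.contains(segment));
        // An overlap check accepts it, keeping the data that falls inside the load period.
        System.out.println("overlaps: " + rule.overlaps(segment));
    }
}
```

Under these assumptions the containment check returns false and the overlap check returns true, which matches the behavior the report describes: a check based on full containment discards a merged segment that still holds data inside the load period.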

Re: [druid-dev] Load rule doesn't honor intervals properly

Posted by Gian Merlino <gi...@imply.io>.
Looks like it is under review right now. Thanks for the patch.

Gian

On Sun, Apr 8, 2018 at 8:04 PM, 'Pala Muthiah' via Druid Development <
druid-development@googlegroups.com> wrote:

> Hi Gian,
>
> Thanks for following up. I have submitted a patch:
> https://github.com/druid-io/druid/pull/5595.
>
> Whoever is the right owner, please take a look - let me know if I should @
> a specific person and I can do that.
>
>
> Thanks,
> pala

Re: [druid-dev] Load rule doesn't honor intervals properly

Posted by Pala Muthiah <pa...@airbnb.com.INVALID>.
Hi Gian,

Thanks for following up. I have submitted a patch:
https://github.com/druid-io/druid/pull/5595.

Whoever is the right owner, please take a look - let me know if I should @ a
specific person and I can do that.


Thanks,
pala


