Posted to dev@kylin.apache.org by Mario Copperfield <xw...@gmail.com> on 2016/09/13 07:11:17 UTC

Fwd: About Kafka in Kylin

Dear all,
       I am using the Kylin streaming build, and while reading the code for
this module, I found that Kylin uses binary search to find the offset
closest to the start timestamp. Does that work if the data in Kafka is
not ordered?
       Thanks, and I look forward to your replies.


-- 
Best regards,
Amuro Copperfield

Re: Fwd: About Kafka in Kylin

Posted by Mario Copperfield <xw...@gmail.com>.
OK, I'll try

On 9/13/16, ShaoFeng Shi <sh...@apache.org> wrote:
> Sorry, one correction:
>
>
> "if some messages arrive later than the margin, they will not be lost"
> should be:
> "if some messages arrive later than the margin, they will be lost"
>
> Mario, you can try setting a larger margin value to reduce the likelihood.
>


-- 
Best regards,
Amuro Copperfield

Re: Fwd: About Kafka in Kylin

Posted by ShaoFeng Shi <sh...@apache.org>.
Sorry, one correction:


"if some messages arrive later than the margin, they will not be lost"
should be:
"if some messages arrive later than the margin, they will be lost"

Mario, you can try setting a larger margin value to reduce the likelihood.

2016-09-13 22:05 GMT+08:00 ShaoFeng Shi <sh...@apache.org>:

> In 1.5.x streaming OLAP, Kylin uses a timestamp range to seek the
> start/end offsets in Kafka via binary search. It allows a margin
> window, but if some messages arrive later than the margin, they will not
> be lost;
>
> Now we're working on a new implementation, which will strictly use offsets
> to fetch the new messages each time, so no messages will be lost.
>
>


-- 
Best regards,

Shaofeng Shi

Re: Fwd: About Kafka in Kylin

Posted by ShaoFeng Shi <sh...@apache.org>.
In 1.5.x streaming OLAP, Kylin uses a timestamp range to seek the start/end
offsets in Kafka via binary search. It allows a margin window, but if
some messages arrive later than the margin, they will not be lost;

Now we're working on a new implementation, which will strictly use offsets
to fetch the new messages each time, so no messages will be lost.
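
To make the margin behaviour concrete, here is a minimal Python sketch of a
timestamp-range seek. It is not Kylin's code: the function names, the sample
timestamps, and the way the margin widens the searched range are all
assumptions made for illustration. It only shows why a message that arrives
later than the margin can fall outside the scanned offsets.

```python
from bisect import bisect_left

# Hypothetical event timestamps indexed by offset in one partition.
# Offset 9 carries timestamp 107, i.e. it arrived very late.
timestamps = [100, 102, 101, 105, 111, 108, 115, 120, 125, 107, 130, 135]


def seek_offset(ts_by_offset, target_ts):
    """Binary search for the first offset with timestamp >= target_ts.

    Only reliable if timestamps grow (roughly) with the offset."""
    return bisect_left(ts_by_offset, target_ts)


def build_window(ts_by_offset, start_ts, end_ts, margin):
    """Scan the offsets between the two seeks, widened by `margin`.

    Returns (picked, missed): offsets whose timestamp lies in
    [start_ts, end_ts) that were inside / outside the scanned range."""
    lo = seek_offset(ts_by_offset, start_ts - margin)
    hi = seek_offset(ts_by_offset, end_ts + margin)

    def in_window(ts):
        return start_ts <= ts < end_ts

    picked = [o for o in range(lo, hi) if in_window(ts_by_offset[o])]
    missed = [o for o, ts in enumerate(ts_by_offset)
              if in_window(ts) and not lo <= o < hi]
    return picked, missed


small = build_window(timestamps, 100, 110, margin=5)
large = build_window(timestamps, 100, 110, margin=20)
print(small)  # offset 9 is missed with the small margin
print(large)  # the larger margin happens to cover it
```

With this sample data the small margin misses offset 9, while the larger
margin picks it up. A larger margin only reduces the chance of loss; it does
not remove it.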


2016-09-13 15:53 GMT+08:00 Billy(Yiming) Liu <li...@gmail.com>:

> The current design is still an experimental approach. Kafka cannot
> guarantee global order, so we have to find another solution. The new
> Streaming OLAP design will rely on the Kafka partition order
> instead of the app timestamp. The code is still on the KYLIN-1726 branch.
>



-- 
Best regards,

Shaofeng Shi

Re: Fwd: About Kafka in Kylin

Posted by "Billy(Yiming) Liu" <li...@gmail.com>.
The current design is still an experimental approach. Kafka cannot
guarantee global order, so we have to find another solution. The new
Streaming OLAP design will rely on the Kafka partition order
instead of the app timestamp. The code is still on the KYLIN-1726 branch.
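
As a rough illustration of relying on partition order, the Python sketch
below tracks a committed offset per partition and consumes everything from
that offset up to the current end of the partition on each build. It is not
the KYLIN-1726 code; the in-memory "partitions" and all function names are
hypothetical stand-ins for a real Kafka client.

```python
def next_build_ranges(committed, end_offsets):
    """Per partition, the half-open offset range [start, end) to read next."""
    ranges = {}
    for partition, end in end_offsets.items():
        start = committed.get(partition, 0)
        if end > start:
            ranges[partition] = (start, end)
    return ranges


def run_build(partitions, committed):
    """Consume every not-yet-read message and return the new checkpoints.

    `partitions` maps a partition id to its list of messages in offset order."""
    end_offsets = {p: len(msgs) for p, msgs in partitions.items()}
    ranges = next_build_ranges(committed, end_offsets)
    consumed = []
    for partition, (start, end) in ranges.items():
        consumed.extend(partitions[partition][start:end])
    new_committed = dict(committed)
    for partition, (_, end) in ranges.items():
        new_committed[partition] = end
    return consumed, new_committed


partitions = {0: ["a", "b"], 1: ["c"]}
first, checkpoint = run_build(partitions, {})

# A message with an old event timestamp shows up later; it still gets the
# next offset in its partition, so the following build picks it up.
partitions[0].append("late")
second, checkpoint = run_build(partitions, checkpoint)
print(first, second, checkpoint)
```

Because each build resumes exactly where the previous one stopped, a late
event lands in a later build instead of being dropped, regardless of its
timestamp.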

2016-09-13 15:46 GMT+08:00 Mario Copperfield <xw...@gmail.com>:

> OK, Thank you
>



-- 
With Warm regards

Yiming Liu (刘一鸣)

Re: Fwd: About Kafka in Kylin

Posted by Mario Copperfield <xw...@gmail.com>.
OK, Thank you

On Tue, Sep 13, 2016 at 3:27 PM, Sarnath K <st...@gmail.com> wrote:

> Yes, that's true. If you are looking at an app timestamp (event origin
> time), then we can't binary search on it, though binary search may be a
> good approximation for the common case.
> I'm not sure what Kylin is designed for. Let's wait to hear from the experts!
>



-- 
Best regards,
Amuro Copperfield

Re: Fwd: About Kafka in Kylin

Posted by Sarnath K <st...@gmail.com>.
Yes, that's true. If you are looking at an app timestamp (event origin
time), then we can't binary search on it, though binary search may be a
good approximation for the common case.
I'm not sure what Kylin is designed for. Let's wait to hear from the experts!

On Sep 13, 2016 12:49, "Mario Copperfield" <xw...@gmail.com> wrote:

> It's true that data appears in order in Kafka, but that does not guarantee
> that the timestamps of the data are ordered. In fact, in real time they
> always arrive out of order.
>

Re: Fwd: About Kafka in Kylin

Posted by Mario Copperfield <xw...@gmail.com>.
It's true that data appears in order in Kafka, but that does not guarantee
that the timestamps of the data are ordered. In fact, in real time they
always arrive out of order.
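
A small Python sketch of this point, using made-up timestamps: offsets
increase strictly, but the event timestamps attached to them do not, so a
binary search over timestamps can land on the wrong offset. The data and
function names are purely illustrative.

```python
from bisect import bisect_left

# Hypothetical partition: list index = offset, value = event timestamp
# set by the producing application.
timestamps_by_offset = [100, 110, 102, 103, 101, 120]


def is_sorted(values):
    """True if the values never decrease from one offset to the next."""
    return all(a <= b for a, b in zip(values, values[1:]))


def binary_first_at_or_after(timestamps, target):
    """Binary search; correct only when `timestamps` is sorted."""
    return bisect_left(timestamps, target)


def linear_first_at_or_after(timestamps, target):
    """Full scan; correct for any order, but O(n)."""
    for offset, ts in enumerate(timestamps):
        if ts >= target:
            return offset
    return len(timestamps)


print(is_sorted(timestamps_by_offset))
print(binary_first_at_or_after(timestamps_by_offset, 105))
print(linear_first_at_or_after(timestamps_by_offset, 105))
```

On this data the scan finds the message with timestamp 110 at offset 1,
while the binary search skips past it and returns offset 5.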

On Tue, Sep 13, 2016 at 3:14 PM, Sarnath K <st...@gmail.com> wrote:

> I'm not sure what Kylin does, but I know that data appears in order
> within a Kafka broker. The consumer, however, can consume in any order it
> likes, so which offsets are read is driven by the consumer rather than
> by Kafka.
> I'm sharing this based on my preliminary understanding of how Kafka works.
> Best,
> Sarnath
>



-- 
Best regards,
Amuro Copperfield

Re: Fwd: About Kafka in Kylin

Posted by Sarnath K <st...@gmail.com>.
I'm not sure what Kylin does, but I know that data appears in order
within a Kafka broker. The consumer, however, can consume in any order it
likes, so which offsets are read is driven by the consumer rather than
by Kafka.
I'm sharing this based on my preliminary understanding of how Kafka works.
Best,
Sarnath

On Sep 13, 2016 12:41, "Mario Copperfield" <xw...@gmail.com> wrote:

> Dear all,
>        I am using the Kylin streaming build, and while reading the code for
> this module, I found that Kylin uses binary search to find the offset
> closest to the start timestamp. Does that work if the data in Kafka is
> not ordered?
>        Thanks, and I look forward to your replies.
>
>
> --
> Best regards,
> Amuro Copperfield
>