Posted to dev@kafka.apache.org by Manjunath Shivakumar <Ma...@betfair.com> on 2014/09/02 19:21:26 UTC

Offset Request with timestamp

Hi,

My use case is to fetch the offsets for a given topic as of X milliseconds ago.
If I use the Offset API

https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetAPI

to do this and pass in a timestamp of (now() - X), I get the earliest offset in the current log segment rather than the offset from X milliseconds ago.

Is this the correct usage, and is this the expected behaviour?

Thanks,
Manju

________________________________________________________________________
In order to protect our email recipients, Betfair Group use SkyScan from 
MessageLabs to scan all Incoming and Outgoing mail for viruses.

________________________________________________________________________

Re: Offset Request with timestamp

Posted by Neha Narkhede <ne...@gmail.com>.
I added this question to the FAQ, as it comes up frequently:
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIaccuratelygetoffsetsofmessagesforacertaintimestampusingOffsetFetchRequest?


On Tue, Sep 2, 2014 at 1:48 PM, Guozhang Wang <wa...@gmail.com> wrote:

> The semantics of the Offset API are to "return the latest possible offset of
> the message that is appended no later than the given timestamp". In the
> implementation, it returns the starting offset of the log segment that was
> created no later than the given timestamp. Hence, if your log segment
> contains data spanning a long period of time, the Offset API may return
> just the starting offset of the current log segment.
>
> If your traffic is low and you still want a finer-grained offset
> response, you can try reducing the log segment size (it defaults to 1 GB).
> However, doing so will make log segments roll more frequently, which
> increases the number of open file handles.
>
> Guozhang

Re: Offset Request with timestamp

Posted by Guozhang Wang <wa...@gmail.com>.
The semantics of the Offset API are to "return the latest possible offset of
the message that is appended no later than the given timestamp". In the
implementation, it returns the starting offset of the log segment that was
created no later than the given timestamp. Hence, if your log segment
contains data spanning a long period of time, the Offset API may return
just the starting offset of the current log segment.
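
A minimal sketch of the segment-granularity lookup described above, written
as a simplified Python model and not taken from the broker's code. The
Segment class, its created_ms field, and offset_before are hypothetical names
introduced for illustration, and the special "latest" and "earliest"
timestamps are not handled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    base_offset: int  # offset of the first message in the segment
    created_ms: int   # assumed: time the segment was created (epoch ms)


def offset_before(segments: List[Segment], timestamp_ms: int) -> Optional[int]:
    """Return the base offset of the newest segment created no later than
    timestamp_ms, or None if every segment is newer than the timestamp."""
    candidates = [s for s in segments if s.created_ms <= timestamp_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.created_ms).base_offset


# One long-lived segment: any timestamp after its creation maps to offset 0.
coarse = [Segment(base_offset=0, created_ms=1_000)]
# The same log split into smaller segments gives finer-grained answers.
fine = [Segment(0, 1_000), Segment(500, 2_000), Segment(900, 3_000)]

print(offset_before(coarse, 2_500))  # 0
print(offset_before(fine, 2_500))    # 500
```

In this model the answer can only ever be a segment's starting offset, which
is why a single large segment yields the same offset for a wide range of
timestamps.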

If your traffic is low and you still want a finer-grained offset
response, you can try reducing the log segment size (it defaults to 1 GB).
However, doing so will make log segments roll more frequently, which
increases the number of open file handles.
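
As a rough back-of-the-envelope sketch of this trade-off: assuming segments
roll on size alone and traffic is steady (both simplifications), the time span
covered by one segment, and so the granularity of the offset response, is
roughly the segment size divided by the write rate. The function name and the
traffic figure below are illustrative assumptions, not measured values.

```python
def segment_span_seconds(segment_bytes: int, bytes_per_second: float) -> float:
    """Approximate wall-clock time one segment covers under steady traffic."""
    if bytes_per_second <= 0:
        raise ValueError("bytes_per_second must be positive")
    return segment_bytes / bytes_per_second


GIB = 1024 ** 3
MIB = 1024 ** 2
rate = 10 * 1024  # hypothetical low-traffic topic: 10 KiB/s

# Compare the default-sized segment with a smaller, hypothetical 64 MiB one.
for size in (GIB, 64 * MIB):
    hours = segment_span_seconds(size, rate) / 3600
    print(f"{size // MIB} MiB segment covers about {hours:.1f} hours")
```

The smaller the segment, the shorter the time window each one covers, at the
cost of more segment files on disk.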

Guozhang

