Posted to dev@kafka.apache.org by srimugunthan dhandapani <sr...@gmail.com> on 2018/01/22 14:59:51 UTC

offsetsForTimes API performance

 Hi all,

We use Kafka as our data store, and every one of our records is associated
with a timestamp. We pull data from Kafka by seeking to the offset for a
given timestamp each time and then fetching the records by polling. We use
KafkaConsumer's offsetsForTimes API (
https://kafka.apache.org/0110/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#offsetsForTimes(java.util.Map)
) to find the offset for a particular timestamp and then seek to it.
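
For reference, the KafkaConsumer Javadoc describes the offset returned for
each partition as the earliest offset whose timestamp is greater than or
equal to the requested timestamp. The sketch below models only that rule
over in-memory arrays, using the JDK alone. It is not the broker's
implementation; the class name, method name and sample data are made up for
illustration, and it assumes timestamps never decrease as offsets grow,
which real topics do not guarantee.

```java
import java.util.OptionalLong;

public class TimestampLookupSketch {

    /**
     * Returns the earliest offset whose timestamp is >= targetMs, or empty
     * if every timestamp is smaller than targetMs.
     * Assumption: timestampsMs[i] belongs to offsets[i], and timestampsMs
     * is sorted in non-decreasing order.
     */
    public static OptionalLong earliestOffsetAtOrAfter(long[] timestampsMs,
                                                       long[] offsets,
                                                       long targetMs) {
        int lo = 0;
        int hi = timestampsMs.length; // search window is [lo, hi)
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (timestampsMs[mid] < targetMs) {
                lo = mid + 1; // everything up to mid is too old
            } else {
                hi = mid;     // mid qualifies, but an earlier entry might too
            }
        }
        return lo == timestampsMs.length
                ? OptionalLong.empty()
                : OptionalLong.of(offsets[lo]);
    }

    public static void main(String[] args) {
        // Hypothetical sample data: four records, two sharing a timestamp.
        long[] timestampsMs = {1000L, 1500L, 1500L, 2200L};
        long[] offsets = {0L, 1L, 2L, 3L};

        System.out.println(earliestOffsetAtOrAfter(timestampsMs, offsets, 1500L));
        System.out.println(earliestOffsetAtOrAfter(timestampsMs, offsets, 3000L));
    }
}
```

With the real consumer, the "no such record" case shows up as a null entry
for that partition in the returned map, so the result should be checked
before calling seek.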

We see that calling offsetsForTimes and then seeking to the returned
offset takes anywhere from 17 milliseconds to 500 milliseconds per iteration.
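
One way to narrow down where the 17 to 500 ms goes is to time the two steps
separately. As far as the Javadoc describes it, offsetsForTimes is a
blocking call to the brokers, while seek only records the position to use
on the next poll, so most of the time would be expected in the first step.
A minimal JDK-only sketch of such a measurement follows. StepTimer and both
placeholder lambdas are hypothetical stand-ins; in a real test they would
wrap consumer.offsetsForTimes(...) and consumer.seek(...).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.LockSupport;

public class StepTimer {

    /** Runs the step the given number of times; returns durations in ns, sorted ascending. */
    public static List<Long> sample(Runnable step, int iterations) {
        List<Long> nanos = new ArrayList<>(iterations);
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            step.run();
            nanos.add(System.nanoTime() - start);
        }
        Collections.sort(nanos);
        return nanos;
    }

    /** Nearest-rank percentile (0 < p <= 100) of a non-empty list sorted ascending. */
    public static long percentile(List<Long> sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1);
    }

    public static void main(String[] args) {
        // Placeholder for the lookup step: pauses for roughly 2 ms to
        // imitate a network round trip. Replace with the real call.
        Runnable lookup = () -> LockSupport.parkNanos(2_000_000L);
        // Placeholder for the seek step: does nothing.
        Runnable seek = () -> { };

        List<Long> lookupNanos = sample(lookup, 50);
        List<Long> seekNanos = sample(seek, 50);

        System.out.printf("lookup p50=%d us, p99=%d us%n",
                percentile(lookupNanos, 50) / 1_000, percentile(lookupNanos, 99) / 1_000);
        System.out.printf("seek   p50=%d us, p99=%d us%n",
                percentile(seekNanos, 50) / 1_000, percentile(seekNanos, 99) / 1_000);
    }
}
```

Reporting percentiles rather than a single average makes it easier to tell
whether the 500 ms figure is typical or an occasional outlier.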

I would like to know whether anybody has done any performance testing of
the offsetsForTimes API, and what the performance of the API depends on.
Will the API be slower if there is more data in Kafka?


thanks,
mugunthan

Re: offsetsForTimes API performance

Posted by srimugunthan dhandapani <sr...@gmail.com>.
Does the performance of the Kafka APIs (
https://kafka.apache.org/0110/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html
) depend on how far apart geographically the caller of the API is from the
Kafka cluster?
Do all API calls perform faster if they are made from a machine co-located
with the Kafka cluster?


On Mon, Jan 22, 2018 at 8:54 PM, Andrew Otto <ot...@wikimedia.org> wrote:

> Speaking of, has there been any talk of combining those two requests into a
> single API call?  I’d assume that offsetsForTimes + consumer seek is
> probably the most common use case of offsetsForTimes.  Maybe a round trip
> could be avoided if the broker could just auto-assign the consumer to the
> offset for a timestamp.
>
>
> On Mon, Jan 22, 2018 at 9:59 AM, srimugunthan dhandapani <
> srimugunthan.dhandapani@gmail.com> wrote:
>
> >  Hi all,
> >
> > We use kafka as our store and  every one of our record is associated with a
> > timeStamp. We pull data from kafka by seeking to a timeStamp offset
> > everytime and then get the records by polling. We use KafkaConsumer's
> > offsetsForTimes (
> > https://kafka.apache.org/0110/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#offsetsForTimes(java.util.Map)
> >
> > ) API to  find offset and seek to a particular time offset.
> >
> >  We see that using the offsetsForTimes API and the subsequent seek to the
> > offset takes anything from 17 milliseconds to 500millisec per iteration.
> >
> > I would like to know if anybody has done any performance testing of the
> > offsetsForTimes API and what does the performance of the API depend on?
> > Will the API be slower if there is more data in the kafka?
> >
> >
> > thanks,
> > mugunthan
> >
>

Re: offsetsForTimes API performance

Posted by Andrew Otto <ot...@wikimedia.org>.
Speaking of, has there been any talk of combining those two requests into a
single API call?  I’d assume that offsetsForTimes + consumer seek is
probably the most common use case of offsetsForTimes.  Maybe a round trip
could be avoided if the broker could just auto-assign the consumer to the
offset for a timestamp.


On Mon, Jan 22, 2018 at 9:59 AM, srimugunthan dhandapani <
srimugunthan.dhandapani@gmail.com> wrote:

>  Hi all,
>
> We use kafka as our store and  every one of our record is associated with a
> timeStamp. We pull data from kafka by seeking to a timeStamp offset
> everytime and then get the records by polling. We use KafkaConsumer's
> offsetsForTimes (
> https://kafka.apache.org/0110/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#offsetsForTimes(java.util.Map)
>
> ) API to  find offset and seek to a particular time offset.
>
>  We see that using the offsetsForTimes API and the subsequent seek to the
> offset takes anything from 17 milliseconds to 500millisec per iteration.
>
> I would like to know if anybody has done any performance testing of the
> offsetsForTimes API and what does the performance of the API depend on?
> Will the API be slower if there is more data in the kafka?
>
>
> thanks,
> mugunthan
>
