Posted to dev@kafka.apache.org by Vadim Chekan <ko...@gmail.com> on 2014/09/25 20:40:56 UTC

Can Fetch Request cause delay of Produce Request?

Hi all,

I'm working on my own Kafka client implementation and I noticed a strange
situation.

Time (s) | Action
0.0      | MetadataRequest
0.0      | MetadataResponse
0.2      | OffsetRequest
0.2      | OffsetResponse
0.3      | FetchRequest(MaxWaitTime=20sec)
6.0      | ProduceRequest
31.0     | FetchResponse
^^^^ notice the 25 sec gap!
31.0     | ProduceResponse

The important point: it seems to me that a Fetch Request with a long poll
blocks processing of all subsequent requests on the same connection for the
duration of the timeout, as long as there is no new data. But new data *are*
flowing in: the Produce Request is waiting right behind the Fetch Request in
the server-side processing queue.
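
The blocking described above can be sketched with a small model. This is an
illustration only, not Kafka code: it assumes the server handles one request
per connection at a time, in arrival order, and holds an unsatisfied long
poll for its full wait time. The names `Request`, `sent_ms`, `hold_ms` and
`in_order_response_times` are made up for the example, and the numbers are
not meant to reproduce the 31.0 s figure from the timeline.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    sent_ms: int  # when the client sends the request
    hold_ms: int  # how long the server holds it before responding (assumed)


def in_order_response_times(requests):
    """Response time of each request on one connection served strictly FIFO."""
    free_at = 0  # time at which the connection can start the next request
    times = []
    for req in sorted(requests, key=lambda r: r.sent_ms):
        start = max(free_at, req.sent_ms)  # wait for earlier requests to finish
        free_at = start + req.hold_ms
        times.append((req.name, free_at))
    return times


timeline = in_order_response_times([
    Request("FetchRequest", sent_ms=300, hold_ms=20_000),  # long poll, no data
    Request("ProduceRequest", sent_ms=6_000, hold_ms=0),
])
for name, t in timeline:
    print(f"{t / 1000:5.1f}  response to {name}")
```

In this model the produce response cannot be sent before the fetch response,
even though the produce request itself needs no waiting.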

If what I describe is correct, then I see the following problems.
1. From the client's point of view, if it has a listener and a sender, then
sending data should produce an immediate event on the listener.

2. If I want a clean shutdown of my application within 5 sec, but my listeners
are configured with a 20 sec timeout, then I risk losing data. My
application's shutdown time now becomes at least as long as the listener's
polling time, which is unreasonable.

I do understand why it was implemented that way, probably because of the
specification saying
====
"The server guarantees that on a single TCP connection, requests will be
processed in the order they are sent and responses will return in that
order as well"
====
But to get the client to function properly, do I now need to allocate another
TCP connection for listeners?

Also, I'm aware of the new protocol proposal for the next version of Kafka.
Would this issue be resolved there?

Thanks,
Vadim.

-- 
From RFC 2631: In ASN.1, EXPLICIT tagging is implicit unless IMPLICIT is
explicitly specified

Re: Can Fetch Request cause delay of Produce Request?

Posted by Vadim Chekan <ko...@gmail.com>.
Thank you Neha,

Allocating a dedicated connection for the fetching client worked fine and did
not require any changes to the Kafka client library code.

Thinking about it, the server could perhaps process Fetch and Produce requests
in parallel while still returning results in order, as the spec requires. What
needs to be prevented is simultaneous processing of multiple producers or
multiple fetchers. I hope I'll have time to play with the server a little
bit.
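
A rough sketch of that idea, in Python rather than the broker's own code: the
handlers for one connection run concurrently, but the responses are collected
in request order. Everything here (`serve_connection`, `fetch`, `produce`,
the 5 second wait) is hypothetical and only illustrates the ordering; it does
not enforce the "no two producers or two fetchers at once" restriction.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def serve_connection(handlers):
    """Run handlers concurrently, but return results in request order."""
    with ThreadPoolExecutor(max_workers=max(1, len(handlers))) as pool:
        futures = [pool.submit(h) for h in handlers]
        # Waiting on the futures in submission order keeps responses ordered.
        return [f.result() for f in futures]


new_data = threading.Event()  # set when a produce has appended data
completed = []                # order in which handlers actually finish
lock = threading.Lock()


def fetch():
    # Long poll: wait up to 5 s (hypothetical value) for new data.
    got_data = new_data.wait(timeout=5.0)
    with lock:
        completed.append("fetch")
    return "FetchResponse(data)" if got_data else "FetchResponse(empty)"


def produce():
    with lock:
        completed.append("produce")
    new_data.set()  # wakes the waiting fetch
    return "ProduceResponse"


responses = serve_connection([fetch, produce])
print(completed)
print(responses)
```

Here the produce is handled while the fetch is still waiting, so the fetch
wakes up early, yet the fetch response is still first in the returned list.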

Vadim.

On Thu, Sep 25, 2014 at 12:13 PM, Neha Narkhede <ne...@gmail.com>
wrote:

> "The server guarantees that on a single TCP connection, requests will be
> processed in the order they are sent and responses will return in that
> order as well"
> ====
> But to achieve proper functioning of the client, I need to allocate another
> tcp connection for listeners now?
>
> Yes. The easiest way to send and receive is to use a producer client and a
> consumer client, each of which will have its own TCP connection to the
> server.

Re: Can Fetch Request cause delay of Produce Request?

Posted by Neha Narkhede <ne...@gmail.com>.
"The server guarantees that on a single TCP connection, requests will be
processed in the order they are sent and responses will return in that
order as well"
====
But to achieve proper functioning of the client, I need to allocate another
tcp connection for listeners now?

Yes. The easiest way to send and receive is to use a producer client and a
consumer client, each of which will have its own TCP connection to the
server.
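
To illustrate the difference, here is a toy model (not Kafka code) that
compares one shared connection with two separate ones. It assumes each
connection is served strictly in order and that a long poll is held for a
fixed time; in a real broker the fetch would likely return earlier once the
produced data arrives. The function name and the tuple layout are invented
for the example.

```python
from collections import defaultdict


def response_times_ms(requests):
    """requests: (connection, name, sent_ms, hold_ms) tuples.

    Each connection is served FIFO, independently of the others.
    """
    free_at = defaultdict(int)  # per connection: when it is next free
    result = {}
    for conn, name, sent_ms, hold_ms in sorted(requests, key=lambda r: r[2]):
        start = max(free_at[conn], sent_ms)
        free_at[conn] = start + hold_ms
        result[name] = free_at[conn]
    return result


shared = response_times_ms([
    ("conn-1", "FetchRequest", 300, 20_000),
    ("conn-1", "ProduceRequest", 6_000, 0),
])
split = response_times_ms([
    ("fetch-conn", "FetchRequest", 300, 20_000),
    ("produce-conn", "ProduceRequest", 6_000, 0),
])
print("shared:", shared)
print("split: ", split)
```

With the shared connection the produce response waits for the fetch; with
separate connections it is answered as soon as it is sent.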

Re: Can Fetch Request cause delay of Produce Request?

Posted by Jun Rao <ju...@gmail.com>.
Vadim,

I assume that you are using 0.8.1.1. In trunk, we fixed an issue
(KAFKA-1430) that can cause a fetch response to be unnecessarily delayed.
The fix will be included in the next 0.8.2 release.

Thanks,

Jun
