Posted to users@kafka.apache.org by graham sanderson <gr...@vast.com> on 2012/07/13 04:08:42 UTC

Guarantees

Hi, so I was about to demo a prototype built with Kafka in a borrowed large room, which I discovered had insufficient/flaky wireless. I was using the ZooKeeper config and getting lots of timeouts etc. Since this was the first time I had used Kafka and I hadn't done any off-path testing, my first course of action was to find a wired connection, which I did, and all the timeouts disappeared. The demo was great. Note that even with the flaky wireless, messages generally still seemed to be getting delivered, but not always as far as I could tell (or perhaps with high latency; I was more focused on having a working demo than debugging).

I'm using 0.7 atm, though I'm not sure if that matters.

My somewhat vague question is, given a simple scenario using Kafka/ZooKeeper (prior to all the exciting fault tolerance work going on right now):

1) Let's say I have a ZooKeeper server, Kafka server, producer, and consumer running on a perfect network, and I successfully send a message from producer to consumer.
2) All JVMs stay up, but I lose network connectivity between some or all of them for some time.
3) The network becomes perfect again.
4) I wait for some time for everyone to reconnect/renegotiate to the best of their ability.

Following that, should I expect a new message from the producer to reach the consumer, or can the system get into a broken state?… I swear I saw such a message not delivered, but I can't say for sure… I can certainly investigate further by trying to reproduce it again and wading through the many logged errors, but if someone already knows the answer that'd be awesome!

Thanks,

Graham.
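
One way to turn "I swear I saw such a message not delivered" into a definite answer is to have the producer put an increasing sequence number in each message and have the consumer look for gaps once the network is back. The sketch below is not Kafka code: SequenceGapChecker and its methods are hypothetical names, and it assumes the consumer can parse a sequence number out of each message it receives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of Kafka: tracks which sequence numbers
// the consumer has seen so that undelivered messages show up as gaps.
final class SequenceGapChecker {
    private final TreeSet<Long> seen = new TreeSet<>();

    /** Record the sequence number parsed from a consumed message. */
    public void record(long seq) {
        seen.add(seq);
    }

    /** Return every sequence number in [first, last] that was never recorded. */
    public List<Long> missing(long first, long last) {
        List<Long> gaps = new ArrayList<>();
        for (long s = first; s <= last; s++) {
            if (!seen.contains(s)) {
                gaps.add(s);
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        SequenceGapChecker checker = new SequenceGapChecker();
        // Suppose the producer sent 1..6 and the consumer only saw these four.
        for (long s : new long[] {1, 2, 5, 6}) {
            checker.record(s);
        }
        System.out.println(checker.missing(1, 6)); // prints [3, 4]
    }
}
```

Gaps that fall inside the outage window are expected losses. A gap in messages sent after step 4 would point to the "broken state" case asked about above.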


Re: Guarantees

Posted by Neha Narkhede <ne...@gmail.com>.
Once the network has recovered, you should see new messages.

Thanks,
Neha

On Fri, Jul 20, 2012 at 4:49 PM, graham sanderson <gr...@vast.com> wrote:

> Thanks Neha, I really meant, yes I may lose some messages in the
> meanwhile, but should I expect new messages after everything gets back to
> normal to be delivered (unless my code throws an exception and kills a
> worker thread, which it wasn't)

Re: Guarantees

Posted by graham sanderson <gr...@vast.com>.
Thanks Neha. What I really meant was: yes, I may lose some messages in the meantime, but should I expect new messages sent after everything gets back to normal to be delivered (unless my code throws an exception and kills a worker thread, which it didn't)?

On Jul 20, 2012, at 6:00 PM, Neha Narkhede wrote:

> Graham,
> 
> It really depends on what sort of network outage. The producer, whether zk
> or not, can be configured to retry couple of times. If it runs out of
> retries during this outage, it will drop the messages and they will be
> lost.
> 
> Thanks,
> Neha

Re: Guarantees

Posted by Neha Narkhede <ne...@gmail.com>.
Graham,

It really depends on what sort of network outage it is. The producer,
whether ZK-based or not, can be configured to retry a couple of times. If it
runs out of retries during the outage, it will drop the messages and they
will be lost.

Thanks,
Neha
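
The behavior described in this reply (retry a fixed number of times, then drop) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Kafka producer: BoundedRetrySender, the transport predicate, and maxRetries are hypothetical names, and the actual retry settings in 0.7 are not given in this thread. A real sender would normally also pause between attempts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sender, not the Kafka API: tries each message a bounded
// number of times and drops it if every attempt fails.
final class BoundedRetrySender {
    // Stand-in for the network: returns true if the message was accepted.
    private final Predicate<String> transport;
    private final int maxRetries;
    private final List<String> dropped = new ArrayList<>();

    public BoundedRetrySender(Predicate<String> transport, int maxRetries) {
        this.transport = transport;
        this.maxRetries = maxRetries;
    }

    /** One initial attempt plus up to maxRetries retries; false means dropped. */
    public boolean send(String message) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (transport.test(message)) {
                return true;
            }
        }
        dropped.add(message); // retries exhausted: the message is lost
        return false;
    }

    public List<String> dropped() {
        return dropped;
    }

    public static void main(String[] args) {
        // Simulated outage: the first 3 attempts fail, then the link recovers.
        int[] attempts = {0};
        BoundedRetrySender sender =
            new BoundedRetrySender(m -> ++attempts[0] > 3, 2);
        System.out.println(sender.send("during-outage")); // prints false
        System.out.println(sender.send("after-recovery")); // prints true
        System.out.println(sender.dropped()); // prints [during-outage]
    }
}
```

In this model the message sent during the outage is lost once its retries run out, while a message sent after recovery goes through on its first attempt.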

On Sat, Jul 14, 2012 at 11:32 AM, Jun Rao <ju...@gmail.com> wrote:

> The pipeline is supposed to recover from the network outage. There could be
> bugs, especially in the ZK-based producer since it's relatively new.
>
> Thanks,
>
> Jun

Re: Guarantees

Posted by Jun Rao <ju...@gmail.com>.
The pipeline is supposed to recover from the network outage. There could be
bugs, especially in the ZK-based producer since it's relatively new.

Thanks,

Jun
