
Posted to users@kafka.apache.org by Karts <ka...@gmail.com> on 2015/02/19 20:15:43 UTC

data corruption like behavior

I have noticed some strange patterns when testing with the 0.8.1 and
0.8.2 builds. They are listed below.
1. I set up a brand new cluster [3 kafka nodes with 3 zookeepers] and
created 2 topics via the API calls. Everything went fine and I was
able to view my messages in my consumers. No messages were lost. All is
happy. Then I changed my setup to just have 1 zookeeper and ran my test
again, and I lose some messages. I have checked that all my configs are
pointing to just 1 zookeeper and that there is no mention of the other 2
offline zookeepers. Any idea why?
2. I reverted my settings to the original config. All 3 nodes are
online with no errors, but when I send messages to the same old topic I am
still losing some messages. I then deleted all the old topic files [to
follow the 'cleanup' process] and created a new topic, and I am able to
receive all messages. No loss whatsoever.
3. Now, in this state, I upgraded to 0.8.2 and tried sending messages to
the topic that was made after the above cleanup, and I am losing messages
again.

Am I making sense? This is very strange behavior. Can anyone comment on
this [please correct me if I have done something 'very' wrong]?

Thanks.

Re: data corruption like behavior

Posted by Karts <ka...@gmail.com>.

Actually, I take that back. It reads from where the last offset left off.
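
Resuming from the last committed offset can look like message loss when a
test expects every message to be redelivered. Below is a minimal sketch of
that behavior, assuming the 0.8 high-level consumer semantics: a committed
offset for the group wins, and otherwise auto.offset.reset picks "smallest"
or "largest" (the default). The function name start_position is
hypothetical, and this is a toy model, not Kafka client code.

```python
from typing import Optional


def start_position(log_end_offset: int,
                   committed_offset: Optional[int],
                   auto_offset_reset: str = "largest") -> int:
    """Toy model of where a consumer group begins reading a partition.

    committed_offset is the offset stored for the group (None if the
    group has never committed). auto_offset_reset only matters when
    there is no committed offset.
    """
    if committed_offset is not None:
        # An existing group resumes where it left off, so earlier
        # messages are not delivered again.
        return committed_offset
    if auto_offset_reset == "smallest":
        # A new group configured to start at the beginning of the log.
        return 0
    # A new group with the default setting starts at the end of the log
    # and only sees messages produced after it connects.
    return log_end_offset


log = ["m0", "m1", "m2", "m3", "m4"]

# Existing group that already consumed m0..m2.
print(log[start_position(len(log), committed_offset=3):])

# Fresh group reading from the beginning.
print(log[start_position(len(log), None, "smallest"):])

# Fresh group with the default reset policy.
print(log[start_position(len(log), None):])
```

In this model, only a group with no committed offset and
auto.offset.reset=smallest sees the whole log, which is one reason to
retest with a new group id before concluding that messages were lost.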

On Thu, Feb 19, 2015 at 4:20 PM, Karts <ka...@gmail.com> wrote:

> Yes, I did.
>
> On Thu, Feb 19, 2015 at 2:42 PM, Jun Rao <ju...@confluent.io> wrote:
>
>> Did you consume the messages from the beginning of the log?
>>
>> Thanks,
>>
>> Jun

Re: data corruption like behavior

Posted by Karts <ka...@gmail.com>.

[2015-02-05 14:21:09,708] ERROR [ReplicaFetcherThread-2-1], Error in fetch
Name: FetchRequest; Version: 0; CorrelationId: 147301; ClientId:
ReplicaFetcherThread-2-1; ReplicaId: 3; MaxWait: 500 ms; MinBytes: 1 bytes;
RequestInfo: [site.db.people,6] ->
PartitionFetchInfo(0,1048576),[site.db.main,4] ->
PartitionFetchInfo(0,1048576),[site.db.school,7] ->
PartitionFetchInfo(0,1048576),[site.db.people,2] ->
PartitionFetchInfo(0,1048576),[k3.hydra,6] ->
PartitionFetchInfo(3,1048576),[site.db.school,3] ->
PartitionFetchInfo(0,1048576),[site.db.main,0] ->
PartitionFetchInfo(0,1048576),[site.db.cmphotos,2] ->
PartitionFetchInfo(2245,1048576),[site.db.cmphotos,6] ->
PartitionFetchInfo(2220,1048576) (kafka.server.ReplicaFetcherThread)
java.net.ConnectException: Connection refused

These were some of the errors from the server log. I didn't find any on the
producer side of things.
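
To check whether the failing fetch covers a particular set of partitions,
the RequestInfo part of the log entry can be split into its fields. This is
a rough sketch that assumes the format shown in the entry above, with each
PartitionFetchInfo holding (fetch offset, fetch size in bytes). The regular
expression and the function name parse_request_info are my own, not
anything shipped with Kafka.

```python
import re

# Matches entries such as "[site.db.people,6] -> PartitionFetchInfo(0,1048576)".
FETCH_ENTRY = re.compile(
    r"\[([^\[\],]+),(\d+)\]\s*->\s*PartitionFetchInfo\((\d+),(\d+)\)"
)


def parse_request_info(log_text):
    """Return (topic, partition, fetch_offset, fetch_size) tuples."""
    # Mail clients hard-wrap the log entry, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [
        (topic, int(partition), int(offset), int(size))
        for topic, partition, offset, size in FETCH_ENTRY.findall(flat)
    ]


# A shortened excerpt of the entry quoted above.
sample = """RequestInfo: [site.db.people,6] ->
PartitionFetchInfo(0,1048576),[k3.hydra,6] ->
PartitionFetchInfo(3,1048576),[site.db.cmphotos,2] ->
PartitionFetchInfo(2245,1048576) (kafka.server.ReplicaFetcherThread)"""

for topic, partition, offset, size in parse_request_info(sample):
    print(f"{topic}-{partition}: fetch offset {offset}, max {size} bytes")
```

Listing the partitions this way makes it easier to compare them against the
topics where messages appear to be missing.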

On Thu, Feb 19, 2015 at 4:30 PM, Jun Rao <ju...@confluent.io> wrote:

> Is there any error in the producer log? Is there any pattern in the
> messages being lost?
>
> Thanks,
>
> Jun

Re: data corruption like behavior

Posted by Jun Rao <ju...@confluent.io>.

Is there any error in the producer log? Is there any pattern in the
messages being lost?

Thanks,

Jun

On Thu, Feb 19, 2015 at 4:20 PM, Karts <ka...@gmail.com> wrote:

> Yes, I did.

Re: data corruption like behavior

Posted by Karts <ka...@gmail.com>.

Yes, I did.

On Thu, Feb 19, 2015 at 2:42 PM, Jun Rao <ju...@confluent.io> wrote:

> Did you consume the messages from the beginning of the log?
>
> Thanks,
>
> Jun

Re: data corruption like behavior

Posted by Jun Rao <ju...@confluent.io>.

Did you consume the messages from the beginning of the log?

Thanks,

Jun

On Thu, Feb 19, 2015 at 12:18 PM, Karts <ka...@gmail.com> wrote:

> But they have always been up. When I was testing, all the zookeepers and
> all the kafka nodes were up. It's just that I changed the number of
> zookeeper nodes in my first test iteration; the second and third were
> still the same. Not sure why the topics were losing some messages.

Re: data corruption like behavior

Posted by Karts <ka...@gmail.com>.

But they have always been up. When I was testing, all the zookeepers and
all the kafka nodes were up. It's just that I changed the number of
zookeeper nodes in my first test iteration; the second and third were
still the same. Not sure why the topics were losing some messages.

On Thu, Feb 19, 2015 at 11:39 AM, Jun Rao <ju...@confluent.io> wrote:

> Zookeeper requires a majority of the nodes to be up for the service to be
> available. Kafka relies on Zookeeper to be always available.
>
> Thanks,
>
> Jun

Re: data corruption like behavior

Posted by Jun Rao <ju...@confluent.io>.

Zookeeper requires a majority of the nodes to be up for the service to be
available. Kafka relies on Zookeeper to be always available.

Thanks,

Jun
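
The majority rule mentioned above comes down to simple arithmetic. Here is
a small sketch of it, assuming a plain strict-majority quorum; the function
names quorum_size and is_available are hypothetical and this is not
ZooKeeper code.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def is_available(ensemble_size: int, servers_up: int) -> bool:
    """True if enough configured servers are running to form a quorum."""
    return servers_up >= quorum_size(ensemble_size)


for n in (1, 3, 5):
    print(f"ensemble of {n}: quorum {quorum_size(n)}, "
          f"tolerates {n - quorum_size(n)} failure(s)")

# An ensemble configured with 3 servers but with only 1 running has no quorum.
print(is_available(ensemble_size=3, servers_up=1))

# An ensemble configured as a single server is available while that server is up.
print(is_available(ensemble_size=1, servers_up=1))
```

The distinction matters for the first test: what counts is the number of
servers the ensemble is configured with, not just how many happen to be
running.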