You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Jun Rao <ju...@gmail.com> on 2013/11/04 01:34:13 UTC

Review Request 15201: tool for checking the consistency among replicas

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/
-----------------------------------------------------------

Review request for kafka.


Bugs: KAFKA-1117
    https://issues.apache.org/jira/browse/KAFKA-1117


Repository: kafka


Description
-------

kafka-1117


Diffs
-----

  core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 

Diff: https://reviews.apache.org/r/15201/diff/


Testing
-------


Thanks,

Jun Rao


Re: Review Request 15201: address previous review comments

Posted by Jun Rao <ju...@gmail.com>.

> On Nov. 5, 2013, 10:41 p.m., Guozhang Wang wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 73
> > <https://reviews.apache.org/r/15201/diff/1/?file=377064#file377064line73>
> >
> >     We do not need () after withRequiredArg here.

done


> On Nov. 5, 2013, 10:41 p.m., Guozhang Wang wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 115
> > <https://reviews.apache.org/r/15201/diff/1/?file=377064#file377064line115>
> >
> >     Ditto as above for these three.

done


> On Nov. 5, 2013, 10:41 p.m., Guozhang Wang wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 120
> > <https://reviews.apache.org/r/15201/diff/1/?file=377064#file377064line120>
> >
> >     Why do we ask for all topics from brokers then filter them? Is this because the start-up brokers may not have metadata info for all the topics yet?

This is because the metadata api doesn't support get topics matching a regular expression.


> On Nov. 5, 2013, 10:41 p.m., Guozhang Wang wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 274
> > <https://reviews.apache.org/r/15201/diff/1/?file=377064#file377064line274>
> >
> >     In think in this case the verification would stop and probably report the max lag for this topic-partition. This is also fine I think.

For an offset to move, all replicas must have received the message on that offset. Otherwise, the tool just keeps retrying.


> On Nov. 5, 2013, 10:41 p.m., Guozhang Wang wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 361
> > <https://reviews.apache.org/r/15201/diff/1/?file=377064#file377064line361>
> >
> >     createNewVerificationBarrier here will set the count to 1, and then count it down to 0. So the next round the verification barrier count would be 1 and hence not holding other threads?

Not sure what your question is. Each fetch thread does the following in a loop:

1. fetch from the broker
2. wait for all threads to finish fetching
3. wait for verification to complete (only one threads does the verification)

The verification barrier is used so that all fetcher threads are in the right state in step 1.


- Jun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28162
-----------------------------------------------------------


On Nov. 11, 2013, 4:44 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 11, 2013, 4:44 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: tool for checking the consistency among replicas

Posted by Guozhang Wang <gu...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28162
-----------------------------------------------------------



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54778>

    We do not need () after withRequiredArg here.



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54780>

    Ditto as above for these three.



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54785>

    Why do we ask for all topics from brokers then filter them? Is this because the start-up brokers may not have metadata info for all the topics yet?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54945>

    In think in this case the verification would stop and probably report the max lag for this topic-partition. This is also fine I think.



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54865>

    createNewVerificationBarrier here will set the count to 1, and then count it down to 0. So the next round the verification barrier count would be 1 and hence not holding other threads?


- Guozhang Wang


On Nov. 4, 2013, 12:34 a.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 4, 2013, 12:34 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address previous review comments

Posted by Jun Rao <ju...@gmail.com>.

> On Nov. 5, 2013, 6:27 p.m., Neha Narkhede wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 66
> > <https://reviews.apache.org/r/15201/diff/1/?file=377064#file377064line66>
> >
> >     Default max.message.size is (1MB + header). Default fetch size should be a little larger than 1MB. How about 1.1 MB?

The fetch size is 1024 * 1024 (the default), which is larger than the default message size 1000000.


> On Nov. 5, 2013, 6:27 p.m., Neha Narkhede wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 145
> > <https://reviews.apache.org/r/15201/diff/1/?file=377064#file377064line145>
> >
> >     This can throw an exception if the leader doesn't exist
> >     
> >     Replicas that don't have a leader when this tool is run cannot set the initial offset. Should we skip such partitions?

The tool assumes all leaders are available when the tool is started. Since this tool is mostly used for testing and we want to verify all replicas, I think this is a reasonable assumption. Once the tool is started, leaders can change or become unavailable.


> On Nov. 5, 2013, 6:27 p.m., Neha Narkhede wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 245
> > <https://reviews.apache.org/r/15201/diff/1/?file=377064#file377064line245>
> >
> >     It will be useful to know the fetch offset of the chunk that is undergoing verification.

There is another debug logging that logs the offset.


> On Nov. 5, 2013, 6:27 p.m., Neha Narkhede wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 247
> > <https://reviews.apache.org/r/15201/diff/1/?file=377064#file377064line247>
> >
> >     map{ -> map {

Not sure this is the standard. We do map(, instead of map (. So, it seems that we should do map{, instead of map {?


> On Nov. 5, 2013, 6:27 p.m., Neha Narkhede wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 274
> > <https://reviews.apache.org/r/15201/diff/1/?file=377064#file377064line274>
> >
> >     What happens if one replica returns empty message set because it is not caught up and other replicas return some valid messages? Then ideally the fetch offset should not be updated since we need to repeat the verification for that fetch offset

For an offset to move, all replicas must have received the message on that offset. Otherwise, the tool just keeps retrying.


- Jun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28186
-----------------------------------------------------------


On Nov. 11, 2013, 4:44 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 11, 2013, 4:44 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address previous review comments

Posted by Neha Narkhede <ne...@gmail.com>.

> On Nov. 5, 2013, 6:27 p.m., Neha Narkhede wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 247
> > <https://reviews.apache.org/r/15201/diff/1/?file=377064#file377064line247>
> >
> >     map{ -> map {
> 
> Jun Rao wrote:
>     Not sure this is the standard. We do map(, instead of map (. So, it seems that we should do map{, instead of map {?

So far, we have followed op<space><parantheses>. Let's change it everywhere or follow it for consistency.


- Neha


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28186
-----------------------------------------------------------


On Nov. 11, 2013, 4:44 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 11, 2013, 4:44 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: tool for checking the consistency among replicas

Posted by Neha Narkhede <ne...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28186
-----------------------------------------------------------



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54838>

    Default max.message.size is (1MB + header). Default fetch size should be a little larger than 1MB. How about 1.1 MB?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54840>

    This can throw an exception if the leader doesn't exist
    
    Replicas that don't have a leader when this tool is run cannot set the initial offset. Should we skip such partitions?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54847>

    It will be useful to know the fetch offset of the chunk that is undergoing verification. 



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54853>

    map{ -> map {



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54856>

    It will be good to include a relevant message with every assert



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54862>

    Let's add a comment explaining what this loop does. 



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54854>

    This is not a messageSet but a message. Can we rename this to messageInfoFromFirstReplica ?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54861>

    What happens if one replica returns empty message set because it is not caught up and other replicas return some valid messages? Then ideally the fetch offset should not be updated since we need to repeat the verification for that fetch offset



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment54858>

    Let's rename isDone to something meaningful. It is a little confusing that we update the fetch offset even when isDone is false.


- Neha Narkhede


On Nov. 4, 2013, 12:34 a.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 4, 2013, 12:34 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address previous review comments

Posted by Manoj Kunar <ma...@oracle.com>.
Hi Administrators,

Can you help me get out of this email-dlist? I do not see any URL to 
unsubscribe.

thanks
manoj


On 11/11/2013 11:06 AM, Neha Narkhede wrote:
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/#review28665
> -----------------------------------------------------------
>
>
>
> core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
> <https://reviews.apache.org/r/15201/#comment55581>
>
>      This is pretty difficult to read. What we've learnt in the past is that error codes are not useful alone and should be converted to the string error name. Also, it is useful to print the partition -> ErrorCodeString only for partitions that have a non zero error code
>
>
>
> core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
> <https://reviews.apache.org/r/15201/#comment55583>
>
>      expected ?
>
>
> - Neha Narkhede
>
>
> On Nov. 11, 2013, 4:44 p.m., Jun Rao wrote:
>> -----------------------------------------------------------
>> This is an automatically generated e-mail. To reply, visit:
>> https://reviews.apache.org/r/15201/
>> -----------------------------------------------------------
>>
>> (Updated Nov. 11, 2013, 4:44 p.m.)
>>
>>
>> Review request for kafka.
>>
>>
>> Bugs: KAFKA-1117
>>      https://issues.apache.org/jira/browse/KAFKA-1117
>>
>>
>> Repository: kafka
>>
>>
>> Description
>> -------
>>
>> kafka-1117; fix 2
>>
>>
>> kafka-1117; fix 1
>>
>>
>> kafka-1117
>>
>>
>> Diffs
>> -----
>>
>>    core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION
>>
>> Diff: https://reviews.apache.org/r/15201/diff/
>>
>>
>> Testing
>> -------
>>
>>
>> Thanks,
>>
>> Jun Rao
>>
>>
>


Re: Review Request 15201: address previous review comments

Posted by Neha Narkhede <ne...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28665
-----------------------------------------------------------



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55581>

    This is pretty difficult to read. What we've learnt in the past is that error codes are not useful alone and should be converted to the string error name. Also, it is useful to print the partition -> ErrorCodeString only for partitions that have a non zero error code



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55583>

    expected ?


- Neha Narkhede


On Nov. 11, 2013, 4:44 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 11, 2013, 4:44 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address previous review comments

Posted by Guozhang Wang <gu...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28715
-----------------------------------------------------------

Ship it!


Ship It!

- Guozhang Wang


On Nov. 11, 2013, 4:44 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 11, 2013, 4:44 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address more review comments

Posted by Jun Rao <ju...@gmail.com>.

> On Nov. 13, 2013, 7:43 a.m., Swapnil Ghike wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 70
> > <https://reviews.apache.org/r/15201/diff/3/?file=382691#file382691line70>
> >
> >     fetchsize -> fetch-size for consistency with other options?

changed.


> On Nov. 13, 2013, 7:43 a.m., Swapnil Ghike wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 74
> > <https://reviews.apache.org/r/15201/diff/3/?file=382691#file382691line74>
> >
> >     We can probably change this to ConsumerConfig.FetchSize. Anytime we change the max message size on the broker, we will probably change default fetch size on consumer, so that can serve as the source of truth for this tool as well.

changed.


> On Nov. 13, 2013, 7:43 a.m., Swapnil Ghike wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, lines 110-111
> > <https://reviews.apache.org/r/15201/diff/3/?file=382691#file382691line110>
> >
> >     Should we just do .replaceAll("""["']""", "") instead?

Not long used.


> On Nov. 13, 2013, 7:43 a.m., Swapnil Ghike wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, lines 165-181
> > <https://reviews.apache.org/r/15201/diff/3/?file=382691#file382691line165>
> >
> >     val fetcherThreads = for ((i, element) <- topicAndPartitionsPerBroker.view.zipWithIndex) yield new ReplicaFetcher(...., doVerification = if (i == 0) true else false) 
> >     
> >     to avoid variable = true and then variable = false?

Changed, in a different way.


> On Nov. 13, 2013, 7:43 a.m., Swapnil Ghike wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, lines 242-244
> > <https://reviews.apache.org/r/15201/diff/3/?file=382691#file382691line242>
> >
> >     Minor comment: Can we rename this to initialOffsetMap (we use the offsetMap name in ReplicaFetchThread), I got confused on the first glace..

done.


> On Nov. 13, 2013, 7:43 a.m., Swapnil Ghike wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 272
> > <https://reviews.apache.org/r/15201/diff/3/?file=382691#file382691line272>
> >
> >     I thought this loop is supposed to go through all the messages that can be returned by the messageIterator, but looks like it will check for only the first message obtained via each messageIterator.

This loop is just iterating one message from every replica. The outer loop iterates through all messages.


> On Nov. 13, 2013, 7:43 a.m., Swapnil Ghike wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, lines 291-295
> > <https://reviews.apache.org/r/15201/diff/3/?file=382691#file382691line291>
> >
> >     Wondering why we don't exit when checksums don't match?

It's probably better to report all occurrences of mismatched checksums.


> On Nov. 13, 2013, 7:43 a.m., Swapnil Ghike wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 376
> > <https://reviews.apache.org/r/15201/diff/3/?file=382691#file382691line376>
> >
> >     typo

changed.


> On Nov. 13, 2013, 7:43 a.m., Swapnil Ghike wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 385
> > <https://reviews.apache.org/r/15201/diff/3/?file=382691#file382691line385>
> >
> >     typo

changed.


On Nov. 13, 2013, 7:43 a.m., Jun Rao wrote:
> > Another caveat seems to be that the tool cannot handle changes in 1. partition leadership change 2. topic configuration change (number of partitions).

The tool doesn't care about leader changes since it has to read from every replica. It won't pick up newly created partitions or partitions moved by the reassignment tool. For those cases, we can just restart the tool.


- Jun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28781
-----------------------------------------------------------


On Nov. 12, 2013, 4:34 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 12, 2013, 4:34 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 3
> 
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   config/tools-log4j.properties 79240490149835656e2a013a9702c5aa41c104f1 
>   core/src/main/scala/kafka/api/OffsetResponse.scala 08dc3cd3d166efba6b2b43f6e148f636b175affe 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address more review comments

Posted by Swapnil Ghike <sg...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28781
-----------------------------------------------------------



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55792>

    fetchsize -> fetch-size for consistency with other options?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55793>

    We can probably change this to ConsumerConfig.FetchSize. Anytime we change the max message size on the broker, we will probably change default fetch size on consumer, so that can serve as the source of truth for this tool as well.



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55794>

    Should we just do .replaceAll("""["']""", "") instead?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55799>

    val fetcherThreads = for ((i, element) <- topicAndPartitionsPerBroker.view.zipWithIndex) yield new ReplicaFetcher(...., doVerification = if (i == 0) true else false) 
    
    to avoid variable = true and then variable = false? 



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55807>

    Minor comment: Can we rename this to initialOffsetMap (we use the offsetMap name in ReplicaFetchThread), I got confused on the first glace..



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55805>

    I thought this loop is supposed to go through all the messages that can be returned by the messageIterator, but looks like it will check for only the first message obtained via each messageIterator.



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55806>

    Wondering why we don't exit when checksums don't match?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55808>

    typo



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55809>

    typo


Another caveat seems to be that the tool cannot handle changes in 1. partition leadership change 2. topic configuration change (number of partitions).

- Swapnil Ghike


On Nov. 12, 2013, 4:34 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 12, 2013, 4:34 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 3
> 
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   config/tools-log4j.properties 79240490149835656e2a013a9702c5aa41c104f1 
>   core/src/main/scala/kafka/api/OffsetResponse.scala 08dc3cd3d166efba6b2b43f6e148f636b175affe 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address more review comments

Posted by Jun Rao <ju...@gmail.com>.

> On Nov. 13, 2013, 2:09 a.m., Joel Koshy wrote:
> > Looks good overall - this is definitely a useful tool to have. I'm posting comments so far but I will come back to it tomorrow to finish up.
> > 
> > So what would be the recommended procedure when the tool reports an inconsistency? Would operations have to go in, stop the brokers and truncate the log segments to the last segments that are verified to be consistent? If so, it would be convenient if the tool could issue another OffsetRequest to report the offset boundaries of the last consistent segments so you don't have to work too hard to determine the segment file.
> > 
> > Also, unless I'm reading it incorrect additional caveats are that the tool can only keep working so long as: partitions are not being added (or reassigned); logs are not being compacted.
> >

The tool is meant for detecting inconsistent replicas during testing so that we can fix the bugs. Fixing inconsistent replicas in operation will be hard and will require manual work.

The limitation of the tools is that it won't pick up newly created partitions or partitions moved by the reassignment tool. For those cases, we can just restart the tool.


> On Nov. 13, 2013, 2:09 a.m., Joel Koshy wrote:
> > config/tools-log4j.properties, line 16
> > <https://reviews.apache.org/r/15201/diff/3/?file=382689#file382689line16>
> >
> >     Is this needed?

removed.


> On Nov. 13, 2013, 2:09 a.m., Joel Koshy wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 98
> > <https://reviews.apache.org/r/15201/diff/3/?file=382691#file382691line98>
> >
> >     This is becoming my most frequent review comment :)
> >     Can you use CommandLineUtils.checkRequiredArgs?

done.


> On Nov. 13, 2013, 2:09 a.m., Joel Koshy wrote:
> > core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala, line 106
> > <https://reviews.apache.org/r/15201/diff/3/?file=382691#file382691line106>
> >
> >     Any reason to not use kafka.consumer.Whitelist and the isTopicAllowed API that it provides?

Yes, it can be reused.


- Jun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28744
-----------------------------------------------------------


On Nov. 12, 2013, 4:34 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 12, 2013, 4:34 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 3
> 
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   config/tools-log4j.properties 79240490149835656e2a013a9702c5aa41c104f1 
>   core/src/main/scala/kafka/api/OffsetResponse.scala 08dc3cd3d166efba6b2b43f6e148f636b175affe 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address more review comments

Posted by Joel Koshy <jj...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28744
-----------------------------------------------------------


Looks good overall - this is definitely a useful tool to have. I'm posting comments so far but I will come back to it tomorrow to finish up.

So what would be the recommended procedure when the tool reports an inconsistency? Would operations have to go in, stop the brokers and truncate the log segments to the last segments that are verified to be consistent? If so, it would be convenient if the tool could issue another OffsetRequest to report the offset boundaries of the last consistent segments so you don't have to work too hard to determine the segment file.

Also, unless I'm reading it incorrect additional caveats are that the tool can only keep working so long as: partitions are not being added (or reassigned); logs are not being compacted.



config/tools-log4j.properties
<https://reviews.apache.org/r/15201/#comment55727>

    Is this needed?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55733>

    This is becoming my most frequent review comment :)
    Can you use CommandLineUtils.checkRequiredArgs?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment55734>

    Any reason to not use kafka.consumer.Whitelist and the isTopicAllowed API that it provides?


- Joel Koshy


On Nov. 12, 2013, 4:34 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 12, 2013, 4:34 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 3
> 
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   config/tools-log4j.properties 79240490149835656e2a013a9702c5aa41c104f1 
>   core/src/main/scala/kafka/api/OffsetResponse.scala 08dc3cd3d166efba6b2b43f6e148f636b175affe 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address all review comments

Posted by Swapnil Ghike <sg...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28873
-----------------------------------------------------------

Ship it!


Ship It!

- Swapnil Ghike


On Nov. 14, 2013, 4:24 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 14, 2013, 4:24 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 4
> 
> 
> kafka-1117; fix 3
> 
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/api/OffsetResponse.scala 08dc3cd3d166efba6b2b43f6e148f636b175affe 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address Neha's review comments

Posted by Neha Narkhede <ne...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review29067
-----------------------------------------------------------

Ship it!


Ship It!

- Neha Narkhede


On Nov. 18, 2013, 5:57 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 18, 2013, 5:57 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 5
> 
> 
> kafka-1117; fix 4
> 
> 
> kafka-1117; fix 3
> 
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/api/OffsetResponse.scala 08dc3cd3d166efba6b2b43f6e148f636b175affe 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address Neha's review comments

Posted by Jun Rao <ju...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/
-----------------------------------------------------------

(Updated Nov. 18, 2013, 5:57 p.m.)


Review request for kafka.


Summary (updated)
-----------------

address Neha's review comments


Bugs: KAFKA-1117
    https://issues.apache.org/jira/browse/KAFKA-1117


Repository: kafka


Description (updated)
-------

kafka-1117; fix 5


kafka-1117; fix 4


kafka-1117; fix 3


kafka-1117; fix 2


kafka-1117; fix 1


kafka-1117


Diffs (updated)
-----

  core/src/main/scala/kafka/api/OffsetResponse.scala 08dc3cd3d166efba6b2b43f6e148f636b175affe 
  core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 

Diff: https://reviews.apache.org/r/15201/diff/


Testing
-------


Thanks,

Jun Rao


Re: Review Request 15201: address all review comments

Posted by Neha Narkhede <ne...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review29023
-----------------------------------------------------------



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment56101>

    PartitionAndReplica instead of (String,Int,Int)



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment56104>

    It seems like the message iterator can have multiple messages, but we verify only the first message per replica?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment56102>

    It seems somewhat awkward that internal data structure of the replica buffer has leaked into the replica fetcher. How about adding an API to put the message set for a TopicAndPartition into the ReplicaBuffer?



core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala
<https://reviews.apache.org/r/15201/#comment56103>

    here is where we could use the ReplicaBuffer's API instead of accessing the message set cache directly


- Neha Narkhede


On Nov. 14, 2013, 4:24 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 14, 2013, 4:24 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 4
> 
> 
> kafka-1117; fix 3
> 
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/api/OffsetResponse.scala 08dc3cd3d166efba6b2b43f6e148f636b175affe 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address all review comments

Posted by Guozhang Wang <gu...@linkedin.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/#review28872
-----------------------------------------------------------

Ship it!


Ship It!

- Guozhang Wang


On Nov. 14, 2013, 4:24 p.m., Jun Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15201/
> -----------------------------------------------------------
> 
> (Updated Nov. 14, 2013, 4:24 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1117
>     https://issues.apache.org/jira/browse/KAFKA-1117
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> kafka-1117; fix 4
> 
> 
> kafka-1117; fix 3
> 
> 
> kafka-1117; fix 2
> 
> 
> kafka-1117; fix 1
> 
> 
> kafka-1117
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/api/OffsetResponse.scala 08dc3cd3d166efba6b2b43f6e148f636b175affe 
>   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15201/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jun Rao
> 
>


Re: Review Request 15201: address all review comments

Posted by Jun Rao <ju...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/
-----------------------------------------------------------

(Updated Nov. 14, 2013, 4:24 p.m.)


Review request for kafka.


Summary (updated)
-----------------

address all review comments


Bugs: KAFKA-1117
    https://issues.apache.org/jira/browse/KAFKA-1117


Repository: kafka


Description (updated)
-------

kafka-1117; fix 4


kafka-1117; fix 3


kafka-1117; fix 2


kafka-1117; fix 1


kafka-1117


Diffs (updated)
-----

  core/src/main/scala/kafka/api/OffsetResponse.scala 08dc3cd3d166efba6b2b43f6e148f636b175affe 
  core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 

Diff: https://reviews.apache.org/r/15201/diff/


Testing
-------


Thanks,

Jun Rao


Re: Review Request 15201: address more review comments

Posted by Jun Rao <ju...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/
-----------------------------------------------------------

(Updated Nov. 12, 2013, 4:34 p.m.)


Review request for kafka.


Summary (updated)
-----------------

address more review comments


Bugs: KAFKA-1117
    https://issues.apache.org/jira/browse/KAFKA-1117


Repository: kafka


Description (updated)
-------

kafka-1117; fix 3


kafka-1117; fix 2


kafka-1117; fix 1


kafka-1117


Diffs (updated)
-----

  config/tools-log4j.properties 79240490149835656e2a013a9702c5aa41c104f1 
  core/src/main/scala/kafka/api/OffsetResponse.scala 08dc3cd3d166efba6b2b43f6e148f636b175affe 
  core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 

Diff: https://reviews.apache.org/r/15201/diff/


Testing
-------


Thanks,

Jun Rao


Re: Review Request 15201: address previous review comments

Posted by Jun Rao <ju...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15201/
-----------------------------------------------------------

(Updated Nov. 11, 2013, 4:44 p.m.)


Review request for kafka.


Summary (updated)
-----------------

address previous review comments


Bugs: KAFKA-1117
    https://issues.apache.org/jira/browse/KAFKA-1117


Repository: kafka


Description (updated)
-------

kafka-1117; fix 2


kafka-1117; fix 1


kafka-1117


Diffs (updated)
-----

  core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala PRE-CREATION 

Diff: https://reviews.apache.org/r/15201/diff/


Testing
-------


Thanks,

Jun Rao