You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "Neha Narkhede (JIRA)" <ji...@apache.org> on 2012/05/26 04:27:22 UTC

[jira] [Created] (KAFKA-351) Refactor some new components introduced for replication

Neha Narkhede created KAFKA-351:
-----------------------------------

             Summary: Refactor some new components introduced for replication 
                 Key: KAFKA-351
                 URL: https://issues.apache.org/jira/browse/KAFKA-351
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.8
            Reporter: Neha Narkhede


Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-351:
--------------------------

    Attachment: kafka-351_v2.patch

Attach patch v2 after rebase. Made one more change:
N. Make logEndOffset and highWaterMark in replica atomic long. This is because java doesn't guarantee consistency of long value if not synchronized. 
                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch, kafka-351_v2.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-351:
--------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Thanks for the review. Addressed the issues in the last review and committed to 0.8.

Create kafka-476 to track using Option in Pool.

For priviate[this] var, it restricts the usage of a member field to only this instance of the class. This way, one is always forced to use the public api to access the member field in other instances of the class. 

Yes, Intellij seems to have an issue finding references of x_=(), which is inconvenient. Not sure if it has been addressed in a new version.
                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch, kafka-351_v2.patch, kafka-351_v3.patch, kafka-351_v4.patch, kafka-351_v5.patch, kafka-351_v6.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422562#comment-13422562 ] 

Neha Narkhede commented on KAFKA-351:
-------------------------------------

>From KAFKA-405 -

ReplicaManager:

This class has a very odd public interface. Instead of managing replicas it has a bunch of passive calls--addLocalReplica(), addRemoteReplica(), etc. Who calls these? KafkaServer seems to have its own wrapper for these. Then it passes these methods on as arguments to KafkaZookeeper. Does this make sense? I think it would be good if ReplicaManager handled replica management, even if that means it depends on zookeeper.

                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437994#comment-13437994 ] 

Neha Narkhede commented on KAFKA-351:
-------------------------------------

Thanks for incorporating review suggestions!

2.3 Agreed that it is a bit of work to get meaningful error codes in the client. Often this is ignored, but client contract should be very well thought out and easy to understand. It is best if we give the most descriptive error code, but if we feel it takes significant amount of work, we can start with a simple solution. We went through a pretty detailed review of the new request formats, but not error codes. It will be good to go through this before the release.

5.1 That change is correct. However, in Replica.scala, highWatermarkValue and logEndOffsetValue are synchronized via AtomicLong, but not logEndOffsetUpdateTime. Right now, like you said, there is only one API that updates/reads logEndOffsetUpdateTime and it synchronizes those accesses. But since these are Replica APIs, I'm pretty sure there will be more places in the code that will either update or read the logEndOffset/logEndOffsetUpdateTime and each of those APIs would have to synchronize those accesses. For what it's worth, changing it to AtomicLong actually protects us from future synchronization errors and is not much of a performance hit as well. 

8. HighwatermarkPersistenceTest. Fix fail error message to say KafkaException instead of IllegalStateException. I forgot to do this in my patch when I added this test, it will be great if you can include this minor change.

9. Minor comment - Probably better to rename UnknownTopicPartition to UnknownTopicOrPartitionException.

                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch, kafka-351_v2.patch, kafka-351_v3.patch, kafka-351_v4.patch, kafka-351_v5.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433322#comment-13433322 ] 

Jun Rao commented on KAFKA-351:
-------------------------------

I still see some transient failures in unit tests, most of which seem to be due to leader not ready. I will probably wait until kafka-369 is committed since it fixes some of those issues.
                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>         Attachments: kafka-351_v1.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438852#comment-13438852 ] 

Neha Narkhede commented on KAFKA-351:
-------------------------------------

+1 

Minor comments before checking it in - 

1. Fix comment in UnknownTopicOrPartitionException.scala. Right now it describes InvalidPartitionException
2. Replica - Fix error message "shouldn't set logEndOffset for replica %d topic %s partition %d since it's local" to first letter capital

                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch, kafka-351_v2.patch, kafka-351_v3.patch, kafka-351_v4.patch, kafka-351_v5.patch, kafka-351_v6.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-351:
--------------------------

    Attachment: kafka-351_v3.patch

Attach patch v3. Just a rebase.
                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch, kafka-351_v2.patch, kafka-351_v3.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-351:
--------------------------

    Attachment: kafka-351_v5.patch

Upload patch v5 to fix an svn rename issue.
                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch, kafka-351_v2.patch, kafka-351_v3.patch, kafka-351_v4.patch, kafka-351_v5.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Assigned] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao reassigned KAFKA-351:
-----------------------------

    Assignee: Jun Rao
    
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437038#comment-13437038 ] 

Neha Narkhede commented on KAFKA-351:
-------------------------------------

Thanks for the patch! Overall, this refactoring is a good change. Here are a few review comments -

1. TestUtils. How about renaming leaderLocalOnBroker to isLeaderLocalOnBroker ?

2. LogOffsetTest:
2.1. The change in testGetOffsetsForUnknownTopic doesn't look right. Since the topic "foo" doesn't exist, the client should get back UnknownTopicException. The partition is not invalid, it doesn't even exist.

3. SyncProducerTest
3.1 Same here, client should get back UnknownTopicException, not InvalidPartitionException

4. ISRExpirationTest
4.1 getLogWithLogEndOffset expected 6 calls for log.logEndOffset since the test exercised that API 6 times during correct operation. If you change it to anyTimes, it will hide problems with either the test or the code. Was it changed to
get rid of some transient test failure ?
4.2 Minor formatting: For consistency, you might want to change to first letter caps for error messages. So far, I don't think everyone quite followed this. So some log statements have first letter caps, others don't. I personally prefer
 first letter caps for all log statements.

5. Replica
5.1 Is there a reason why logEndOffsetUpdateTimeMs is not AtomicLong ? It's access is not protected by a lock.
5.2 What is the difference between private[this] var and private var ?
5.3 It's great that you changed logEndOffset to use the new getter/setter API convention. I think there is only one drawback to using that. I don't know a way to search the code to list all places that use the setter. Do you ?

6. Partition
6.1 Rename addReplicaIfNotExist to addReplicaIfNotExists.
6.2 In getOrCreateLog, it is better to use case match, since in Scala case match always evaluates to some value. Since this API needs to return the Replica object, using case match will protect against code bugs. Instead of if-else that checks isDefined, case-match handles Options naturally, since it forces you to handle all the cases. Same for makeLeader, makeFollower, checkEnoughReplicasReachAnOffset since they also return some value.
6.3 Looks like assignedReplicaMap is meant to be a map of replica_id->Replica. It might be a good idea to change Pool to handle Options. Options are much easier to use than handling null values. For example, getReplica can reduce to just returning assignedReplicaMap.get(replicaId), instead of the if-else checking for nulls.
6.5 Minor formatting comment same as 4.2
6.6. maybeIncrementLeaderHW: Since you are trying to access inSyncReplicas here, this method should be synchronized on the leaderAndIsrUpdateLock
6.7 getOutOfSyncReplicas, updateISR: Same as 6.6
6.8 checkEnoughReplicasReachAnOffset: 
6.8.1 We should probably rename this to checkEnoughReplicasReachOffset. 

7. ReplicaManager
7.1 I think leaderReplicas was a poorly chosen name by me in the past. It should be renamed to leaderPartitions since it is the set of partitions with their leader hosted on the local broker. Also, that would mean we should rename leaderReplicasLock to leaderPartitionsLock
7.2 Same as 6.3 for allPartitions. This will greatly simplify getOrCreatePartition
7.3 Same as 4.2 for some of the APIs
7.4 Fix typo: shuttedd down 
7.5 Fix identation and parenthesis style for checkpointHighWatermarks. 
7.6 Same as 6.2 for becomeLeaderOrFollower 
7.7 I wonder if it is better to rename leaderReplicaIfLocalOrException to getLeaderReplicaIfLocal ?

                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch, kafka-351_v2.patch, kafka-351_v3.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-351:
--------------------------

    Status: Patch Available  (was: Open)
    
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>         Attachments: kafka-351_v1.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated KAFKA-351:
--------------------------------

    Comment: was deleted

(was: Thanks for rebasing Jun! But I still couldn't apply the patch on the latest 0.8. I'm actually ok with reviewing on a slightly older version. Please could you specify which version your v3 patch applies cleanly on ? 

nnarkhed-ld:kafka-351 nnarkhed$ patch -p0 -i ~/Projects/kafka-patches/kafka-351_v3.patch 
patching file system_test/single_host_multi_brokers/bin/run-test.sh
patching file core/src/test/scala/unit/kafka/utils/TestUtils.scala
patching file core/src/test/scala/unit/kafka/log/LogOffsetTest.scala
patching file core/src/test/scala/unit/kafka/integration/TopicMetadataTest.scala
Hunk #2 FAILED at 98.
Hunk #3 succeeded at 116 (offset -2 lines).
Hunk #4 succeeded at 131 (offset -2 lines).
1 out of 4 hunks FAILED -- saving rejects to file core/src/test/scala/unit/kafka/integration/TopicMetadataTest.scala.rej
patching file core/src/test/scala/unit/kafka/producer/SyncProducerTest.scala
patching file core/src/test/scala/unit/kafka/server/ISRExpirationTest.scala
patching file core/src/test/scala/unit/kafka/server/LogRecoveryTest.scala
patching file core/src/test/scala/unit/kafka/server/HighwatermarkPersistenceTest.scala
patching file core/src/test/scala/unit/kafka/consumer/TopicCountTest.scala
patching file core/src/test/scala/unit/kafka/consumer/ZookeeperConsumerConnectorTest.scala
patching file core/src/main/scala/kafka/cluster/Replica.scala
patching file core/src/main/scala/kafka/cluster/Partition.scala
patching file core/src/main/scala/kafka/producer/BrokerPartitionInfo.scala
patching file core/src/main/scala/kafka/producer/async/DefaultEventHandler.scala
patching file core/src/main/scala/kafka/server/KafkaZooKeeper.scala
patching file core/src/main/scala/kafka/server/KafkaServer.scala
patching file core/src/main/scala/kafka/server/ReplicaFetcherThread.scala
patching file core/src/main/scala/kafka/server/ReplicaManager.scala
Hunk #2 FAILED at 66.
1 out of 2 hunks FAILED -- saving rejects to file core/src/main/scala/kafka/server/ReplicaManager.scala.rej
patching file core/src/main/scala/kafka/server/KafkaApis.scala
Hunk #1 FAILED at 21.
Hunk #2 FAILED at 32.
Hunk #3 FAILED at 70.
Hunk #4 succeeded at 118 with fuzz 1 (offset -11 lines).
Hunk #5 succeeded at 206 with fuzz 1 (offset -19 lines).
Hunk #6 succeeded at 289 (offset -23 lines).
Hunk #7 succeeded at 311 (offset -25 lines).
Hunk #8 succeeded at 327 (offset -25 lines).
Hunk #9 FAILED at 365.
Hunk #10 FAILED at 397.
Hunk #11 succeeded at 390 with fuzz 1 (offset -30 lines).
Hunk #12 FAILED at 445.
Hunk #13 FAILED at 603.
7 out of 13 hunks FAILED -- saving rejects to file core/src/main/scala/kafka/server/KafkaApis.scala.rej
patching file core/src/main/scala/kafka/api/LeaderAndISRResponse.scala
)
    
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch, kafka-351_v2.patch, kafka-351_v3.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-351:
--------------------------

    Attachment: kafka-351_v4.patch

Thanks for the review. 

2,3. The difficulty is that a broker currently doesn't cache all topic/partitions (only controller does that). It only knows about topic/partitions assigned to itself. So, it's hard for a broker distinguish between a missing topic and a missing partition. We could cache all topic/partitions in all brokers, but we need to add additional ZK logic in each broker. So, in this patch, just combined UnknownTopicException and InvailidPartitionException into a more general UnknowTopicPartitionException. It's not ideal, but probably not too painful for the user to understand.

5.1 That's a good point. Moved the update (which updates logEndOffsetUpdateTimeMs) of logEndOffset into Partition.updateLeaderHWAndMaybeExpandISR(). This way, both the reader and the writer of logEndOffsetUpdateTimeMs is synchronized by leaderISRUpdateLock. So, there is no need to make it an AtomicLong.

6.3 Just not to make this patch too large. I will create a separate jira for changing the Pool api to use Option.

6.6 and 6.7 Both methods are only called in Partition and the caller already synchronizes on leaderISRUpdateLock.

The rest of the comments have been addressed.
                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch, kafka-351_v2.patch, kafka-351_v3.patch, kafka-351_v4.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-351:
--------------------------

    Attachment: kafka-351_v1.patch

Attach patch v1. An overview of the patch.
A. Use synchronized instead of lock for synchronization. The latter has more functionality, but more overhead. See
http://blog.rapleaf.com/dev/2011/06/16/java-performance-synchronized-vs-lock/

B. Partition: Consolidated all reads/writes to leader/ISR in Partition and all accessed are synchronized. This makes sure that leader/ISR values don't change while doing the following operations.
- makeLeader
- makerFollower
- maybeShrinkISR
- updateLeaderHWAndMaybeExpandISR (for maintaining remote replicas)
- checkEnoughReplicasReachAnOffset (for checking if a produce request is satisfied)
This means that Partition has to access ReplicaManager. In some sense, Partition probably should be a nested class under ReplicaManager since it needs to access several members of ReplicaManager. However, this will make ReplicaManager too big.

C. RepicaManager:
- Moved most per partition operations to Partition.
- Cleaned up public methods and added a few helper methods for getting partition and replica.
- Use ConcurrentHashMap for managing all partitions.

D. KafkaApis: Removed callbacks in the constructer. Instead, call methods in ReplicaManager directly.

E. Replica:
- Changed to the new getter/setter style for logEndOffset and highWatermark.
- Local replica doesn't need to set logEndOffset anymore since the logEndOffsetUpdateTime for local replica is never used.

F. BrokerPartitionInfo: Partition is now a complex structure and is supposed to be used only on the server side. Create a simpler PartitionAndLeader class for client usage.

G. KafkaZookeeper: Removed ensurePartitionLeaderOnThisBroker(). The checking of the existence of a leader is now done by replicaManager.leaderReplicaIfLocalOrException(), which is cheaper since it doesn't access ZK.

H. ISRExpirationTest: Remove the test testISRExpirationForMultiplePartitions. It doesn't seem to add additional value since ISR expiration is always done on a per partition basis.

I. SyncProducerTest: Remove the test testProducRequestForUnknowTopic since the logic is always covered by #1 in testProduceCorrectlyReceivesResponse.

J. TestUtils: Added a helper method leaderLocalOnBroker() that more reliably ensures that a leader exists on a broker.

K. TopicCounTest: removed since it's not doing any useful test.

L. ZookeeperConsumerConnectorTest.testCompressionSetConsumption(): removed the part that tests consumer timeout since it's covered in testBasic already.

                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>         Attachments: kafka-351_v1.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436784#comment-13436784 ] 

Neha Narkhede commented on KAFKA-351:
-------------------------------------

Thanks for rebasing Jun! But I still couldn't apply the patch on the latest 0.8. I'm actually ok with reviewing on a slightly older version. Please could you specify which version your v3 patch applies cleanly on ? 

nnarkhed-ld:kafka-351 nnarkhed$ patch -p0 -i ~/Projects/kafka-patches/kafka-351_v3.patch 
patching file system_test/single_host_multi_brokers/bin/run-test.sh
patching file core/src/test/scala/unit/kafka/utils/TestUtils.scala
patching file core/src/test/scala/unit/kafka/log/LogOffsetTest.scala
patching file core/src/test/scala/unit/kafka/integration/TopicMetadataTest.scala
Hunk #2 FAILED at 98.
Hunk #3 succeeded at 116 (offset -2 lines).
Hunk #4 succeeded at 131 (offset -2 lines).
1 out of 4 hunks FAILED -- saving rejects to file core/src/test/scala/unit/kafka/integration/TopicMetadataTest.scala.rej
patching file core/src/test/scala/unit/kafka/producer/SyncProducerTest.scala
patching file core/src/test/scala/unit/kafka/server/ISRExpirationTest.scala
patching file core/src/test/scala/unit/kafka/server/LogRecoveryTest.scala
patching file core/src/test/scala/unit/kafka/server/HighwatermarkPersistenceTest.scala
patching file core/src/test/scala/unit/kafka/consumer/TopicCountTest.scala
patching file core/src/test/scala/unit/kafka/consumer/ZookeeperConsumerConnectorTest.scala
patching file core/src/main/scala/kafka/cluster/Replica.scala
patching file core/src/main/scala/kafka/cluster/Partition.scala
patching file core/src/main/scala/kafka/producer/BrokerPartitionInfo.scala
patching file core/src/main/scala/kafka/producer/async/DefaultEventHandler.scala
patching file core/src/main/scala/kafka/server/KafkaZooKeeper.scala
patching file core/src/main/scala/kafka/server/KafkaServer.scala
patching file core/src/main/scala/kafka/server/ReplicaFetcherThread.scala
patching file core/src/main/scala/kafka/server/ReplicaManager.scala
Hunk #2 FAILED at 66.
1 out of 2 hunks FAILED -- saving rejects to file core/src/main/scala/kafka/server/ReplicaManager.scala.rej
patching file core/src/main/scala/kafka/server/KafkaApis.scala
Hunk #1 FAILED at 21.
Hunk #2 FAILED at 32.
Hunk #3 FAILED at 70.
Hunk #4 succeeded at 118 with fuzz 1 (offset -11 lines).
Hunk #5 succeeded at 206 with fuzz 1 (offset -19 lines).
Hunk #6 succeeded at 289 (offset -23 lines).
Hunk #7 succeeded at 311 (offset -25 lines).
Hunk #8 succeeded at 327 (offset -25 lines).
Hunk #9 FAILED at 365.
Hunk #10 FAILED at 397.
Hunk #11 succeeded at 390 with fuzz 1 (offset -30 lines).
Hunk #12 FAILED at 445.
Hunk #13 FAILED at 603.
7 out of 13 hunks FAILED -- saving rejects to file core/src/main/scala/kafka/server/KafkaApis.scala.rej
patching file core/src/main/scala/kafka/api/LeaderAndISRResponse.scala

                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch, kafka-351_v2.patch, kafka-351_v3.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-351:
--------------------------

    Labels: optimization  (was: )
    
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>         Attachments: kafka-351_v1.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-351:
--------------------------

    Fix Version/s: 0.8
    
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433385#comment-13433385 ] 

Jun Rao commented on KAFKA-351:
-------------------------------

Another change in the patch:
M. Currently, on leaderAndISR request, the broker gets and creates all assigned replicas. In this patch, the broker only creates replicas in ISR (since they are required in the logic for shrinking ISR). Other remote replicas are created on demand during the handling for follower fetch requests. This will make implementing kafka-42 a bit easier since newly bootstrapped replicas can be added on demand.
                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>         Attachments: kafka-351_v1.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-351:
--------------------------

    Attachment: kafka-351_v6.patch

Thanks for the review. Attach patch v6.

5.1 Changed logEndOffsetUpdateTime to AtomicLong.

8,9: Fixed.

Optimized Replica.getOutOfSyncReplicas() a bit to avoid the unnecessary check for leader replica.

Rebased.
                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch, kafka-351_v2.patch, kafka-351_v3.patch, kafka-351_v4.patch, kafka-351_v5.patch, kafka-351_v6.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Closed] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao closed KAFKA-351.
-------------------------

    
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: kafka-351_v1.patch, kafka-351_v2.patch, kafka-351_v3.patch, kafka-351_v4.patch, kafka-351_v5.patch, kafka-351_v6.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-351) Refactor some new components introduced for replication

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434292#comment-13434292 ] 

Neha Narkhede commented on KAFKA-351:
-------------------------------------

Jun, The patch didn't apply cleanly on a fresh checkout of the 0.8 branch. Do you mind uploading another patch after rebasing ?
                
> Refactor some new components introduced for replication 
> --------------------------------------------------------
>
>                 Key: KAFKA-351
>                 URL: https://issues.apache.org/jira/browse/KAFKA-351
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Jun Rao
>         Attachments: kafka-351_v1.patch
>
>
> Jay had some good refactoring suggestions as part of the review for KAFKA-46. I'd like to file this umbrella JIRA with individual sub tasks to cover those suggestions

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira