You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "Joel Koshy (JIRA)" <ji...@apache.org> on 2012/07/02 20:33:43 UTC

[jira] [Created] (KAFKA-391) Producer request and response classes should use maps

Joel Koshy created KAFKA-391:
--------------------------------

             Summary: Producer request and response classes should use maps
                 Key: KAFKA-391
                 URL: https://issues.apache.org/jira/browse/KAFKA-391
             Project: Kafka
          Issue Type: Bug
            Reporter: Joel Koshy
             Fix For: 0.8


Producer request contains an array of TopicData which contains arrays of PartitionData.

Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.

It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-391) Producer request and response classes should use maps

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454140#comment-13454140 ] 

Jun Rao commented on KAFKA-391:
-------------------------------

Thanks for patch v2. Some comments.

20. It's a good idea to create a case class of TopicPartition. This helps 2 things: (1) Tuple doesn't exist in java and javaapi.ProducerRequest currently uses tuple. (2) This avoids things like x._1._1, which is harder to understand. Similarly, should we create a case class of ProducerResponsStatus that wraps (errrocode, offset)?

21. javaapi.ProduceRequest should use java map, instead of scala map. javaapi.SyncProducer should return a java version of ProducerResponse. Thinking about it. Should we even provide a javaapi for SyncProducer since everyone should really be using the high level Producer api? In other words, SyncProducer probably is not our public api for clients. For consumers, we likely still need to provide a java version of SimpleConsumer since there may be applications that want to control the fetch offset. 

23. It doesn't look like that we have a java version of FetchRequest. We should add that  and use it in the java version of SimpleConsumer.

24. FetchResponse: We probably should add a helper method errorCode(topic, partition)?

25. AbstractFetchThread: can use the pattern response.data.foreach{ case(key, partitionData) => }

26. KafkaApis.handleFetchRequest(): can use the pattern in #25 too.


                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-391:
-----------------------------

    Description: 
Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.

It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).

We can probably do the same in the fetch request/response.

  was: 


(Accidentally deleted the description.)
                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-391:
-----------------------------

    Priority: Blocker  (was: Major)

Marking as blocker since I ended up changing the wire format.
                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>            Priority: Blocker
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-391) Producer request and response classes should use maps

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456710#comment-13456710 ] 

Jun Rao commented on KAFKA-391:
-------------------------------

Thanks for patch v4. +1 Could you fix the following minor issues before checking in?

41. scala version of FetchResponse: We throw an exception in errorCode if the map key doesn't exist. To be consistent, we should do the same for messageSet and highWatermark.

42. PartitionStatus: Since this is a case class, there is no need to define requiredOffset as val.

43. DefaultEventHandler.serialize(): This is not introduced in this patch, but could you change  error("Error serializing message " + t) to error("Error serializing message ", t)





                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>            Priority: Blocker
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch, KAFKA-391-v3.patch, KAFKA-391-v4.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449113#comment-13449113 ] 

Joel Koshy commented on KAFKA-391:
----------------------------------

(1) Yes that's a good point. I should have thought through it more carefully. We only need the sorted property once (serialization).
(2) I had considered this, but decided to punt to see how the producer-side came out. I'll take a stab at the fetch side as well as part of this jira.
(3) Nice.

So I'll rebase now and incorporate the above.

                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch
>
>
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-391) Producer request and response classes should use maps

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448915#comment-13448915 ] 

Jun Rao commented on KAFKA-391:
-------------------------------

Thanks for the patch. Some comments:

1. We probably shouldn't use scala sortedMap in kafka.javaapi.ProducerRequest. On that thought, why can't ProducerRequest take a regular map in the constructor? If we want some ordering on the serialized data, we can sort the map before serialization. SortedMap seems to reveal an implementation detail that clients don't (and shouldn't) really care. Ditto for ProducerResponse.

2. To be consistent, should we change FetchResponse (and maybe FetchRequest) to use map, instead of array too?

3. ProducerRequest.writeTo() can use foreach like the following:
   groupedData.foreach{ case(topic, TopciAndPartitionData) => ... }

                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch
>
>
> Producer request contains an array of TopicData which contains arrays of PartitionData.
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-391) Producer request and response classes should use maps

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455566#comment-13455566 ] 

Jun Rao commented on KAFKA-391:
-------------------------------

V3 looks good overall. Some additional comments:

30. remove javaapi.ProducerRequest
31. We probably should call TopicPartition TopicAndPartition.
32. javaapi.SimpleConsumer: It's bit confusing to the scala version of send here. Let's at least explain in the comment how to use it.

                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>            Priority: Blocker
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch, KAFKA-391-v3.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Assigned] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy reassigned KAFKA-391:
--------------------------------

    Assignee: Joel Koshy
    
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>              Labels: optimization
>             Fix For: 0.8
>
>
> Producer request contains an array of TopicData which contains arrays of PartitionData.
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455928#comment-13455928 ] 

Joel Koshy commented on KAFKA-391:
----------------------------------

Ok nm - I see it's because we removed SyncProducer and only need ProducerData. Ok - so I'll rebase now and upload the final patch in a bit.
                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>            Priority: Blocker
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch, KAFKA-391-v3.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Closed] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy closed KAFKA-391.
----------------------------

    
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>            Priority: Blocker
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch, KAFKA-391-v3.patch, KAFKA-391-v4.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-391:
-----------------------------

    Attachment: KAFKA-391-v3.patch

Overview of changes in v3:

(3.1) - (20) - This actually caused bulk of the changes in v3. I did this 
for just the topic-partition pairs in producer/consumer request handling;
same for producer response status. There are a lot more places where we
could move from tuples to case classes. I think (as mentioned on the mailing
list) it would be good to do this as part of cleanup but we should defer
that for later since such changes cut across a lot of files. Going forward I
think this brings out a convention that we might want to follow. The "scala
book" has a reasonable guideline. I'll send out an email to kafka-dev for
discussion and add it to our coding convention depending on how that pans
out.

(3.2) (21)-(23) - Thanks for catching the javaapi issue. Couple of changes
here: 
a - Added javaapi for FetchRequest. (I needed to provide both java/non-java
  fetchrequest to the simpleconsumer since FetchBuilder returns a scala
  FetchRequest.) 
b - Java map for all the Fetch/Produce request/response in the javaapi.
c - Removed SyncProducer: agreed that is unnecessary since Producer supports
  both sync and async; made the hadoop producer code use the high level
  producer. I think that's safe - i.e., I don't see a good reason why anyone
  would "depend" on the data going to a single broker.
d - Got rid of the unreferenced ProducerConsumerTestHarness in javaapi.
e - Fixed the equals method in javaapi ProducerRequest; added one to
FetchResponse - actually we can abstract this out into a trait, but that is
a minor improvement that I punted on.
f - Made the ProducerRequest use Java map in javaapi.
g - (I did not add Java versions of ProducerResponse since the SyncProducer
  has been removed.)

(3.3) (24) - added the helper method, although I don't think I'm using it
anywhere.

(3.4) - Got rid of OffsetDetail.

For (25)(26) - I tried to use the pattern wherever I could, but may have 
missed a few.

I did not rebase this (i.e., it still applies on r1381858). I'll rebase 
after we are done with reviews.

                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>            Priority: Blocker
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch, KAFKA-391-v3.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448180#comment-13448180 ] 

Joel Koshy commented on KAFKA-391:
----------------------------------

One more comment I forgot to add above: we can/should probably get rid of the global
error code in ProducerResponse as it is unused and does not seem to make sense
anyway in the absence of a "generic error" code.
                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch
>
>
> Producer request contains an array of TopicData which contains arrays of PartitionData.
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-391:
-----------------------------

    Attachment: KAFKA-391-v2.patch

2.1 - Rebased after the first review, but I will need to rebase again. Unit
  tests and system tests pass. This patch applies cleanly to svn revision
  1381858 - it would be great if this can be reviewed against that revision.   I can provide an incremental patch if that helps after I rebase.

2.2 - Removed all the equals methods in classes that were using Arrays earlier.

2.3 - Switched fetch request/response to maps. It makes the code a little
  cleaner, but I think the improvement was greater with the refactoring on
  the producer side. Anyway, this makes the APIs more consistent.  One small
  observation: previously, fetch requests would throw a
  FetchRequestFormatException if the TopicData array for a fetch request
  contained the same topic multiple times. Right now we don't do any checks
  since we use a map. We can add it to the FetchRequestBuilder, but not sure
  if it is required/worth it. Also, I think it previously would allow
  fetches from different offsets in the same partition in the same fetch
  request. That is no longer allowed, although I don't know why anyone would
  need that.

2.4 - Removed the global error code.

Another minor detail: I wonder if it would help to have a case class for
TopicAndPartition. We use (topic, partition) and the associated tuple
addressing all over the place. That would make the Map declarations slightly
clearer, although now we're accustomed to understanding that (String, Int)
must mean (topic, partitionId).

                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454106#comment-13454106 ] 

Joel Koshy commented on KAFKA-391:
----------------------------------

BTW, I forgot to remove OffsetDetail from FetchRequest - we don't need that anymore.
                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-391) Producer request and response classes should use maps

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449271#comment-13449271 ] 

Jun Rao commented on KAFKA-391:
-------------------------------

Also, I agree that we can remove the global errorcode from both the fetch and the produce responses.
                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455927#comment-13455927 ] 

Joel Koshy commented on KAFKA-391:
----------------------------------

30 - Why should it be removed?
                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>            Priority: Blocker
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch, KAFKA-391-v3.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-391:
-----------------------------

    Attachment: KAFKA-391-draft-r1374069.patch

Attached a draft patch. I was out of action for the last two weeks and 0.8
has moved along quite a bit, so this will need a significant rebase. (This
patch should cleanly apply on r1374069.) Before I spend time rebasing I was
hoping to get some high-level feedback on this change. I think it makes the
code clearer and likely more efficient (although an extensive perf test is
pending).

It is pretty straightforward and should be quick to review at least for this
round. The main changes are as follows:

- Switched to maps for the data/status in the producer request/response
  classes.
- I'm using SortedMap as that helps with the serializing to the wire format.
  The maps of course are only in memory. The wire format is mostly the same
  although I modified the response format to include topic names (similar to
  the request format). This is not strictly required but I think it is
  clearer this way.
- Made some of the related code simpler/clearer (e.g., produceToLocalLog,
  DelayedProduce.respond, etc.)
- Minor fixes in the tests due to the above changes.

                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch
>
>
> Producer request contains an array of TopicData which contains arrays of PartitionData.
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-391:
-----------------------------

    Description:    (was: Producer request contains an array of TopicData which contains arrays of PartitionData.

Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.

It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
)
    
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch
>
>
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-391:
-----------------------------

    Labels: optimization  (was: )
    
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>              Labels: optimization
>             Fix For: 0.8
>
>
> Producer request contains an array of TopicData which contains arrays of PartitionData.
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-391:
-----------------------------

    Attachment: KAFKA-391-v4.patch

Here is the rebased patch. I also had to include a small edit to ReplicaFetcherThread to address the issue in KAFKA-517 which affects our system tests.
                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>            Priority: Blocker
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch, KAFKA-391-v3.patch, KAFKA-391-v4.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (KAFKA-391) Producer request and response classes should use maps

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy resolved KAFKA-391.
------------------------------

    Resolution: Fixed

Thanks for the review. Checked-in to 0.8 after addressing the minor issues.

                
> Producer request and response classes should use maps
> -----------------------------------------------------
>
>                 Key: KAFKA-391
>                 URL: https://issues.apache.org/jira/browse/KAFKA-391
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Joel Koshy
>            Assignee: Joel Koshy
>            Priority: Blocker
>              Labels: optimization
>             Fix For: 0.8
>
>         Attachments: KAFKA-391-draft-r1374069.patch, KAFKA-391-v2.patch, KAFKA-391-v3.patch, KAFKA-391-v4.patch
>
>
> Producer response contains two arrays of error codes and offsets - the ordering in these arrays correspond to the flattened ordering of the request arrays.
> It would be better to switch to maps in the request and response as this would make the code clearer and more efficient (right now, linear scans are used in handling producer acks).
> We can probably do the same in the fetch request/response.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira