Posted to dev@kafka.apache.org by Ignacio Solis <is...@igso.net> on 2016/12/14 19:36:28 UTC

[DISCUSS] Control Messages - [Was: KIP-82 - Add Record Headers]

I'm renaming this thread in case we start deep diving.

I'm in favor of so-called "control messages", or at least the notion of
them.  However, I'm not sure about the design.

What I understood from the original mail:

A. Provide a message that does not get returned by poll()
B. Provide a way for applications to consume these messages (sign up?)
C. Control messages would be associated with a topic.
D. Control messages should be _in_ the topic.



1. The first thing to point out is that this can be done with headers.
I assume that's why you sent it on the header thread. As you state, if
we had headers, you would not require a separate KIP.  So, in a way,
you're trying to provide a concrete use case for headers.  I moved the
discussion to a separate thread mostly because, while I like the idea
and the fact that it can be done with headers, people might want to
discuss alternatives.

2. I'm also assuming that you're intentionally trying to preserve
order. Headers could do this natively of course. You could also
achieve this with a separate topic, given identifiers, sequence
numbers, headers, etc.  However...
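
As a rough illustration of the separate-topic variant: if each control
record names the data offset it follows, a consumer can rebuild the
intended order on its own. This is only a sketch under assumptions. The
DataRecord and ControlRecord types, the after_offset field, and the
interleave() helper are all hypothetical and not part of any Kafka API.

```python
from typing import Iterable, Iterator, NamedTuple, Union


class DataRecord(NamedTuple):
    offset: int
    value: str


class ControlRecord(NamedTuple):
    after_offset: int  # hypothetical: data offset this control record follows
    command: str


def interleave(
    data: Iterable[DataRecord], controls: Iterable[ControlRecord]
) -> Iterator[Union[DataRecord, ControlRecord]]:
    """Merge a data stream and a control stream into one ordered stream.

    A control record is emitted right after the data record whose offset
    equals its after_offset (or at the end, if that record is the last one).
    """
    pending = sorted(controls, key=lambda c: c.after_offset)
    i = 0
    for rec in data:
        # Emit every control record anchored strictly before this data record.
        while i < len(pending) and pending[i].after_offset < rec.offset:
            yield pending[i]
            i += 1
        yield rec
    # Whatever is left is anchored at or after the last data record.
    yield from pending[i:]
```

The cost is that the consumer has to buffer control records and read two
streams, which is part of why in-band headers are the simpler way to
preserve order.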

3. There are a few use cases where ordering is important but
out-of-band is even more important. We have a few large workloads
where this is of interest to us.  Obviously we can achieve this with a
separate topic, but having a control channel for a topic that can send
high priority data would be interesting.   And yes, we would learn a
lot from the TCP experiences with the urgent pointer (
https://tools.ietf.org/html/rfc6093 ) and other out-of-band
communication techniques.

You have an example of a "shutdown marker".  This works OK as a
terminator; however, it is not very fast.  If 4 TB of data has piled up
because of asynchronous processing, then a shutdown marker at the end
of the 4 TB is not as useful as an out-of-band message that tells me
immediately that those 4 TB should not be processed.   So, from
this perspective, I prefer to have a separate topic and not embed
control messages with the data.

If the messages are part of the data, or associated with specific data,
then they should be in the data. If they are about process, we need an
out-of-band mechanism.
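
To make the difference concrete, here is a minimal sketch comparing the
two approaches on a backlog. It uses plain Python lists rather than a
Kafka client, and the SHUTDOWN marker, the "abort" command, and the
poll_control callable (standing in for a poll on a separate control
topic) are assumptions for illustration only.

```python
from typing import Callable, Iterable, Optional

SHUTDOWN = object()  # hypothetical in-band marker appended to the data


def drain_in_band(backlog: Iterable[object]) -> int:
    """Process records until the in-band shutdown marker is reached."""
    processed = 0
    for rec in backlog:
        if rec is SHUTDOWN:
            break
        processed += 1  # stand-in for real processing work
    return processed


def drain_out_of_band(
    backlog: Iterable[object], poll_control: Callable[[], Optional[str]]
) -> int:
    """Process records, checking a separate control channel before each one."""
    processed = 0
    for rec in backlog:
        if poll_control() == "abort":
            break  # stop immediately, leaving the rest of the backlog unread
        processed += 1
    return processed


def make_control(abort_after_polls: int) -> Callable[[], Optional[str]]:
    """Build a fake control channel that reports "abort" after N quiet polls."""
    state = {"polls": 0}

    def poll_control() -> Optional[str]:
        state["polls"] += 1
        return "abort" if state["polls"] > abort_after_polls else None

    return poll_control
```

With a backlog of 1000 records, drain_in_band() has to walk all 1000
before it sees the marker, while drain_out_of_band() with
make_control(3) stops after 3.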


4. The general feeling I have gotten from a few people on the list is:
Why not just do this above the kafka clients?  After all, you could
have a system to ignore certain schemas.

Effectively, if we had headers, it would be done from a client
perspective, without the need to modify anything major.
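
A rough sketch of what "done from a client perspective" could look like,
assuming record headers along the lines of KIP-82. Nothing here is an
existing Kafka API: the Record type, the "x-control" header key, and the
ControlAwareConsumer wrapper are all made up for illustration, and
raw_poll stands in for the real consumer's poll().

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

CONTROL_HEADER = "x-control"  # hypothetical reserved header key


@dataclass
class Record:
    value: bytes
    headers: Dict[str, bytes] = field(default_factory=dict)


class ControlAwareConsumer:
    """Wraps a raw poll function and hides control records from poll()."""

    def __init__(self, raw_poll: Callable[[], List[Record]]) -> None:
        self._raw_poll = raw_poll
        self._handlers: Dict[bytes, Callable[[Record], None]] = {}

    def on_control(self, kind: bytes, handler: Callable[[Record], None]) -> None:
        """Sign up for one kind of control message (point B above)."""
        self._handlers[kind] = handler

    def poll(self) -> List[Record]:
        """Return only data records (point A above)."""
        data: List[Record] = []
        for rec in self._raw_poll():
            kind = rec.headers.get(CONTROL_HEADER)
            if kind is None:
                data.append(rec)
            elif kind in self._handlers:
                self._handlers[kind](rec)
            # Control records nobody signed up for are dropped silently.
        return data
```

Applications that never call on_control() see only data records, so
existing consumers of a shared topic would not break.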

If we wanted to do it with a separate topic, that could also be done
without any broker changes. But you could imagine wanting some broker
changes: if the broker understands that two streams are tied together,
it may make decisions based on that.  This would be similar to
the handling of file system forks (
https://en.wikipedia.org/wiki/Fork_(file_system) )


5. Also heard in the discussions about headers: we don't know if this is
generally useful. Maybe only a couple of institutions need it?  It may
not be worth it to modify the whole stack for that.

I would again say that with headers you could pull it off easily, even
if only a subset of clients/applications wanted to use it.


So, in summary: I like the idea.  I see benefits in implementing it
through headers, but I also see benefits of having it as a separate
stream.  I'm not much in favor of having a separate message handling
pipeline for the same topic, though.

Nacho





On Wed, Dec 14, 2016 at 9:51 AM, Matthias J. Sax <ma...@confluent.io> wrote:
> Yes and no. I did overload the term "control message".
>
> EOS control messages are for client-broker communication and thus never
> exposed to any application. And I think this is a good design because
> broker needs to understand those control messages. Thus, this should be
> a protocol change.
>
> The type of control messages I have in mind are for client-client
> (application-application) communication and the broker is agnostic to
> them. Thus, it should not be a protocol change.
>
>
> -Matthias
>
>
>
> On 12/14/16 9:42 AM, radai wrote:
>> arent control messages getting pushed as their own top level protocol
>> change (and a fairly massive one) for the transactions KIP ?
>>
>> On Tue, Dec 13, 2016 at 5:54 PM, Matthias J. Sax <ma...@confluent.io>
>> wrote:
>>
>>> Hi,
>>>
>>> I want to add a completely new angle to this discussion. For this, I
>>> want to propose an extension for the headers feature that enables new
>>> use cases -- and those new use cases might convince people to support
>>> headers (of course including the larger scoped proposal).
>>>
>>> Extended Proposal:
>>>
>>> Allow messages with a certain header key to be special "control
>>> messages" (w/ or w/o payload) that are not exposed to an application via
>>> .poll().
>>>
>>> Thus, a consumer client would automatically skip over those messages. If
>>> an application knows about embedded control messages, it can "sign up"
>>> for those messages with the consumer client and either get a callback or
>>> have the consumer's auto-drop for these messages disabled (allowing it to
>>> consume those messages via poll()).
>>>
>>> (The details need further considerations/discussion. I just want to
>>> sketch the main idea.)
>>>
>>> Usage:
>>>
>>> There is a shared topic (ie, used by multiple applications) and a
>>> producer application wants to embed a special message in the topic for a
>>> dedicated consumer application. Because only one application will
>>> understand this message, it cannot be a regular message as this would
>>> break all applications that do not understand this message. The producer
>>> application would set a special metadata key and no consumer application
>>> would see this control message by default because they did not enable
>>> their consumer client to return this message in poll() (and the client
>>> would just drop this message with special metadata key). Only the single
>>> application that should receive this message, will subscribe to this
>>> message on its consumer client and process it.
>>>
>>>
>>> Concrete Use Case: Kafka Streams
>>>
>>> In Kafka Streams, we would like to propagate "control messages" from
>>> subtopology to subtopology. There are multiple scenarios for which this
>>> would be useful. For example, currently we do not guarantee a
>>> "consistent shutdown" of an application. By this, I mean that input
>>> records might not be completely processed by the whole topology because
>>> the application shutdown happens "in between" and an intermediate result
>>> gets "stuck" in an intermediate topic. Thus, a user would see a
>>> committed offset of the source topic of the application, but no
>>> corresponding result record in the output topic.
>>>
>>> Having "shutdown markers" would allow us to first stop the upstream
>>> subtopology and write this marker into the intermediate topic, and the
>>> downstream subtopology would only shut itself down after it sees the
>>> "shutdown marker". Thus, we can guarantee on shutdown that no
>>> "in-flight" messages got stuck in intermediate topics.
>>>
>>>
>>> A similar usage would be for KIP-95 (Incremental Batch Processing).
>>> There was a discussion about the proposed metadata topic, and we could
>>> avoid this metadata topic if we would have "control messages".
>>>
>>>
>>> Right now, we cannot insert an "application control message" because
>>> Kafka Streams does not own all topics it reads/writes and thus might
>>> break other consumer applications (as described above) if we inject
>>> random messages that are not understood by other apps.
>>>
>>>
>>> Of course, one can work around "embedded control messages" by using an
>>> additional topic to propagate control messages between applications (as
>>> suggested in KIP-95 via a metadata topic for Kafka Streams). But there
>>> are major concerns about adding this metadata topic in the KIP and this
>>> shows that other applications that need a similar pattern might profit
>>> from topic embedded "control messages", too.
>>>
>>>
>>> One last important consideration: those "control messages" are used for
>>> client to client communication and are not understood by the broker.
>>> Thus, those messages should not be enabled within the message format
>>> (cf. tombstone flag -- KIP-87). However, "client land" record headers
>>> would be a nice way to implement them. Because KIP-82 did consider key
>>> namespaces for metadata keys, this extension should not be its own KIP
>>> but should be included in KIP-82 to reserve a namespace for "control
>>> messages" in the first place.
>>>
>>>
>>> Sorry for the long email... Looking forward to your feedback.
>>>
>>>
>>> -Matthias
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 12/8/16 12:12 AM, Michael Pearce wrote:
>>>> Hi Jun
>>>>
>>>> 100) each time a transaction exits a jvm for a remote system (HTTP/JMS/
>>> Hopefully one day kafka) the APM tools stitch in a unique id (though I
>>> believe it contains the end2end uuid embedded in this id), on receiving the
>>> message at the receiving JVM the apm code takes this out, and continues its
>>> tracing on that new thread. Both JVM’s (and other languages the APM
>>> tool supports) send this data async back to the central controllers where
>>> the stitching together occurs. For this they need some header space for
>>> them to put this id.
>>>>
>>>> 101) Yes indeed we have a business transaction Id in the payload. Though
>>> this is a system level tracing, that we need to have marry up. Also as per
>>> note on end2end encryption we’d be unable to prove the flow if the payload
>>> is encrypted as we’d not have access to this at certain points of the flow
>>> through the infrastructure/platform.
>>>>
>>>>
>>>> 103) As said we use this mechanism in IG very successfully, as stated
>>> per key we guarantee the transaction producing app to handle the
>>> transaction of a key at one DC unless at point of critical failure where we
>>> have to flip processing to another. We care about key ordering.
>>>> I disagree on the offset comment for the partition solution unless you
>>> do full ISR, or expensive full XA transactions even with partitions you
>>> cannot fully guarantee offsets would match.
>>>>
>>>> 105) Very much so, I need to have access at the platform level to the
>>> other meta data all mentioned, without having to need to have access to the
>>> encryption keys of the payload.
>>>>
>>>> 106)
>>>> Technically yes for AZ/Region/Cluster, but then we’d need to have a
>>> global producerId register which would be very hard to enforce/ensure is
>>> current and correct, just to understand the message origins of its
>>> region/az/cluster for routing.
>>>> The client wrapper version, producerId can be the same, as obviously the
>>> producer could upgrade its wrapper, as such we need to know what wrapper
>>> version the message is created with.
>>>> Likewise the IP address, as stated we can have our producer move, where
>>> its IP would change.
>>>>
>>>> 107)
>>>> UUID is set on the message by interceptors before actual producer
>>> transport send. This is for platform level message dedupe guarantee, the
>>> business payload should be agnostic to this. Please see
>>> https://activemq.apache.org/artemis/docs/1.5.0/duplicate-detection.html
>>> note this is not touching business payloads.
>>>>
>>>>
>>>>
>>>> On 06/12/2016, 18:22, "Jun Rao" <ju...@confluent.io> wrote:
>>>>
>>>>     Hi, Michael,
>>>>
>>>>     Thanks for the reply. I find it very helpful.
>>>>
>>>>     Data lineage:
>>>>     100. I'd like to understand the APM use case a bit more. It sounds
>>> like
>>>>     that those APM plugins can generate a transaction id that we could
>>>>     potentially put in the header of every message. How would you
>>> typically
>>>>     make use of such transaction ids? Are there other metadata
>>> associated with
>>>>     the transaction id and if so, how are they propagated downstream?
>>>>
>>>>     101. For the finance use case, if the concept of transaction is
>>> important,
>>>>     wouldn't it be typically included in the message payload instead of
>>> as an
>>>>     optional header field?
>>>>
>>>>     102. The data lineage that Atlas and Navigator support seems to be
>>> at the
>>>>     dataset level, not per record level? So, not sure if per message
>>> headers
>>>>     are relevant there.
>>>>
>>>>     Mirroring:
>>>>     103. The benefit of using separate partitions is that it potentially
>>> makes
>>>>     it easy to preserve offsets during mirroring. This will make it
>>> easier for
>>>>     consumer to switch clusters. Currently, the consumers can switch
>>> clusters
>>>>     by using the timestampToOffset() api, but it has to deal with
>>> duplicates.
>>>>     Good point on the issue with log compact and I am not sure how to
>>> address
>>>>     this. However, even if we mirror into the existing partitions, the
>>> ordering
>>>>     for messages generated from different clusters seems
>>> non-deterministic
>>>>     anyway. So, it seems that the consumers already have to deal with
>>> that? If
>>>>     a topic is compacted, does that mean which messages are preserved is
>>> also
>>>>     non-deterministic across clusters?
>>>>
>>>>     104. Good point on partition key.
>>>>
>>>>     End-to-end encryption:
>>>>     105. So, it seems end-to-end encryption is useful. Are headers
>>> useful there?
>>>>
>>>>     Auditing:
>>>>     106. It seems other than the UUID, all other metadata are per
>>> producer?
>>>>
>>>>     EOS:
>>>>     107. How are those UUIDs generated? I am not sure if they can be
>>> generated
>>>>     in the producer library. An application may send messages through a
>>> load
>>>>     balancer and on retry, the same message could be routed to a
>>> different
>>>>     producer instance. So, it seems that the application has to generate
>>> the
>>>>     UUIDs. In that case, shouldn't the application just put the UUID in
>>> the
>>>>     payload?
>>>>
>>>>     Thanks,
>>>>
>>>>     Jun
>>>>
>>>>
>>>>     On Fri, Dec 2, 2016 at 4:57 PM, Michael Pearce <
>>> Michael.Pearce@ig.com>
>>>>     wrote:
>>>>
>>>>     > Hi Jun.
>>>>     >
>>>>     > Per Transaction Tracing / Data Lineage.
>>>>     >
>>>>     > As Stated in the KIP this has the first use case of how many APM
>>> tools now
>>>>     > work.
>>>>     > I would find it impossible for any one to argue this is not
>>> important or a
>>>>     > niche market as it has its own gartner report for this space. Such
>>>>     > companies as Appdynamics, NewRelic, Dynatrace, Hawkular are but a
>>> few.
>>>>     >
>>>>     > Likewise these APM tools can help very rapidly track down issues
>>> and
>>>>     > automatically capture metrics, perform actions based on unexpected
>>> behavior
>>>>     > to auto recover services.
>>>>     >
>>>>     > Before mentioning looking at aggregated stats, in these cases where
>>>>     > actually on critical flows we cannot afford to have aggregated
>>> rolled up
>>>>     > stats only.
>>>>     >
>>>>     > With the APM tool we use its actually able to detect a single
>>> transaction
>>>>     > failure and capture the thread traces in the JVM where it failed
>>> and
>>>>     > everything for us, to the point it sends us alerts where we have
>>> this
>>>>     > giving the line number of the code that caused it, the transaction
>>> trace
>>>>     > through all the services and endpoints (supported) upto the point
>>> of
>>>>     > failure, it can also capture the data in and out (so we can
>>> replay).
>>>>     > Because atm Kafka doesn’t support us being able to stich in these
>>> tracing
>>>>     > transaction ids natively, we cannot get these benefits as such is
>>> limiting
>>>>     > our ability support apps and monitor them to the same standards we
>>> come to
>>>>     > expect when on a kafka flow.
>>>>     >
>>>>     > This actually ties in with Data Lineage, as the same tracing can
>>> be used
>>>>     > to back stich this. Essentially many times due to the sums of money
>>>>     > involved there are disputes, and typically as a financial
>>> institute the
>>>>     > easiest and cleanest way to prove when disputes arise is to
>>> present the
>>>>     > actual flow and processes involved in a transaction.
>>>>     >
>>>>     > Likewise as Hadoop matures its evident this case is important, as
>>> tools
>>>>     > such as Atlas (Hortonworks led) and Navigator (cloudera led) are
>>> evident
>>>>     > also I believe the importance here is very much NOT just a
>>> financial issue.
>>>>     >
>>>>     > From a MDM point of view any company wanting to care about Data
>>> Quality
>>>>     > and Data Governance - Data Lineage is a key piece in this puzzle.
>>>>     >
>>>>     >
>>>>     >
>>>>     > RE Mirroring,
>>>>     >
>>>>     > As per the KIP in-fact this is exactly what we do re cluster id,
>>> to mirror
>>>>     > a network of clusters between AZ’s / Regions. We know a
>>> transaction for a
>>>>     > key will be done within a  AZ/Region, as such we know the write to
>>> kafka
>>>>     > would be ordered per key. But we need eventual view of that across
>>> in our
>>>>     > other regions/az’s. When we have complete AZ or Region failure we
>>> know
>>>>     > there will be a brief interruption whilst those transactions are
>>> moved to
>>>>     > another region but we expect after it to continue.
>>>>     >
>>>>     > As mentioned having separate Partitions to do this starts to get
>>>>     > ugly/complicated for us:
>>>>     > how would I do compaction where a key is in two partitions?
>>>>     > How do we balance consumers so where multiple partitions with the
>>> same key
>>>>     > goto the same consumer
>>>>     > What do you do if cluster 1 has 5 partitions but cluster 20 has 10
>>> because
>>>>     > its larger kit in our more core DC’s, as such key to partition
>>> mappings for
>>>>     > consumers get even more complicated.
>>>>     > What do you do if we add or remove a complete region
>>>>     >
>>>>     > Where as simple mirror will work we just need to ensure we don’t
>>> have a
>>>>     > cycle which we can do with clusterId.
>>>>     >
>>>>     > We even have started to look at shortest path mirror routing based
>>> on
>>>>     > clusterId, if we also had the region and az info on the originating
>>>>     > message, this we have not implemented but some ideas come from
>>> network
>>>>     > routing, and also the dispatcher router in apache qpid.
>>>>     >
>>>>     > Also we need to have data perimeters e.g. certain data cannot leave
>>>>     > certain countries borders. We want this all automated so that at
>>> the
>>>>     > platform level without having to touch or look at the business
>>> data inside
>>>>     > we can have headers we can put tags into so that we can ensure
>>> this doesn’t
>>>>     > occur when we mirror. (actually links in to data lineage / tracing
>>> as again
>>>>     > we need to tag messages at a platform level) Examples are we are
>>> not
>>>>     > allowed Private customer details to leave Switzerland, yet we need
>>> those
>>>>     > systems integrated.
>>>>     >
>>>>     > Lastly around mirroring we have a partionKey field, as the key
>>> used for
>>>>     > partitioning logic != compaction key all the time but we want to
>>> preserve it
>>>>     > for when we mirror so that if source cluster partition count !=
>>> destination
>>>>     > cluster partition count we can honour the same partitioning logic.
>>>>     >
>>>>     >
>>>>     >
>>>>     > RE End 2 End encryption
>>>>     >
>>>>     > As I believe mentioned just before, the solution you mention just
>>> doesn’t
>>>>     > cut the mustard these days with many regulators. An operations
>>> person with
>>>>     > access to the box should not be able to have access to the data.
>>> Many now
>>>>     > actually impose quite literally the implementation expected being
>>> end2end
>>>>     > encryption for certain data (Singapore for us is one that I am
>>> most aware
>>>>     > of). In fact we’re even now needing encrypt the data and store the
>>> keys in
>>>>     > HSM modules.
>>>>     >
>>>>     > Likewise the performance penalty on encrypting decrypting as you
>>> produce
>>>>     > over wire, then again encrypt decrypt as the data is stored on the
>>> brokers
>>>>     > disks and back again, then again encrypted and decrypted back over
>>> the wire
>>>>     > each time for each consumer all adds up, ignoring this doubling
>>> with mirror
>>>>     > makers etc. simply encrypting the value once on write by the
>>> client and
>>>>     > again decrypting on consume by the consumer is far more
>>> performant, but
>>>>     > then the routing and platform meta data needs to be separate (thus
>>> headers)
>>>>     >
>>>>     >
>>>>     >
>>>>     > RE Auditing:
>>>>     >
>>>>     > Our Auditing needs are:
>>>>     > Producer Id,
>>>>     > Origin Cluster Id that message first produced into
>>>>     > Origin AZ – agreed we can derive this if we have cluster id, but
>>> it makes
>>>>     > resolving this for audit reporting a lot easier.
>>>>     > Origin Region – agreed we can derive this if we have cluster id,
>>> but it
>>>>     > makes resolving this for audit reporting a lot easier.
>>>>     > Unique Message Identification (this is not the same as transaction
>>>>     > tracing) – note offset and partition are not the same, as when we
>>> mirror or
>>>>     > have for what ever system failure duplicate send,
>>>>     > Custom Client wrapper version (where organizations have to wrap
>>> the kafka
>>>>     > client for added features) so we know what version of the wrapper
>>> is used
>>>>     > Producer IP address (in case of clients being in our vm/open stack
>>> infra
>>>>     > where they can move around, producer id will stay the same but
>>> this would
>>>>     > change)
>>>>     >
>>>>     >
>>>>     >
>>>>     > RE Once and only once delivery case
>>>>     >
>>>>     > Using the same Message UUID for auditing we can achieve this quite
>>> simply.
>>>>     >
>>>>     > As per how some other brokers do this (cough qpid, artemis)
>>> message uuid
>>>>     > are used to dedupe where message is sent and produced but the
>>> client didn’t
>>>>     > receive the ack, and therefore replays the send, by having a
>>> unique message
>>>>     > id per message, this can be filtered out, on consumers where
>>> message
>>>>     > delivery may occur twice for what ever reasons a message uuid can
>>> be used
>>>>     > to remove duplicates being delivered; likewise we can do this in
>>> the
>>>>     > mirrormakers so if we detect a dupe message we can avoid
>>> replicating it.
>>>>     >
>>>>     >
>>>>     >
>>>>     >
>>>>     > Cheers
>>>>     > Mike
>>>>     >
>>>>     >
>>>>     >
>>>>     > On 02/12/2016, 22:09, "Jun Rao" <ju...@confluent.io> wrote:
>>>>     >
>>>>     >     Since this KIP affects message format, wire protocol, apis, I
>>> think
>>>>     > it's
>>>>     >     worth spending a bit more time to nail down the concrete use
>>> cases. It
>>>>     >     would be bad if we add this feature, but when start
>>> implementing it
>>>>     > for say
>>>>     >     mirroring, we then realize that header is not the best
>>> approach.
>>>>     > Initially,
>>>>     >     I thought I was convinced of the use cases of headers and was
>>> trying to
>>>>     >     write down a few use cases to convince others. That's when I
>>> became
>>>>     > less
>>>>     >     certain. For me to be convinced, I just want to see two strong
>>> use
>>>>     > cases
>>>>     >     (instead of 10 maybe use cases) in the third-party space. The
>>> reason is
>>>>     >     that when we discussed the use cases within a company, often
>>> it ends
>>>>     > with
>>>>     >     "we can't force everyone to use this standard since we may
>>> have to
>>>>     >     integrate with third-party tools".
>>>>     >
>>>>     >     At present, I am not sure why headers are useful for things
>>> like
>>>>     > schemaId
>>>>     >     or encryption. In order to do anything useful to the value,
>>> one needs
>>>>     > to
>>>>     >     know the schemaId or how data is encrypted, but header is
>>> optional.
>>>>     > But, I
>>>>     >     can be convinced if someone (Radai, Sean, Todd?) provides more
>>> details
>>>>     > on
>>>>     >     the argument.
>>>>     >
>>>>     >     I am not very sure header is the best approach for mirroring
>>> either. If
>>>>     >     someone has thought about this more, I'd be happy to hear.
>>>>     >
>>>>     >     I can see the data lineage use case. I am just not sure how
>>> widely
>>>>     >     applicable this is. If someone familiar with this space can
>>> justify
>>>>     > this is
>>>>     >     a significant use case, say in the finance industry, this
>>> would be a
>>>>     > strong
>>>>     >     use case.
>>>>     >
>>>>     >     I can see the auditing use case. I am just not sure if a native
>>>>     > producer id
>>>>     >     solves that problem. If there are additional metadata that's
>>> worth
>>>>     >     collecting but not covered by the producer id, that would make
>>> this a
>>>>     >     strong use case.
>>>>     >
>>>>     >     Thanks,
>>>>     >
>>>>     >     Jun
>>>>     >
>>>>     >
>>>>     >     On Fri, Dec 2, 2016 at 1:41 PM, radai <
>>> radai.rosenblatt@gmail.com>
>>>>     > wrote:
>>>>     >
>>>>     >     > this KIP is about enabling headers, nothing more nothing
>>> less - so
>>>>     > no,
>>>>     >     > broker-side use of headers is not in the KIP scope.
>>>>     >     >
>>>>     >     > obviously though, once you have headers potential use cases
>>> could
>>>>     > include
>>>>     >     > broker-side header-aware interceptors (which would be the
>>> topic of
>>>>     > other
>>>>     >     > future KIPs).
>>>>     >     >
>>>>     >     > a trivially clear use case (to me) would be using such
>>> broker-side
>>>>     >     > interceptors to enforce compliance with organizational
>>> policies - it
>>>>     > would
>>>>     >     > make our SREs lives much easier if instead of retroactively
>>>>     > discovering
>>>>     >     > "rogue" topics/users those messages would have been rejected
>>>>     > up-front.
>>>>     >     >
>>>>     >     > the kafka broker code is lacking any such extensibility
>>> support
>>>>     > (beyond
>>>>     >     > maybe authorizer) which is why these use cases were left out
>>> of the
>>>>     > "case
>>>>     >     > for headers" doc - broker extensibility is a separate
>>> discussion.
>>>>     >     >
>>>>     >     > On Fri, Dec 2, 2016 at 12:59 PM, Gwen Shapira <
>>> gwen@confluent.io>
>>>>     > wrote:
>>>>     >     >
>>>>     >     > > Woah, I wasn't aware this is something we'll do. It wasn't
>>> in the
>>>>     > KIP,
>>>>     >     > > right?
>>>>     >     > >
>>>>     >     > > I guess we could do it the same way ACLs currently work.
>>>>     >     > > I had in mind something that will allow admins to apply
>>> rules to
>>>>     > the
>>>>     >     > > new create/delete/config topic APIs. So Todd can decide to
>>> reject
>>>>     >     > > "create topic" requests that ask for more than 40
>>> partitions, or
>>>>     >     > > require exactly 3 replicas, or no more than 50GB partition
>>> size,
>>>>     > etc.
>>>>     >     > >
>>>>     >     > > ACLs were added a bit ad-hoc, if we are planning to apply
>>> more
>>>>     > rules
>>>>     >     > > to requests (and I think we should), we may want a bit
>>> more generic
>>>>     >     > > design around that.
>>>>     >     > >
>>>>     >     > > On Fri, Dec 2, 2016 at 7:16 AM, radai <
>>> radai.rosenblatt@gmail.com>
>>>>     >     > wrote:
>>>>     >     > > > "wouldn't you be in the business of making sure everyone
>>> uses
>>>>     > them
>>>>     >     > > > properly?"
>>>>     >     > > >
>>>>     >     > > > thats where a broker-side plugin would come handy - any
>>> incoming
>>>>     >     > message
>>>>     >     > > > that does not conform to org policy (read - does not
>>> have the
>>>>     > proper
>>>>     >     > > > headers) gets thrown out (with an error returned to user)
>>>>     >     > > >
>>>>     >     > > > On Thu, Dec 1, 2016 at 8:44 PM, Todd Palino <
>>> tpalino@gmail.com>
>>>>     > wrote:
>>>>     >     > > >
>>>>     >     > > >> Come on, I’ve done at least 2 talks on this one :)
>>>>     >     > > >>
>>>>     >     > > >> Producing counts to a topic is part of it, but that’s
>>> only
>>>>     > part. So
>>>>     >     > you
>>>>     >     > > >> count you have 100 messages in topic A. When you mirror
>>> topic A
>>>>     > to
>>>>     >     > > another
>>>>     >     > > >> cluster, you have 99 messages. Where was your problem?
>>> Or
>>>>     > worse, you
>>>>     >     > > have
>>>>     >     > > >> 100 messages, but one producer duplicated messages and
>>> another
>>>>     > one
>>>>     >     > lost
>>>>     >     > > >> messages. You need details about where the message came
>>> from in
>>>>     > order
>>>>     >     > to
>>>>     >     > > >> pinpoint problems when they happen. Source producer
>>> info, where
>>>>     > it was
>>>>     >     > > >> produced into your infrastructure, and when it was
>>> produced.
>>>>     > This
>>>>     >     > > requires
>>>>     >     > > >> you to add the information to the message.
>>>>     >     > > >>
>>>>     >     > > >> And yes, you still need to maintain your clients. So
>>> maybe my
>>>>     > original
>>>>     >     > > >> example was not the best. My thoughts on not wanting to
>>> be
>>>>     > responsible
>>>>     >     > > for
>>>>     >     > > >> message formats stands, because that’s very much
>>> separate from
>>>>     > the
>>>>     >     > > client.
>>>>     >     > > >> As you know, we have our own internal client library
>>> that can
>>>>     > insert
>>>>     >     > the
>>>>     >     > > >> right headers, and right now inserts the right audit
>>>>     > information into
>>>>     >     > > the
>>>>     >     > > >> message fields. If they exist, and assuming the message
>>> is Avro
>>>>     >     > encoded.
>>>>     >     > > >> What if someone wants to use JSON instead for a good
>>> reason?
>>>>     > What if
>>>>     >     > > user X
>>>>     >     > > >> wants to encrypt messages, but user Y does not?
>>> Maintaining the
>>>>     > client
>>>>     >     > > >> library is still much easier than maintaining the
>>> message
>>>>     > formats.
>>>>     >     > > >>
>>>>     >     > > >> -Todd
>>>>     >     > > >>
>>>>     >     > > >>
>>>>     >     > > >> On Thu, Dec 1, 2016 at 6:21 PM, Gwen Shapira <
>>> gwen@confluent.io
>>>>     > >
>>>>     >     > wrote:
>>>>     >     > > >>
>>>>     >     > > >> > Based on your last sentence, consider me convinced :)
>>>>     >     > > >> >
>>>>     >     > > >> > I get why headers are critical for Mirroring (you
>>> need tags to
>>>>     >     > prevent
>>>>     >     > > >> > loops and sometimes to route messages to the correct
>>>>     > destination).
>>>>     >     > > >> > But why do you need headers to audit? We are auditing
>>> by
>>>>     > producing
>>>>     >     > > >> > counts to a side topic (and I was under the
>>> impression you do
>>>>     > the
>>>>     >     > > >> > same), so we never need to modify the message.
>>>>     >     > > >> >
>>>>     >     > > >> > Another thing - after we added headers, wouldn't you
>>> be in the
>>>>     >     > > >> > business of making sure everyone uses them properly?
>>> Making
>>>>     > sure
>>>>     >     > > >> > everyone includes the right headers you need, not
>>> using the
>>>>     > header
>>>>     >     > > >> > names you intend to use, etc. I don't think the
>>> "policing"
>>>>     > business
>>>>     >     > > >> > will ever go away.
>>>>     >     > > >> >
>>>>     >     > > >> > On Thu, Dec 1, 2016 at 5:25 PM, Todd Palino <
>>>>     > tpalino@gmail.com>
>>>>     >     > > wrote:
>>>>     >     > > >> > > Got it. As an ops guy, I'm not very happy with the
>>>>     > workaround.
>>>>     >     > Avro
>>>>     >     > > >> means
>>>>     >     > > >> > > that I have to be concerned with the format of the
>>> messages
>>>>     > in
>>>>     >     > > order to
>>>>     >     > > >> > run
>>>>     >     > > >> > > the infrastructure (audit, mirroring, etc.). That
>>> means
>>>>     > that I
>>>>     >     > have
>>>>     >     > > to
>>>>     >     > > >> > > handle the schemas, and I have to enforce rules
>>> about good
>>>>     >     > formats.
>>>>     >     > > >> This
>>>>     >     > > >> > is
>>>>     >     > > >> > > not something I want to be in the business of,
>>> because I
>>>>     > should be
>>>>     >     > > able
>>>>     >     > > >> > to
>>>>     >     > > >> > > run a service infrastructure without needing to be
>>> in the
>>>>     > weeds of
>>>>     >     > > >> > dealing
>>>>     >     > > >> > > with customer data formats.
>>>>     >     > > >> > >
>>>>     >     > > >> > > Trust me, a sizable portion of my support time is
>>> spent
>>>>     > dealing
>>>>     >     > with
>>>>     >     > > >> > schema
>>>>     >     > > >> > > issues. I really would like to get away from that.
>>> Maybe
>>>>     > I'd have
>>>>     >     > > more
>>>>     >     > > >> > time
>>>>     >     > > >> > > for other hobbies. Like writing. ;)
>>>>     >     > > >> > >
>>>>     >     > > >> > > -Todd
>>>>     >     > > >> > >
>>>>     >     > > >> > > On Thu, Dec 1, 2016 at 4:04 PM Gwen Shapira <
>>>>     > gwen@confluent.io>
>>>>     >     > > wrote:
>>>>     >     > > >> > >
>>>>     >     > > >> > >> I'm pretty satisfied with the current workarounds
>>> (Avro
>>>>     > container
>>>>     >     > > >> > >> format), so I'm not too excited about the extra
>>> work
>>>>     > required to
>>>>     >     > do
>>>>     >     > > >> > >> headers in Kafka. I absolutely don't mind it if
>>> you do
>>>>     > it...
>>>>     >     > > >> > >> I think the Apache convention for "good idea, but
>>> not
>>>>     > willing to
>>>>     >     > > put
>>>>     >     > > >> > >> any work toward it" is +0.5? anyway, that's what I
>>> was
>>>>     > trying to
>>>>     >     > > >> > >> convey :)
>>>>     >     > > >> > >>
>>>>     >     > > >> > >> On Thu, Dec 1, 2016 at 3:05 PM, Todd Palino <
>>>>     > tpalino@gmail.com>
>>>>     >     > > >> wrote:
>>>>     >     > > >> > >> > Well I guess my question for you, then, is what
>>> is
>>>>     > holding you
>>>>     >     > > back
>>>>     >     > > >> > from
>>>>     >     > > >> > >> > full support for headers? What’s the bit that
>>> you’re
>>>>     > missing
>>>>     >     > that
>>>>     >     > > >> has
>>>>     >     > > >> > you
>>>>     >     > > >> > >> > under a full +1?
>>>>     >     > > >> > >> >
>>>>     >     > > >> > >> > -Todd
>>>>     >     > > >> > >> >
>>>>     >     > > >> > >> >
>>>>     >     > > >> > >> > On Thu, Dec 1, 2016 at 1:59 PM, Gwen Shapira <
>>>>     >     > gwen@confluent.io>
>>>>     >     > > >> > wrote:
>>>>     >     > > >> > >> >
>>>>     >     > > >> > >> >> I know why people who support headers support
>>> them, and
>>>>     > I've
>>>>     >     > > seen
>>>>     >     > > >> > what
>>>>     >     > > >> > >> >> the discussion is like.
>>>>     >     > > >> > >> >>
>>>>     >     > > >> > >> >> This is why I'm asking people who are against
>>> headers
>>>>     >     > > (especially
>>>>     >     > > >> > >> >> committers) what will make them change their
>>> mind - so
>>>>     > we can
>>>>     >     > > get
>>>>     >     > > >> > this
>>>>     >     > > >> > >> >> part over one way or another.
>>>>     >     > > >> > >> >>
>>>>     >     > > >> > >> >> If I sound frustrated it is not at Radai, Jun
>>> or you
>>>>     > (Todd)...
>>>>     >     > > I am
>>>>     >     > > >> > >> >> just looking for something concrete we can do
>>> to move
>>>>     > the
>>>>     >     > > >> discussion
>>>>     >     > > >> > >> >> along to the yummy design details (which is the
>>>>     > argument I
>>>>     >     > > really
>>>>     >     > > >> am
>>>>     >     > > >> > >> >> looking forward to).
>>>>     >     > > >> > >> >>
>>>>     >     > > >> > >> >> On Thu, Dec 1, 2016 at 1:53 PM, Todd Palino <
>>>>     >     > tpalino@gmail.com>
>>>>     >     > > >> > wrote:
>>>>     >     > > >> > >> >> > So, Gwen, to your question (even though I’m
>>> not a
>>>>     >     > > committer)...
>>>>     >     > > >> > >> >> >
>>>>     >     > > >> > >> >> > I have always been a strong supporter of
>>> introducing
>>>>     > the
>>>>     >     > > concept
>>>>     >     > > >> > of an
>>>>     >     > > >> > >> >> > envelope to messages, which headers
>>> accomplish. The
>>>>     >     > message
>>>>     >     > > key
>>>>     >     > > >> > is
>>>>     >     > > >> > >> >> > already an example of a piece of envelope
>>>>     > information. By
>>>>     >     > > >> > providing a
>>>>     >     > > >> > >> >> means
>>>>     >     > > >> > >> >> > to do this within Kafka itself, and not
>>> relying on
>>>>     > use-case
>>>>     >     > > >> > specific
>>>>     >     > > >> > >> >> > implementations, you make it much easier for
>>>>     > components to
>>>>     >     > > >> > >> interoperate.
>>>>     >     > > >> > >> >> It
>>>>     >     > > >> > >> >> > simplifies development of all these things
>>> (message
>>>>     > routing,
>>>>     >     > > >> > auditing,
>>>>     >     > > >> > >> >> > encryption, etc.) because each one does not
>>> have to
>>>>     > reinvent
>>>>     >     > > the
>>>>     >     > > >> > >> wheel.
>>>>     >     > > >> > >> >> >
>>>>     >     > > >> > >> >> > It also makes it much easier from a client
>>> point of
>>>>     > view if
>>>>     >     > > the
>>>>     >     > > >> > >> headers
>>>>     >     > > >> > >> >> are
>>>>     >     > > >> > >> >> > defined as part of the protocol and/or
>>> message format
>>>>     > in
>>>>     >     > > general
>>>>     >     > > >> > >> because
>>>>     >     > > >> > >> >> > you can easily produce and consume messages
>>> without
>>>>     > having
>>>>     >     > to
>>>>     >     > > >> take
>>>>     >     > > >> > >> into
>>>>     >     > > >> > >> >> > account specific cases. For example, I want
>>> to route
>>>>     >     > messages,
>>>>     >     > > >> but
>>>>     >     > > >> > >> >> client A
>>>>     >     > > >> > >> >> > doesn’t support the way audit implemented
>>> headers, and
>>>>     >     > client
>>>>     >     > > B
>>>>     >     > > >> > >> doesn’t
>>>>     >     > > >> > >> >> > support the way encryption or routing
>>> implemented
>>>>     > headers,
>>>>     >     > so
>>>>     >     > > now
>>>>     >     > > >> > my
>>>>     >     > > >> > >> >> > application has to create some really fragile
>>> (my
>>>>     >     > autocorrect
>>>>     >     > > >> just
>>>>     >     > > >> > >> tried
>>>>     >     > > >> > >> >> to
>>>>     >     > > >> > >> >> > make that “tragic”, which is probably
>>> appropriate
>>>>     > too) code
>>>>     >     > to
>>>>     >     > > >> > strip
>>>>     >     > > >> > >> >> > everything off, rather than just consuming the
>>>>     > messages,
>>>>     >     > > picking
>>>>     >     > > >> > out
>>>>     >     > > >> > >> the
>>>>     >     > > >> > >> >> 1
>>>>     >     > > >> > >> >> > or 2 headers it’s interested in, and
>>> performing its
>>>>     >     > function.
>>>>     >     > > >> > >> >> >
>>>>     >     > > >> > >> >> > Honestly, this discussion has been going on
>>> for a
>>>>     > long time,
>>>>     >     > > and
>>>>     >     > > >> > it’s
>>>>     >     > > >> > >> >> > always “Oh, you came up with 2 use cases, and
>>> yeah,
>>>>     > those
>>>>     >     > use
>>>>     >     > > >> cases
>>>>     >     > > >> > >> are
>>>>     >     > > >> > >> >> > real things that someone would want to do.
>>> Here’s an
>>>>     >     > alternate
>>>>     >     > > >> way
>>>>     >     > > >> > to
>>>>     >     > > >> > >> >> > implement them so let’s not do headers.” If
>>> we have a
>>>>     > few
>>>>     >     > use
>>>>     >     > > >> cases
>>>>     >     > > >> > >> that
>>>>     >     > > >> > >> >> we
>>>>     >     > > >> > >> >> > actually came up with, you can be sure that
>>> over the
>>>>     > next
>>>>     >     > year
>>>>     >     > > >> > >> there’s a
>>>>     >     > > >> > >> >> > dozen others that we didn’t think of that
>>> someone
>>>>     > would like
>>>>     >     > > to
>>>>     >     > > >> > do. I
>>>>     >     > > >> > >> >> > really think it’s time to stop rehashing this
>>>>     > discussion and
>>>>     >     > > >> > instead
>>>>     >     > > >> > >> >> focus
>>>>     >     > > >> > >> >> > on a workable standard that we can adopt.
>>>>     >     > > >> > >> >> >
>>>>     >     > > >> > >> >> > -Todd
>>>>     >     > > >> > >> >> >
>>>>     >     > > >> > >> >> >
>>>>     >     > > >> > >> >> > On Thu, Dec 1, 2016 at 1:39 PM, Todd Palino <
>>>>     >     > > tpalino@gmail.com>
>>>>     >     > > >> > >> wrote:
>>>>     >     > > >> > >> >> >
>>>>     >     > > >> > >> >> >> C. per message encryption
>>>>     >     > > >> > >> >> >>> One drawback of this approach is that this
>>>>     > significantly
>>>>     >     > > reduces
>>>>     >     > > >> > the
>>>>     >     > > >> > >> >> >>> effectiveness of compression, which happens
>>> on a
>>>>     > set of
>>>>     >     > > >> > serialized
>>>>     >     > > >> > >> >> >>> messages. An alternative is to enable SSL
>>> for wire
>>>>     >     > > encryption
>>>>     >     > > >> and
>>>>     >     > > >> > >> rely
>>>>     >     > > >> > >> >> on
>>>>     >     > > >> > >> >> >>> the storage system (e.g. LUKS) for at rest
>>>>     > encryption.
>>>>     >     > > >> > >> >> >>
>>>>     >     > > >> > >> >> >>
>>>>     >     > > >> > >> >> >> Jun, this is not sufficient. While this does
>>> cover
>>>>     > the case
>>>>     >     > > of
>>>>     >     > > >> > >> removing
>>>>     >     > > >> > >> >> a
>>>>     >     > > >> > >> >> >> drive from the system, it will not satisfy
>>> most
>>>>     > compliance
>>>>     >     > > >> > >> requirements
>>>>     >     > > >> > >> >> for
>>>>     >     > > >> > >> >> >> encryption of data as whoever has access to
>>> the
>>>>     > broker
>>>>     >     > itself
>>>>     >     > > >> > still
>>>>     >     > > >> > >> has
>>>>     >     > > >> > >> >> >> access to the unencrypted data. For
>>> end-to-end
>>>>     > encryption
>>>>     >     > you
>>>>     >     > > >> > need to
>>>>     >     > > >> > >> >> >> encrypt at the producer, before it enters the
>>>>     > system, and
>>>>     >     > > >> decrypt
>>>>     >     > > >> > at
>>>>     >     > > >> > >> the
>>>>     >     > > >> > >> >> >> consumer, after it exits the system.
>>>>     >     > > >> > >> >> >>
>>>>     >     > > >> > >> >> >> -Todd
>>>>     >     > > >> > >> >> >>
>>>>     >     > > >> > >> >> >>
>>>>     >     > > >> > >> >> >> On Thu, Dec 1, 2016 at 1:03 PM, radai <
>>>>     >     > > >> radai.rosenblatt@gmail.com
>>>>     >     > > >> > >
>>>>     >     > > >> > >> >> wrote:
>>>>     >     > > >> > >> >> >>
>>>>     >     > > >> > >> >> >>> another big plus of headers in the protocol
>>> is that
>>>>     > it
>>>>     >     > would
>>>>     >     > > >> > enable
>>>>     >     > > >> > >> >> rapid
>>>>     >     > > >> > >> >> >>> iteration on ideas outside of core kafka
>>> and would
>>>>     > reduce
>>>>     >     > > the
>>>>     >     > > >> > >> number of
>>>>     >     > > >> > >> >> >>> future wire format changes required.
>>>>     >     > > >> > >> >> >>>
>>>>     >     > > >> > >> >> >>> a lot of what is currently a KIP represents
>>> use
>>>>     > cases that
>>>>     >     > > are
>>>>     >     > > >> > not
>>>>     >     > > >> > >> 100%
>>>>     >     > > >> > >> >> >>> relevant to all users, and some of them
>>> require
>>>>     > rather
>>>>     >     > > invasive
>>>>     >     > > >> > wire
>>>>     >     > > >> > >> >> >>> protocol changes. i think a good recent
>>> example of
>>>>     > this is
>>>>     >     > > >> > kip-98.
>>>>     >     > > >> > >> >> >>> tx-utilizing traffic is expected to be a
>>> very small
>>>>     >     > > fraction of
>>>>     >     > > >> > >> total
>>>>     >     > > >> > >> >> >>> traffic and yet the changes are invasive.
>>>>     >     > > >> > >> >> >>>
>>>>     >     > > >> > >> >> >>> every such wire format change translates
>>> into
>>>>     > painful and
>>>>     >     > > slow
>>>>     >     > > >> > >> >> adoption of
>>>>     >     > > >> > >> >> >>> new versions.
>>>>     >     > > >> > >> >> >>>
>>>>     >     > > >> > >> >> >>> i think a lot of functionality currently in
>>> KIPs
>>>>     > could be
>>>>     >     > > "spun
>>>>     >     > > >> > out"
>>>>     >     > > >> > >> >> and
>>>>     >     > > >> > >> >> >>> implemented as opt-in plugins transmitting
>>> data over
>>>>     >     > > headers.
>>>>     >     > > >> > this
>>>>     >     > > >> > >> >> would
>>>>     >     > > >> > >> >> >>> keep the core wire format stable(r), core
>>> codebase
>>>>     >     > smaller,
>>>>     >     > > and
>>>>     >     > > >> > >> avoid
>>>>     >     > > >> > >> >> the
>>>>     >     > > >> > >> >> >>> "burden of proof" thats sometimes required
>>> to prove
>>>>     > a
>>>>     >     > > certain
>>>>     >     > > >> > >> feature
>>>>     >     > > >> > >> >> is
>>>>     >     > > >> > >> >> >>> useful enough for a wide-enough audience to
>>> warrant
>>>>     > a wire
>>>>     >     > > >> format
>>>>     >     > > >> > >> >> change
>>>>     >     > > >> > >> >> >>> and code complexity additions.
>>>>     >     > > >> > >> >> >>>
>>>>     >     > > >> > >> >> >>> (to be clear - kip-98 goes beyond "mere"
>>> wire format
>>>>     >     > changes
>>>>     >     > > >> and
>>>>     >     > > >> > im
>>>>     >     > > >> > >> not
>>>>     >     > > >> > >> >> >>> saying it could have been completely done
>>> with
>>>>     > headers,
>>>>     >     > but
>>>>     >     > > >> > >> >> exactly-once
>>>>     >     > > >> > >> >> >>> delivery certainly could)
>>>>     >     > > >> > >> >> >>>
>>>>     >     > > >> > >> >> >>> On Thu, Dec 1, 2016 at 11:20 AM, Gwen
>>> Shapira <
>>>>     >     > > >> gwen@confluent.io
>>>>     >     > > >> > >
>>>>     >     > > >> > >> >> wrote:
>>>>     >     > > >> > >> >> >>>
>>>>     >     > > >> > >> >> >>> > On Thu, Dec 1, 2016 at 10:24 AM, radai <
>>>>     >     > > >> > >> radai.rosenblatt@gmail.com>
>>>>     >     > > >> > >> >> >>> wrote:
>>>>     >     > > >> > >> >> >>> > > "For use cases within an organization,
>>> one could
>>>>     >     > always
>>>>     >     > > use
>>>>     >     > > >> > >> other
>>>>     >     > > >> > >> >> >>> > > approaches such as company-wise
>>> containers"
>>>>     >     > > >> > >> >> >>> > > this is what linkedin has traditionally
>>> done
>>>>     > but there
>>>>     >     > > are
>>>>     >     > > >> > now
>>>>     >     > > >> > >> >> cases
>>>>     >     > > >> > >> >> >>> > (read
>>>>     >     > > >> > >> >> >>> > > - topics) where this is not acceptable.
>>> this
>>>>     > makes
>>>>     >     > > headers
>>>>     >     > > >> > >> useful
>>>>     >     > > >> > >> >> even
>>>>     >     > > >> > >> >> >>> > > within single orgs for cases where
>>>>     >     > > one-container-fits-all
>>>>     >     > > >> > cannot
>>>>     >     > > >> > >> >> >>> apply.
>>>>     >     > > >> > >> >> >>> > >
>>>>     >     > > >> > >> >> >>> > > as for the particular use cases listed,
>>> i dont
>>>>     > want
>>>>     >     > > this to
>>>>     >     > > >> > >> devolve
>>>>     >     > > >> > >> >> >>> to a
>>>>     >     > > >> > >> >> >>> > > discussion of particular use cases - i
>>> think its
>>>>     >     > enough
>>>>     >     > > >> that
>>>>     >     > > >> > >> some
>>>>     >     > > >> > >> >> of
>>>>     >     > > >> > >> >> >>> them
>>>>     >     > > >> > >> >> >>> >
>>>>     >     > > >> > >> >> >>> > I think a main point of contention is
>>> that: We
>>>>     >     > identified
>>>>     >     > > a few
>>>>     >     > > >> > >> >> >>> > use-cases where headers are useful, do we
>>> want
>>>>     > Kafka to
>>>>     >     > > be a
>>>>     >     > > >> > >> system
>>>>     >     > > >> > >> >> >>> > that supports those use-cases?
>>>>     >     > > >> > >> >> >>> >
>>>>     >     > > >> > >> >> >>> > For example, Jun said:
>>>>     >     > > >> > >> >> >>> > "Not sure how widely useful record-level
>>> lineage
>>>>     > is
>>>>     >     > though
>>>>     >     > > >> > since
>>>>     >     > > >> > >> the
>>>>     >     > > >> > >> >> >>> > overhead could
>>>>     >     > > >> > >> >> >>> > be significant."
>>>>     >     > > >> > >> >> >>> >
>>>>     >     > > >> > >> >> >>> > We know NiFi supports record level
>>> lineage. I
>>>>     > don't
>>>>     >     > think
>>>>     >     > > it
>>>>     >     > > >> > was
>>>>     >     > > >> > >> >> >>> > developed for lols, I think it is safe to
>>> assume
>>>>     > that
>>>>     >     > the
>>>>     >     > > NSA
>>>>     >     > > >> > >> needed
>>>>     >     > > >> > >> >> >>> > that functionality. We also know that
>>> certain
>>>>     > financial
>>>>     >     > > >> > institutions
>>>>     >     > > >> > >> >> >>> > need to track tampering with records at a
>>> record
>>>>     > level
>>>>     >     > and
>>>>     >     > > >> > there
>>>>     >     > > >> > >> are
>>>>     >     > > >> > >> >> >>> > federal regulations that absolutely
>>> require
>>>>     > this.  They
>>>>     >     > > also
>>>>     >     > > >> > need
>>>>     >     > > >> > >> to
>>>>     >     > > >> > >> >> >>> > prove that routing apps that "touch" the
>>>>     > messages and
>>>>     >     > > >> either
>>>>     >     > > >> > >> read
>>>>     >     > > >> > >> >> >>> > or update headers couldn't have possibly
>>>>     > modified the
>>>>     >     > > >> payload
>>>>     >     > > >> > >> >> itself.
>>>>     >     > > >> > >> >> >>> > They use record level encryption to do
>>> that -
>>>>     > apps can
>>>>     >     > > read
>>>>     >     > > >> and
>>>>     >     > > >> > >> >> >>> > (sometimes) modify headers but can't
>>> touch the
>>>>     > payload.
>>>>     >     > > >> > >> >> >>> >
>>>>     >     > > >> > >> >> >>> > We can totally say "those are corner
>>> cases and
>>>>     > not worth
>>>>     >     > > >> adding
>>>>     >     > > >> > >> >> >>> > headers to Kafka for", they should use a
>>> different
>>>>     >     > pubsub
>>>>     >     > > >> > message
>>>>     >     > > >> > >> for
>>>>     >     > > >> > >> >> >>> > that (NiFi or one of the other 1000 that
>>> cater
>>>>     >     > > specifically
>>>>     >     > > >> to
>>>>     >     > > >> > the
>>>>     >     > > >> > >> >> >>> > financial industry).
>>>>     >     > > >> > >> >> >>> >
>>>>     >     > > >> > >> >> >>> > But this gets us into a catch 22:
>>>>     >     > > >> > >> >> >>> > If we discuss a specific use-case,
>>> someone can
>>>>     > always
>>>>     >     > say
>>>>     >     > > it
>>>>     >     > > >> > isn't
>>>>     >     > > >> > >> >> >>> > interesting enough for Kafka. If we
>>> discuss more
>>>>     > general
>>>>     >     > > >> > trends,
>>>>     >     > > >> > >> >> >>> > others can say "well, we are not sure any
>>> of them
>>>>     > really
>>>>     >     > > >> needs
>>>>     >     > > >> > >> >> headers
>>>>     >     > > >> > >> >> >>> > specifically. This is just hand waving
>>> and not
>>>>     >     > > interesting.".
>>>>     >     > > >> > >> >> >>> >
>>>>     >     > > >> > >> >> >>> > I think discussing use-cases in specifics
>>> is super
>>>>     >     > > important
>>>>     >     > > >> to
>>>>     >     > > >> > >> >> decide
>>>>     >     > > >> > >> >> >>> > implementation details for headers (my
>>> use-cases
>>>>     > lean
>>>>     >     > > toward
>>>>     >     > > >> > >> >> numerical
>>>>     >     > > >> > >> >> >>> > keys with namespaces and object values,
>>> others
>>>>     > differ),
>>>>     >     > > but I
>>>>     >     > > >> > >> think
>>>>     >     > > >> > >> >> we
>>>>     >     > > >> > >> >> >>> > need to answer the general "Are we going
>>> to have
>>>>     >     > headers"
>>>>     >     > > >> > question
>>>>     >     > > >> > >> >> >>> > first.
>>>>     >     > > >> > >> >> >>> >
>>>>     >     > > >> > >> >> >>> > I'd love to hear from the other
>>> committers in the
>>>>     >     > > discussion:
>>>>     >     > > >> > >> >> >>> > What would it take to convince you that
>>> headers
>>>>     > in Kafka
>>>>     >     > > are
>>>>     >     > > >> a
>>>>     >     > > >> > >> good
>>>>     >     > > >> > >> >> >>> > idea in general, so we can move ahead and
>>> try to
>>>>     > agree
>>>>     >     > on
>>>>     >     > > the
>>>>     >     > > >> > >> >> details?
>>>>     >     > > >> > >> >> >>> >
>>>>     >     > > >> > >> >> >>> > I feel like we keep moving the goal posts
>>> and
>>>>     > this is
>>>>     >     > > truly
>>>>     >     > > >> > >> >> exhausting.
>>>>     >     > > >> > >> >> >>> >
>>>>     >     > > >> > >> >> >>> > For the record, I mildly support adding
>>> headers
>>>>     > to Kafka
>>>>     >     > > >> > (+0.5?).
>>>>     >     > > >> > >> >> >>> > The community can continue to find
>>> workarounds to
>>>>     > the
>>>>     >     > > issue
>>>>     >     > > >> and
>>>>     >     > > >> > >> there
>>>>     >     > > >> > >> >> >>> > are some benefits to keeping the message
>>> format
>>>>     > and
>>>>     >     > > clients
>>>>     >     > > >> > >> simpler.
>>>>     >     > > >> > >> >> >>> > But I see the usefulness of headers to
>>> many
>>>>     > use-cases
>>>>     >     > and
>>>>     >     > > if
>>>>     >     > > >> we
>>>>     >     > > >> > >> can
>>>>     >     > > >> > >> >> >>> > find a good and generally useful way to
>>> add it to
>>>>     > Kafka,
>>>>     >     > > it
>>>>     >     > > >> > will
>>>>     >     > > >> > >> make
>>>>     >     > > >> > >> >> >>> > Kafka easier to use for many - worthy
>>> goal in my
>>>>     > eyes.
>>>>     >     > > >> > >> >> >>> >
>>>>     >     > > >> > >> >> >>> > > are interesting/feasible, but:
>>>>     >     > > >> > >> >> >>> > > A+B. i think there are use cases for
>>> polyglot
>>>>     > topics.
>>>>     >     > > >> > >> especially if
>>>>     >     > > >> > >> >> >>> kafka
>>>>     >     > > >> > >> >> >>> > > is being used to "trunk" something else.
>>>>     >     > > >> > >> >> >>> > > D. multiple topics would make it harder
>>> to write
>>>>     >     > > portable
>>>>     >     > > >> > >> consumer
>>>>     >     > > >> > >> >> >>> code.
>>>>     >     > > >> > >> >> >>> > > partition remapping would mess with
>>> locality of
>>>>     >     > > consumption
>>>>     >     > > >> > >> >> >>> guarantees.
>>>>     >     > > >> > >> >> >>> > > E+F. a use case I see for
>>> lineage/metadata is
>>>>     >     > > >> > >> billing/chargeback.
>>>>     >     > > >> > >> >> for
>>>>     >     > > >> > >> >> >>> > that
>>>>     >     > > >> > >> >> >>> > > use case it is not enough to simply
>>> record the
>>>>     > point
>>>>     >     > of
>>>>     >     > > >> > origin,
>>>>     >     > > >> > >> but
>>>>     >     > > >> > >> >> >>> every
>>>>     >     > > >> > >> >> >>> > > replication stop (think mirror maker)
>>> must also
>>>>     > add a
>>>>     >     > > >> record
>>>>     >     > > >> > to
>>>>     >     > > >> > >> >> form a
>>>>     >     > > >> > >> >> >>> > > "transit log".
>>>>     >     > > >> > >> >> >>> > >
>>>>     >     > > >> > >> >> >>> > > as for stream processing on top of
>>> kafka - i
>>>>     > know
>>>>     >     > samza
>>>>     >     > > >> has a
>>>>     >     > > >> > >> >> metadata
>>>>     >     > > >> > >> >> >>> > map
>>>>     >     > > >> > >> >> >>> > > which they carry around in addition to
>>> user
>>>>     > values.
>>>>     >     > > headers
>>>>     >     > > >> > are
>>>>     >     > > >> > >> the
>>>>     >     > > >> > >> >> >>> > perfect
>>>>     >     > > >> > >> >> >>> > > fit for these things.
>>>>     >     > > >> > >> >> >>> > >
>>>>     >     > > >> > >> >> >>> > >
>>>>     >     > > >> > >> >> >>> > >
>>>>     >     > > >> > >> >> >>> > > On Wed, Nov 30, 2016 at 6:50 PM, Jun
>>> Rao <
>>>>     >     > > jun@confluent.io
>>>>     >     > > >> >
>>>>     >     > > >> > >> wrote:
>>>>     >     > > >> > >> >> >>> > >
>>>>     >     > > >> > >> >> >>> > >> Hi, Michael,
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> In order to answer the first two
>>> questions, it
>>>>     > would
>>>>     >     > be
>>>>     >     > > >> > helpful
>>>>     >     > > >> > >> >> if we
>>>>     >     > > >> > >> >> >>> > could
>>>>     >     > > >> > >> >> >>> > >> identify 1 or 2 strong use cases for
>>> headers
>>>>     > in the
>>>>     >     > > space
>>>>     >     > > >> > for
>>>>     >     > > >> > >> >> >>> > third-party
>>>>     >     > > >> > >> >> >>> > >> vendors. For use cases within an
>>> organization,
>>>>     > one
>>>>     >     > > could
>>>>     >     > > >> > always
>>>>     >     > > >> > >> >> use
>>>>     >     > > >> > >> >> >>> > other
>>>>     >     > > >> > >> >> >>> > >> approaches such as company-wise
>>> containers to
>>>>     > get
>>>>     >     > > around
>>>>     >     > > >> w/o
>>>>     >     > > >> > >> >> >>> headers. I
>>>>     >     > > >> > >> >> >>> > >> went through the use cases in the KIP
>>> and in
>>>>     > Radai's
>>>>     >     > > wiki
>>>>     >     > > >> (
>>>>     >     > > >> > >> >> >>> > >> https://cwiki.apache.org/confluence/display/KAFKA/A+Case+for+Kafka+Headers
>>>>     >     > > >> > >> >> >>> > >> ).
>>>>     >     > > >> > >> >> >>> > >> The following are the ones that I
>>>>     > understand and
>>>>     >     > > >> could
>>>>     >     > > >> > be
>>>>     >     > > >> > >> in
>>>>     >     > > >> > >> >> the
>>>>     >     > > >> > >> >> >>> > >> third-party use case category.
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> A. content-type
>>>>     >     > > >> > >> >> >>> > >> It seems that in general, content-type
>>> should
>>>>     > be set
>>>>     >     > at
>>>>     >     > > >> the
>>>>     >     > > >> > >> topic
>>>>     >     > > >> > >> >> >>> level.
>>>>     >     > > >> > >> >> >>> > >> Not sure if mixing messages with
>>> different
>>>>     > content
>>>>     >     > > types
>>>>     >     > > >> > >> should be
>>>>     >     > > >> > >> >> >>> > >> encouraged.
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> B. schema id
>>>>     >     > > >> > >> >> >>> > >> Since the value is mostly useless
>>> without
>>>>     > schema id,
>>>>     >     > it
>>>>     >     > > >> > seems
>>>>     >     > > >> > >> that
>>>>     >     > > >> > >> >> >>> > storing
>>>>     >     > > >> > >> >> >>> > >> the schema id together with serialized
>>> bytes
>>>>     > in the
>>>>     >     > > value
>>>>     >     > > >> is
>>>>     >     > > >> > >> >> better?
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> C. per message encryption
>>>>     >     > > >> > >> >> >>> > >> One drawback of this approach is that
>>> this
>>>>     >     > > significantly
>>>>     >     > > >> > reduces
>>>>     >     > > >> > >> >> the
>>>>     >     > > >> > >> >> >>> > >> effectiveness of compression, which
>>> happens on
>>>>     > a set
>>>>     >     > of
>>>>     >     > > >> > >> serialized
>>>>     >     > > >> > >> >> >>> > >> messages. An alternative is to enable
>>> SSL for
>>>>     > wire
>>>>     >     > > >> > encryption
>>>>     >     > > >> > >> and
>>>>     >     > > >> > >> >> >>> rely
>>>>     >     > > >> > >> >> >>> > on
>>>>     >     > > >> > >> >> >>> > >> the storage system (e.g. LUKS) for at
>>> rest
>>>>     >     > encryption.
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> D. cluster ID for mirroring across
>>> Kafka
>>>>     > clusters
>>>>     >     > > >> > >> >> >>> > >> This is actually interesting. Today,
>>> to avoid
>>>>     >     > > introducing
>>>>     >     > > >> > >> cycles
>>>>     >     > > >> > >> >> when
>>>>     >     > > >> > >> >> >>> > doing
>>>>     >     > > >> > >> >> >>> > >> mirroring across data centers, one
>>> would
>>>>     > either have
>>>>     >     > to
>>>>     >     > > >> set
>>>>     >     > > >> > up
>>>>     >     > > >> > >> two
>>>>     >     > > >> > >> >> >>> Kafka
>>>>     >     > > >> > >> >> >>> > >> clusters (a local and an aggregate)
>>> per data
>>>>     > center
>>>>     >     > or
>>>>     >     > > >> > rename
>>>>     >     > > >> > >> >> topics.
>>>>     >     > > >> > >> >> >>> > >> Neither is ideal. With headers, the
>>> producer
>>>>     > could
>>>>     >     > tag
>>>>     >     > > >> each
>>>>     >     > > >> > >> >> message
>>>>     >     > > >> > >> >> >>> with
>>>>     >     > > >> > >> >> >>> > >> the producing cluster ID in the header.
>>>>     > MirrorMaker
>>>>     >     > > could
>>>>     >     > > >> > then
>>>>     >     > > >> > >> >> avoid
>>>>     >     > > >> > >> >> >>> > >> mirroring messages to a cluster if
>>> they are
>>>>     > tagged
>>>>     >     > with
>>>>     >     > > >> the
>>>>     >     > > >> > >> same
>>>>     >     > > >> > >> >> >>> cluster
>>>>     >     > > >> > >> >> >>> > >> id.
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> However, an alternative approach is to
>>>>     > introduce something
>>>>     >     > > like
>>>>     >     > > >> > >> >> >>> hierarchical
>>>>     >     > > >> > >> >> >>> > >> topic and store messages from different
>>>>     > clusters in
>>>>     >     > > >> > different
>>>>     >     > > >> > >> >> >>> partitions
>>>>     >     > > >> > >> >> >>> > >> under the same topic. This approach
>>> avoids
>>>>     > filtering
>>>>     >     > > out
>>>>     >     > > >> > >> unneeded
>>>>     >     > > >> > >> >> >>> data
>>>>     >     > > >> > >> >> >>> > and
>>>>     >     > > >> > >> >> >>> > >> makes offset preserving easier to
>>> support. It
>>>>     > may
>>>>     >     > make
>>>>     >     > > >> > >> compaction
>>>>     >     > > >> > >> >> >>> > trickier
>>>>     >     > > >> > >> >> >>> > >> though since the same key may show up
>>> in
>>>>     > different
>>>>     >     > > >> > partitions.
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> E. record-level lineage
>>>>     >     > > >> > >> >> >>> > >> For example, a source connector could
>>> store in
>>>>     > the
>>>>     >     > > message
>>>>     >     > > >> > the
>>>>     >     > > >> > >> >> >>> metadata
>>>>     >     > > >> > >> >> >>> > >> (e.g. UUID) of the source record.
>>> Similarly,
>>>>     > if a
>>>>     >     > > stream
>>>>     >     > > >> job
>>>>     >     > > >> > >> >> >>> transforms
>>>>     >     > > >> > >> >> >>> > >> messages from topic A to topic B, the
>>> library
>>>>     > could
>>>>     >     > > >> include
>>>>     >     > > >> > the
>>>>     >     > > >> > >> >> >>> source
>>>>     >     > > >> > >> >> >>> > >> message offset in each of the
>>> transformed
>>>>     > messages in
>>>>     >     > > the
>>>>     >     > > >> > >> header.
>>>>     >     > > >> > >> >> Not
>>>>     >     > > >> > >> >> >>> > sure
>>
>>>>     >     > > >> > >> >> >>> > >> how widely useful record-level lineage
>>> is
>>>>     > though
>>>>     >     > since
>>>>     >     > > the
>>>>     >     > > >> > >> >> overhead
>>>>     >     > > >> > >> >> >>> > could
>>>>     >     > > >> > >> >> >>> > >> be significant.
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> F. auditing metadata
>>>>     >     > > >> > >> >> >>> > >> We could put things like
>>> clientId/host/user in
>>>>     > the
>>>>     >     > > header
>>>>     >     > > >> in
>>>>     >     > > >> > >> each
>>>>     >     > > >> > >> >> >>> > message
>>>>     >     > > >> > >> >> >>> > >> for auditing. These metadata are
>>> really at the
>>>>     >     > producer
>>>>     >     > > >> > level
>>>>     >     > > >> > >> >> though.
>>>>     >     > > >> > >> >> >>> > So, a
>>>>     >     > > >> > >> >> >>> > >> more efficient way is to only include a
>>>>     > "producerId"
>>>>     >     > > per
>>>>     >     > > >> > >> message
>>>>     >     > > >> > >> >> and
>>>>     >     > > >> > >> >> >>> > send
>>>>     >     > > >> > >> >> >>> > >> the producerId -> metadata mapping
>>>>     > independently.
>>>>     >     > > KIP-98
>>>>     >     > > >> is
>>>>     >     > > >> > >> >> actually
>>>>     >     > > >> > >> >> >>> > >> proposing including such a producerId
>>> natively
>>>>     > in the
>>>>     >     > > >> > message.
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> So, overall, I am not sure that I am fully
>>>>     > convinced of
>>>>     >     > > the
>>>>     >     > > >> > strong
>>>>     >     > > >> > >> >> >>> > third-party
>>>>     >     > > >> > >> >> >>> > >> use cases of headers yet. Perhaps we
>>> could
>>>>     > discuss a
>>>>     >     > > bit
>>>>     >     > > >> > more
>>>>     >     > > >> > >> to
>>>>     >     > > >> > >> >> make
>>>>     >     > > >> > >> >> >>> > one
>>>>     >     > > >> > >> >> >>> > >> or two really convincing use cases.
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> Another orthogonal question is
>>> whether header
>>>>     > should
>>>>     >     > > be
>>>>     >     > > >> > >> exposed
>>>>     >     > > >> > >> >> in
>>>>     >     > > >> > >> >> >>> > stream
>>>>     >     > > >> > >> >> >>> > >> processing systems such as Kafka Streams,
>>> Samza,
>>>>     > and
>>>>     >     > Spark
>>>>     >     > > >> > >> streaming.
>>>>     >     > > >> > >> >> >>> > >> Currently, those systems just deal with
>>>>     > key/value
>>>>     >     > > pairs.
>>>>     >     > > >> > >> Should we
>>>>     >     > > >> > >> >> >>> > expose a
>>>>     >     > > >> > >> >> >>> > >> third thing header there too or
>>> somehow map
>>>>     > header to
>>>>     >     > > key
>>>>     >     > > >> or
>>>>     >     > > >> > >> >> value?
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> Thanks,
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> Jun
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> On Tue, Nov 29, 2016 at 3:35 AM,
>>> Michael
>>>>     > Pearce <
>>>>     >     > > >> > >> >> >>> Michael.Pearce@ig.com>
>>>>     >     > > >> > >> >> >>> > >> wrote:
>>>>     >     > > >> > >> >> >>> > >>
>>>>     >     > > >> > >> >> >>> > >> > I assume, that after a period of a
>>> week,
>>>>     > that there
>>>>     >     > > are
>>>>     >     > > >> no
>>>>     >     > > >> > >> >> concerns
>>>>     >     > > >> > >> >> >>> now
>>>>     >     > > >> > >> >> >>> > >> > with points 1, and 2 and now we have
>>>>     > agreement that
>>>>     >     > > >> > headers
>>>>     >     > > >> > >> are
>>>>     >     > > >> > >> >> >>> useful
>>>>     >     > > >> > >> >> >>> > >> and
>>>>     >     > > >> > >> >> >>> > >> > needed in Kafka. As such if put to a
>>> KIP
>>>>     > vote, this
>>>>     >     > > >> > wouldn’t
>>>>     >     > > >> > >> be
>>>>     >     > > >> > >> >> a
>>>>     >     > > >> > >> >> >>> > reason
>>>>     >     > > >> > >> >> >>> > >> to
>>>>     >     > > >> > >> >> >>> > >> > reject.
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> > @Ignacio on point 4).
>>>>     >     > > >> > >> >> >>> > >> > I think for purpose of getting this
>>> KIP
>>>>     > moving past
>>>>     >     > > >> this,
>>>>     >     > > >> > we
>>>>     >     > > >> > >> can
>>>>     >     > > >> > >> >> >>> state
>>>>     >     > > >> > >> >> >>> > >> the
>>>>     >     > > >> > >> >> >>> > >> > key will be a 4-byte space that
>>> will be
>>>>     >     > > naturally
>>>>     >     > > >> > >> >> interpreted
>>>>     >     > > >> > >> >> >>> as
>>>>     >     > > >> > >> >> >>> > an
>>>>     >     > > >> > >> >> >>> > >> > Int32 (if namespacing is later
>>> wanted you can
>>>>     >     > easily
>>>>     >     > > >> split
>>>>     >     > > >> > >> this
>>>>     >     > > >> > >> >> >>> into
>>>>     >     > > >> > >> >> >>> > two
>>>>     >     > > >> > >> >> >>> > >> > int16 spaces), from the wire protocol
>>>>     >     > implementation
>>>>     >     > > >> this
>>>>     >     > > >> > >> makes
>>>>     >     > > >> > >> >> no
>>>>     >     > > >> > >> >> >>> > >> > difference I don’t believe. Is this
>>>>     > reasonable to
>>>>     >     > > all?
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> > On 5) as per point 4 therefore happy
>>> we keep
>>>>     > with 32
>>>>     >     > > >> bits.
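
The point made above, that a single 4-byte key can later be read as two int16 halves without any change on the wire, can be illustrated with a small sketch. This is only an illustration under stated assumptions: the names pack_key, unpack_key, and key_to_wire are hypothetical and not from KIP-82, and the key is treated as an unsigned big-endian 32-bit value for simplicity.

```python
import struct
from typing import Tuple


def pack_key(namespace: int, local_id: int) -> int:
    """Combine two 16-bit values into one 32-bit header key (hypothetical helper)."""
    if not (0 <= namespace <= 0xFFFF and 0 <= local_id <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (namespace << 16) | local_id


def unpack_key(key: int) -> Tuple[int, int]:
    """Interpret a 32-bit header key as (namespace, local_id) in a higher layer."""
    return (key >> 16) & 0xFFFF, key & 0xFFFF


def key_to_wire(key: int) -> bytes:
    """Serialize the key as 4 big-endian bytes (assumed byte order)."""
    return struct.pack(">I", key)


key = pack_key(1, 2)
# The 4 bytes are identical whether the key is written as one int32
# or as two consecutive int16 values.
same_bytes = key_to_wire(key) == struct.pack(">HH", 1, 2)
```

Matching on the key stays a plain integer equality check either way; only the layer above decides whether the two halves mean anything.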
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> > On 18/11/2016, 20:34, "
>>>>     > ignacio.solis@gmail.com on
>>>>     >     > > >> behalf
>>>>     >     > > >> > of
>>>>     >     > > >> > >> >> >>> Ignacio
>>>>     >     > > >> > >> >> >>> > >> > Solis" <ignacio.solis@gmail.com on
>>> behalf of
>>>>     >     > > >> > isolis@igso.net
>>>>     >     > > >> > >> >
>>>>     >     > > >> > >> >> >>> wrote:
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >     Summary:
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >     3) Yes - Header value as byte[]
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >     4a) Int,Int - No
>>>>     >     > > >> > >> >> >>> > >> >     4b) Int - Yes
>>>>     >     > > >> > >> >> >>> > >> >     4c) String - Reluctant maybe
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >     5) I believe the header system
>>> should
>>>>     > take a
>>>>     >     > > single
>>>>     >     > > >> > >> int.  I
>>>>     >     > > >> > >> >> >>> think
>>>>     >     > > >> > >> >> >>> > >> > 32bits is
>>>>     >     > > >> > >> >> >>> > >> >     a good size, if you want to
>>> interpret
>>>>     > this as
>>>>     >     > two
>>>>     >     > > >> 16bit
>>>>     >     > > >> > >> >> numbers
>>>>     >     > > >> > >> >> >>> in
>>>>     >     > > >> > >> >> >>> > the
>>>>     >     > > >> > >> >> >>> > >> > layer
>>>>     >     > > >> > >> >> >>> > >> >     above go right ahead.  If
>>> somebody wants
>>>>     > to
>>>>     >     > argue
>>>>     >     > > >> for
>>>>     >     > > >> > 16
>>>>     >     > > >> > >> >> bits
>>>>     >     > > >> > >> >> >>> or
>>>>     >     > > >> > >> >> >>> > 64
>>>>     >     > > >> > >> >> >>> > >> > bits of
>>>>     >     > > >> > >> >> >>> > >> >     header key space I would listen.
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >     Discussion:
>>>>     >     > > >> > >> >> >>> > >> >     Dividing the key space into
>>> sub_key_1 and
>>>>     >     > > sub_key_2
>>>>     >     > > >> > >> makes no
>>>>     >     > > >> > >> >> >>> > sense to
>>>>     >     > > >> > >> >> >>> > >> > me at
>>>>     >     > > >> > >> >> >>> > >> >     this layer.  Are we going to
>>> start
>>>>     > providing
>>>>     >     > > APIs to
>>>>     >     > > >> > get
>>>>     >     > > >> > >> all
>>>>     >     > > >> > >> >> >>> the
>>>>     >     > > >> > >> >> >>> > >> >     sub_key_1s? or all the
>>> sub_key_2s?  If
>>>>     > there is
>>>>     >     > > no
>>>>     >     > > >> > >> >> >>> distinguishing
>>>>     >     > > >> > >> >> >>> > >> > functions
>>>>     >     > > >> > >> >> >>> > >> >     that are applied to each one
>>> then they
>>>>     > should
>>>>     >     > be
>>>>     >     > > a
>>>>     >     > > >> > single
>>>>     >     > > >> > >> >> >>> value.
>>>>     >     > > >> > >> >> >>> > At
>>>>     >     > > >> > >> >> >>> > >> > this
>>>>     >     > > >> > >> >> >>> > >> >     layer all we're doing is
>>> equality.
>>>>     >     > > >> > >> >> >>> > >> >     If the above layer wants to
>>> interpret
>>>>     > this as
>>>>     >     > 2,
>>>>     >     > > 3
>>>>     >     > > >> or
>>>>     >     > > >> > >> more
>>>>     >     > > >> > >> >> >>> values
>>>>     >     > > >> > >> >> >>> > >> > that's a
>>>>     >     > > >> > >> >> >>> > >> >     different question.  I
>>> personally think
>>>>     > it's
>>>>     >     > all
>>>>     >     > > one
>>>>     >     > > >> > >> >> keyspace
>>>>     >     > > >> > >> >> >>> > that is
>>>>     >     > > >> > >> >> >>> > >> >     getting assigned using some
>>> structure,
>>>>     > but if
>>>>     >     > you
>>>>     >     > > >> > want to
>>>>     >     > > >> > >> >> >>> > sub-assign
>>>>     >     > > >> > >> >> >>> > >> > parts
>>>>     >     > > >> > >> >> >>> > >> >     of it then that's fine.
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >     The same discussion applies to
>>> strings.
>>>>     > If
>>>>     >     > > somebody
>>>>     >     > > >> > >> argued
>>>>     >     > > >> > >> >> for
>>>>     >     > > >> > >> >> >>> > >> > strings,
>>>>     >     > > >> > >> >> >>> > >> >     would we be arguing to divide the
>>>>     > strings with
>>>>     >     > > dots
>>>>     >     > > >> > ('.')
>>>>     >     > > >> > >> >> as a
>>>>     >     > > >> > >> >> >>> > >> > requirement?
>>>>     >     > > >> > >> >> >>> > >> >     Would we want them to give us the
>>>>     > different
>>>>     >     > name
>>>>     >     > > >> > segments
>>>>     >     > > >> > >> >> >>> > separately?
>>>>     >     > > >> > >> >> >>> > >> >     Would we be performing any
>>> actions on
>>>>     > this key
>>>>     >     > > other
>>>>     >     > > >> > than
>>>>     >     > > >> > >> >> >>> > matching?
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >     Nacho
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >     On Fri, Nov 18, 2016 at 9:30 AM,
>>> Michael
>>>>     >     > Pearce <
>>>>     >     > > >> > >> >> >>> > >> Michael.Pearce@ig.com
>>>>     >     > > >> > >> >> >>> > >> > >
>>>>     >     > > >> > >> >> >>> > >> >     wrote:
>>>>     >     > > >> > >> >> >>> > >> >
>>>>     >     > > >> > >> >> >>> > >> >     > #jay #jun any concerns on 1
>>> and 2
>>>>     > still?
>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>     >     > > >> > >> >> >>> > >> >     > @all
>>>>     >     > > >> > >> >> >>> > >> >     > To get this moving along a bit
>>> more
>>>>     > I'd also
>>>>     >     > > like
>>>>     >     > > >> to
>>>>     >     > > >> > >> ask
>>>>     >     > > >> > >> >> to
>>>>     >     > > >> > >> >> >>> get
>>>>     >     > > >> > >> >> >>> > >> > clarity on
>>>>     >     > > >> > >> >> >>> > >> >     > the below last points:
>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>     >     > > >> > >> >> >>> > >> >     > 3) I believe we're all roughly
>>> happy
>>>>     > with the
>>>>     >     > > >> header
>>>>     >     > > >> > >> value
>>>>     >     > > >> > >> >> >>> > being a
>>>>     >     > > >> > >> >> >>> > >> > byte[]?
>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>     >     > > >> > >> >> >>> > >> >     > 4) I believe consensus has
>>> been for a
>>>>     >     > > namespace
>>>>     >     > > >> > based
>>>>     >     > > >> > >> int
>>>>     >     > > >> > >> >> >>> > approach
>>>>     >     > > >> > >> >> >>> > >> >     > {int,int} for the key. Any
>>> objections
>>>>     > if this
>>>>     >     > > is
>>>>     >     > > >> > what
>>>>     >     > > >> > >> we
>>>>     >     > > >> > >> >> go
>>>>     >     > > >> > >> >> >>> > with?
>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>     >     > > >> > >> >> >>> > >> >     > 5) as we have if assumption in
>>> (4) is
>>>>     >     > correct,
>>>>     >     > > >> > >> {int,int}
>>>>     >     > > >> > >> >> >>> keys.
>>>>     >     > > >> > >> >> >>> > >> >     > Should both int's be int16 or
>>> int32?
>>>>     >     > > >> > >> >> >>> > >> >     > I'm for them being int16(2
>>> bytes) as
>>>>     > combined
>>>>     >     > > is
>>>>     >     > > >> > space
>>>>     >     > > >> > >> of
>>>>     >     > > >> > >> >> >>> > 4bytes as
>>>>     >     > > >> > >> >> >>> > >> > per
>>>>     >     > > >> > >> >> >>> > >> >     > original and gives plenty of
>>>>     > combinations for
>>>>     >     > > the
>>>>     >     > > >> > >> >> >>> foreseeable,
>>>>     >     > > >> > >> >> >>> > and
>>>>     >     > > >> > >> >> >>> > >> > keeps
>>>>     >     > > >> > >> >> >>> > >> >     > the overhead small.
>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>     >     > > >> > >> >> >>> > >> >     > Do we see any benefit in
>>> another kip
>>>>     > call to
>>>>     >     > > >> discuss
>>>>     >     > > >> > >> >> these at
>>>>     >     > > >> > >> >> >>> > all?
>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>     >     > > >> > >> >> >>> > >> >     > Cheers
>>>>     >     > > >> > >> >> >>> > >> >     > Mike
>>>>     >     > > >> > >> >> >>> > >> >     > ______________________________
>>>>     > __________
>>>>     >     > > >> > >> >> >>> > >> >     > From: K Burstev <
>>> k.burstev@yandex.com>
>>>>     >     > > >> > >> >> >>> > >> >     > Sent: Friday, November 18, 2016
>>>>     > 7:07:07 AM
>>>>     >     > > >> > >> >> >>> > >> >     > To: dev@kafka.apache.org
>>>>     >     > > >> > >> >> >>> > >> >     > Subject: Re: [DISCUSS] KIP-82
>>> - Add
>>>>     > Record
>>>>     >     > > Headers
>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>     >     > > >> > >> >> >>> > >> >     > For what it is worth also i
>>> agree. As
>>>>     > a user:
>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>     >     > > >> > >> >> >>> > >> >     >  1) Yes - Headers are
>>> worthwhile
>>>>     >     > > >> > >> >> >>> > >> >     >  2) Yes - Headers should be a
>>> top level
>>>>     >     > option
>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>     >     > > >> > >> >> >>> > >> >     > 14.11.2016, 21:15, "Ignacio
>>> Solis" <
>>>>     >     > > >> isolis@igso.net
>>>>     >     > > >> > >:
>>>>     >     > > >> > >> >> >>> > >> >     > > 1) Yes - Headers are
>>> worthwhile
>>>>     >     > > >> > >> >> >>> > >> >     > > 2) Yes - Headers should be a
>>> top
>>>>     > level
>>>>     >     > option
>>>>     >     > > >> > >> >> >>> > >> >     > >
>>>>     >     > > >> > >> >> >>> > >> >     > > On Mon, Nov 14, 2016 at 9:16
>>> AM,
>>>>     > Michael
>>>>     >     > > Pearce
>>>>     >     > > >> <
>>>>     >     > > >> > >> >> >>> > >> > Michael.Pearce@ig.com>
>>>>     >     > > >> > >> >> >>> > >> >     > > wrote:
>>>>     >     > > >> > >> >> >>> > >> >     > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Hi Roger,
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  The kip details/examples
>>> the
>>>>     > original
>>>>     >     > > proposal
>>>>     >     > > >> > for
>>>>     >     > > >> > >> key
>>>>     >     > > >> > >> >> >>> > spacing
>>>>     >     > > >> > >> >> >>> > >> ,
>>>>     >     > > >> > >> >> >>> > >> > not
>>>>     >     > > >> > >> >> >>> > >> >     > the
>>>>     >     > > >> > >> >> >>> > >> >     > >>  new mentioned as per
>>> discussion
>>>>     > namespace
>>>>     >     > > >> idea.
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  We will need to update the
>>> kip,
>>>>     > when we
>>>>     >     > get
>>>>     >     > > >> > >> agreement
>>>>     >     > > >> > >> >> >>> this
>>>>     >     > > >> > >> >> >>> > is a
>>>>     >     > > >> > >> >> >>> > >> > better
>>>>     >     > > >> > >> >> >>> > >> >     > >>  approach (which seems to
>>> be the
>>>>     > case if I
>>>>     >     > > have
>>>>     >     > > >> > >> >> understood
>>>>     >     > > >> > >> >> >>> > the
>>>>     >     > > >> > >> >> >>> > >> > general
>>>>     >     > > >> > >> >> >>> > >> >     > >>  feeling in the
>>> conversation)
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Re the variable ints, at
>>> very
>>>>     > early stage
>>>>     >     > > we
>>>>     >     > > >> did
>>>>     >     > > >> > >> think
>>>>     >     > > >> > >> >> >>> about
>>>>     >     > > >> > >> >> >>> > >> > this. I
>>>>     >     > > >> > >> >> >>> > >> >     > think
>>>>     >     > > >> > >> >> >>> > >> >     > >>  the added complexity for
>>> the
>>>>     > saving isn't
>>>>     >     > > >> worth
>>>>     >     > > >> > it.
>>>>     >     > > >> > >> >> I'd
>>>>     >     > > >> > >> >> >>> > rather
>>>>     >     > > >> > >> >> >>> > >> go
>>>>     >     > > >> > >> >> >>> > >> >     > with, if
>>>>     >     > > >> > >> >> >>> > >> >     > >>  we want to reduce
>>> overheads and
>>>>     > size
>>>>     >     > int16
>>>>     >     > > >> > (2bytes)
>>>>     >     > > >> > >> >> keys
>>>>     >     > > >> > >> >> >>> as
>>>>     >     > > >> > >> >> >>> > it
>>>>     >     > > >> > >> >> >>> > >> > keeps it
>>>>     >     > > >> > >> >> >>> > >> >     > >>  simple.
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  On the note of no headers,
>>> there
>>>>     > is as
>>>>     >     > per
>>>>     >     > > the
>>>>     >     > > >> > kip
>>>>     >     > > >> > >> as
>>>>     >     > > >> > >> >> we
>>>>     >     > > >> > >> >> >>> > use an
>>>>     >     > > >> > >> >> >>> > >> >     > attribute
>>>>     >     > > >> > >> >> >>> > >> >     > >>  bit to denote if headers
>>> are
>>>>     > present or
>>>>     >     > > not as
>>>>     >     > > >> > such
>>>>     >     > > >> > >> >> >>> > provides a
>>>>     >     > > >> > >> >> >>> > >> > zero
>>>>     >     > > >> > >> >> >>> > >> >     > >>  overhead currently if
>>> headers are
>>>>     > not
>>>>     >     > used.
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  I think as radai mentions
>>> would be
>>>>     > good
>>>>     >     > > first
>>>>     >     > > >> > if we
>>>>     >     > > >> > >> >> can
>>>>     >     > > >> > >> >> >>> get
>>>>     >     > > >> > >> >> >>> > >> > clarity if
>>>>     >     > > >> > >> >> >>> > >> >     > do
>>>>     >     > > >> > >> >> >>> > >> >     > >>  we now have general
>>> consensus that
>>>>     > (1)
>>>>     >     > > headers
>>>>     >     > > >> > are
>>>>     >     > > >> > >> >> >>> > worthwhile
>>>>     >     > > >> > >> >> >>> > >> and
>>>>     >     > > >> > >> >> >>> > >> >     > useful,
>>>>     >     > > >> > >> >> >>> > >> >     > >>  and (2) we want it as a
>>> top level
>>>>     > entity.
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Just to state the obvious i
>>>>     > believe (1)
>>>>     >     > > >> headers
>>>>     >     > > >> > are
>>>>     >     > > >> > >> >> >>> > worthwhile
>>>>     >     > > >> > >> >> >>> > >> > and (2)
>>>>     >     > > >> > >> >> >>> > >> >     > >>  agree as a top level
>>> entity.
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Cheers
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Mike
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>> ______________________________
>>>>     > __________
>>>>     >     > > >> > >> >> >>> > >> >     > >>  From: Roger Hoover <
>>>>     >     > roger.hoover@gmail.com
>>>>     >     > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Sent: Wednesday, November
>>> 9, 2016
>>>>     > 9:10:47
>>>>     >     > > PM
>>>>     >     > > >> > >> >> >>> > >> >     > >>  To: dev@kafka.apache.org
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Subject: Re: [DISCUSS]
>>> KIP-82 - Add
>>>>     >     > Record
>>>>     >     > > >> > Headers
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Sorry for going a little
>>> in the
>>>>     > weeds but
>>>>     >     > > >> thanks
>>>>     >     > > >> > >> for
>>>>     >     > > >> > >> >> the
>>>>     >     > > >> > >> >> >>> > >> replies
>>>>     >     > > >> > >> >> >>> > >> >     > regarding
>>>>     >     > > >> > >> >> >>> > >> >     > >>  varint.
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Agreed that a prefix and
>>> {int,
>>>>     > int} can
>>>>     >     > be
>>>>     >     > > the
>>>>     >     > > >> > >> same.
>>>>     >     > > >> > >> >> It
>>>>     >     > > >> > >> >> >>> > doesn't
>>>>     >     > > >> > >> >> >>> > >> > look
>>>>     >     > > >> > >> >> >>> > >> >     > like
>>>>     >     > > >> > >> >> >>> > >> >     > >>  that's what the KIP is
>>> saying in the
>>>>     > "Open"
>>>>     >     > > >> > section.
>>>>     >     > > >> > >> The
>>>>     >     > > >> > >> >> >>> > example
>>>>     >     > > >> > >> >> >>> > >> > shows
>>>>     >     > > >> > >> >> >>> > >> >     > >>  2100001
>>>>     >     > > >> > >> >> >>> > >> >     > >>  for New Relic and 210002
>>> for App
>>>>     > Dynamics
>>>>     >     > > >> > implying
>>>>     >     > > >> > >> >> that
>>>>     >     > > >> > >> >> >>> the
>>>>     >     > > >> > >> >> >>> > New
>>>>     >     > > >> > >> >> >>> > >> > Relic
>>>>     >     > > >> > >> >> >>> > >> >     > >>  organization will have
>>> only a
>>>>     > single
>>>>     >     > > header id
>>>>     >     > > >> > to
>>>>     >     > > >> > >> work
>>>>     >     > > >> > >> >> >>> > with. Or
>>>>     >     > > >> > >> >> >>> > >> > is
>>>>     >     > > >> > >> >> >>> > >> >     > 2100001
>>>>     >     > > >> > >> >> >>> > >> >     > >>  a prefix? The main point
>>> of a
>>>>     > namespace
>>>>     >     > or
>>>>     >     > > >> > prefix
>>>>     >     > > >> > >> is
>>>>     >     > > >> > >> >> to
>>>>     >     > > >> > >> >> >>> > reduce
>>>>     >     > > >> > >> >> >>> > >> > the
>>>>     >     > > >> > >> >> >>> > >> >     > >>  overhead of config mapping
>>> or
>>>>     >     > registration
>>>>     >     > > >> > >> depending
>>>>     >     > > >> > >> >> on
>>>>     >     > > >> > >> >> >>> how
>>>>     >     > > >> > >> >> >>> > >> >     > >>  namespaces/prefixes are
>>> managed.
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Would love to hear more
>>> feedback
>>>>     > on the
>>>>     >     > > >> > >> higher-level
>>>>     >     > > >> > >> >> >>> > questions
>>>>     >     > > >> > >> >> >>> > >> >     > though...
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Cheers,
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Roger
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  On Wed, Nov 9, 2016 at
>>> 11:38 AM,
>>>>     > radai <
>>>>     >     > > >> > >> >> >>> > >> > radai.rosenblatt@gmail.com>
>>>>     >     > > >> > >> >> >>> > >> >     > wrote:
>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > I think this discussion
>>> is
>>>>     > getting a
>>>>     >     > bit
>>>>     >     > > >> into
>>>>     >     > > >> > the
>>>>     >     > > >> > >> >> >>> weeds on
>>>>     >     > > >> > >> >> >>> > >> > technical
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > implementation details.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > I'd like to step back a
>>> minute
>>>>     > and try
>>>>     >     > > and
>>>>     >     > > >> > >> establish
>>>>     >     > > >> > >> >> >>> > where we
>>>>     >     > > >> > >> >> >>> > >> > are in
>>>>     >     > > >> > >> >> >>> > >> >     > the
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > larger picture:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > (re-wording nacho's last
>>>>     > paragraph)
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > 1. are we all in
>>> agreement that
>>>>     > headers
>>>>     >     > > are
>>>>     >     > > >> a
>>>>     >     > > >> > >> >> >>> worthwhile
>>>>     >     > > >> > >> >> >>> > and
>>>>     >     > > >> > >> >> >>> > >> > useful
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > addition to have? this
>>> was
>>>>     > contested
>>>>     >     > > early
>>>>     >     > > >> on
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > 2. are we all in
>>> agreement on
>>>>     > headers
>>>>     >     > as
>>>>     >     > > top
>>>>     >     > > >> > >> level
>>>>     >     > > >> > >> >> >>> entity
>>>>     >     > > >> > >> >> >>> > vs
>>>>     >     > > >> > >> >> >>> > >> > headers
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > squirreled-away in V?
>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > if there are still
>>> concerns
>>>>     > around
>>>>     >     > these
>>>>     >     > > #2
>>>>     >     > > >> > >> points
>>>>     >     > > >> > >> >> >>> (#jay?
>>>>     >     > > >> > >> >> >>> > >> > #jun?)?
>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > (and now back to our
>>> normal
>>>>     > programming
>>>>     >     > > ...)
>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > varints are nice. having
>>> said
>>>>     > that, its
>>>>     >     > > >> adding
>>>>     >     > > >> > >> >> >>> complexity
>>>>     >     > > >> > >> >> >>> > >> (see
>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>> https://github.com/addthis/
>>>>     >     > > >> > >> >> stream-lib/blob/master/src/
>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>> main/java/com/clearspring/
>>>>     >     > > >> > >> >> analytics/util/Varint.java
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > as 1st google result)
>>> and would
>>>>     > require
>>>>     >     > > >> anyone
>>>>     >     > > >> > >> >> writing
>>>>     >     > > >> > >> >> >>> > other
>>>>     >     > > >> > >> >> >>> > >> > clients
>>>>     >     > > >> > >> >> >>> > >> >     > (C?
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > Python? Go? Bash? ;-) )
>>> to
>>>>     >     > get/implement
>>>>     >     > > the
>>>>     >     > > >> > >> same,
>>>>     >     > > >> > >> >> and
>>>>     >     > > >> > >> >> >>> for
>>>>     >     > > >> > >> >> >>> > >> > relatively
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > little gain (int vs
>>> string is
>>>>     > order of
>>>>     >     > > >> > magnitude,
>>>>     >     > > >> > >> >> this
>>>>     >     > > >> > >> >> >>> > isnt).
>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > int namespacing vs {int,
>>> int}
>>>>     >     > namespacing
>>>>     >     > > >> are
>>>>     >     > > >> > >> >> basically
>>>>     >     > > >> > >> >> >>> > the
>>>>     >     > > >> > >> >> >>> > >> > same
>>>>     >     > > >> > >> >> >>> > >> >     > thing -
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > youre just namespacing
>>> an int64
>>>>     > and
>>>>     >     > > giving
>>>>     >     > > >> > people
>>>>     >     > > >> > >> >> whole
>>>>     >     > > >> > >> >> >>> > 2^32
>>>>     >     > > >> > >> >> >>> > >> > ranges
>>>>     >     > > >> > >> >> >>> > >> >     > at a
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > time. the part i like
>>> about this
>>>>     > is
>>>>     >     > > letting
>>>>     >     > > >> > >> people
>>>>     >     > > >> > >> >> >>> have a
>>>>     >     > > >> > >> >> >>> > >> large
>>>>     >     > > >> > >> >> >>> > >> >     > swath of
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > numbers with one
>>> registration so
>>>>     > they
>>>>     >     > > dont
>>>>     >     > > >> > have
>>>>     >     > > >> > >> to
>>>>     >     > > >> > >> >> come
>>>>     >     > > >> > >> >> >>> > back
>>>>     >     > > >> > >> >> >>> > >> > for
>>>>     >     > > >> > >> >> >>> > >> >     > every
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > single plugin/header
>>> they want to
>>>>     >     > > "reserve".
>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > On Wed, Nov 9, 2016 at
>>> 11:01 AM,
>>>>     > Roger
>>>>     >     > > >> Hoover
>>>>     >     > > >> > <
>>>>     >     > > >> > >> >> >>> > >> >     > roger.hoover@gmail.com>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > wrote:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > Since some of the
>>> debate has
>>>>     > been
>>>>     >     > about
>>>>     >     > > >> > >> overhead +
>>>>     >     > > >> > >> >> >>> > >> > performance, I'm
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > wondering if we have
>>>>     > considered a
>>>>     >     > > varint
>>>>     >     > > >> > >> encoding
>>>>     >     > > >> > >> >> (
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>> https://developers.google.com/
>>>>     >     > > >> > >> >> protocol-buffers/docs/
>>>>     >     > > >> > >> >> >>> > >> >     > encoding#varints)
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > for
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > the header length
>>> field (int32
>>>>     > in the
>>>>     >     > > >> > proposal)
>>>>     >     > > >> > >> >> and
>>>>     >     > > >> > >> >> >>> for
>>>>     >     > > >> > >> >> >>> > >> > header
>>>>     >     > > >> > >> >> >>> > >> >     > ids? If
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > you
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > don't use headers, the
>>>>     > overhead would
>>>>     >     > > be a
>>>>     >     > > >> > >> single
>>>>     >     > > >> > >> >> >>> byte
>>>>     >     > > >> > >> >> >>> > and
>>>>     >     > > >> > >> >> >>> > >> > for each
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > header
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > id < 128 would also
>>> need only a
>>>>     >     > single
>>>>     >     > > >> byte?
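
To make the varint question above concrete, here is a minimal sketch of the unsigned base-128 varint scheme described in the linked Protocol Buffers encoding page. It is not taken from the KIP or from any Kafka client; the function names encode_varint and decode_varint are hypothetical, and negative values are simply rejected rather than zigzag-encoded.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, least significant group first."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one varint starting at pos; return (value, position after it)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


# A zero length field and any header id below 128 each take one byte.
zero_len = len(encode_varint(0))
small_id_len = len(encode_varint(127))
first_two_byte_len = len(encode_varint(128))
```

Under this scheme a message with no headers would carry a one-byte length of zero, and ids from 0 to 127 would cost one byte each, which is the saving being asked about; the trade-off raised in the replies is that every client implementation would have to carry this logic.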
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > On Wed, Nov 9, 2016 at
>>> 6:43 AM,
>>>>     >     > radai <
>>>>     >     > > >> > >> >> >>> > >> > radai.rosenblatt@gmail.com>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > wrote:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > @magnus - and very
>>> dangerous
>>>>     > (youre
>>>>     >     > > >> > >> essentially
>>>>     >     > > >> > >> >> >>> > >> > downloading and
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > executing
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > arbitrary code off
>>> the
>>>>     > internet on
>>>>     >     > > your
>>>>     >     > > >> > >> servers
>>>>     >     > > >> > >> >> ...
>>>>     >     > > >> > >> >> >>> > bad
>>>>     >     > > >> > >> >> >>> > >> > idea
>>>>     >     > > >> > >> >> >>> > >> >     > without
>>>>     >     > > >> > >> >> >>> > >> >     > >>  a
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > sandbox, even with)
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > as for it being a
>>> purely
>>>>     >     > > administrative
>>>>     >     > > >> > task
>>>>     >     > > >> > >> - i
>>>>     >     > > >> > >> >> >>> > >> disagree.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > i wish it would,
>>> really,
>>>>     > because
>>>>     >     > > then my
>>>>     >     > > >> > >> earlier
>>>>     >     > > >> > >> >> >>> > point on
>>>>     >     > > >> > >> >> >>> > >> > the
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > complexity
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > of
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > the remapping
>>> process would
>>>>     > be
>>>>     >     > > invalid,
>>>>     >     > > >> > but
>>>>     >     > > >> > >> at
>>>>     >     > > >> > >> >> >>> > linkedin,
>>>>     >     > > >> > >> >> >>> > >> > for
>>>>     >     > > >> > >> >> >>> > >> >     > example,
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > we
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > (the team im in) run
>>> kafka
>>>>     > as a
>>>>     >     > > service.
>>>>     >     > > >> > we
>>>>     >     > > >> > >> dont
>>>>     >     > > >> > >> >> >>> > really
>>>>     >     > > >> > >> >> >>> > >> > know
>>>>     >     > > >> > >> >> >>> > >> >     > what our
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > users
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > (developing
>>> applications
>>>>     > that use
>>>>     >     > > kafka)
>>>>     >     > > >> > are
>>>>     >     > > >> > >> up
>>>>     >     > > >> > >> >> to
>>>>     >     > > >> > >> >> >>> at
>>>>     >     > > >> > >> >> >>> > any
>>>>     >     > > >> > >> >> >>> > >> > given
>>>>     >     > > >> > >> >> >>> > >> >     > >>  moment.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > it
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > is very possible
>>> (given the
>>>>     >     > > existence of
>>>>     >     > > >> > >> headers
>>>>     >     > > >> > >> >> >>> and a
>>>>     >     > > >> > >> >> >>> > >> >     > corresponding
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > plugin
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > ecosystem) for some
>>>>     > application to
>>>>     >     > > >> "equip"
>>>>     >     > > >> > >> their
>>>>     >     > > >> > >> >> >>> > >> producers
>>>>     >     > > >> > >> >> >>> > >> > and
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > consumers
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > with the required
>>> plugin
>>>>     > without us
>>>>     >     > > >> > knowing.
>>>>     >     > > >> > >> i
>>>>     >     > > >> > >> >> dont
>>>>     >     > > >> > >> >> >>> > mean
>>>>     >     > > >> > >> >> >>> > >> > to imply
>>>>     >     > > >> > >> >> >>> > >> >     > >>  thats
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > bad, i just want to
>>> make the
>>>>     > point
>>>>     >     > > that
>>>>     >     > > >> > its
>>>>     >     > > >> > >> not
>>>>     >     > > >> > >> >> as
>>>>     >     > > >> > >> >> >>> > simple
>>>>     >     > > >> > >> >> >>> > >> >     > keeping it
>>>>     >     > > >> > >> >> >>> > >> >     > >>  in
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > sync across a
>>> large-enough
>>>>     >     > > organization.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > On Wed, Nov 9, 2016
>>> at 6:17
>>>>     > AM,
>>>>     >     > > Magnus
>>>>     >     > > >> > >> Edenhill
>>>>     >     > > >> > >> >> <
>>>>     >     > > >> > >> >> >>> > >> >     > magnus@edenhill.se>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > wrote:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > I think there is a
>>> piece
>>>>     > missing
>>>>     >     > in
>>>>     >     > > >> the
>>>>     >     > > >> > >> >> Strings
>>>>     >     > > >> > >> >> >>> > >> > discussion,
>>>>     >     > > >> > >> >> >>> > >> >     > where
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > pro-Stringers
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > reason that by
>>> providing
>>>>     > unique
>>>>     >     > > string
>>>>     >     > > >> > >> >> >>> identifiers
>>>>     >     > > >> > >> >> >>> > for
>>>>     >     > > >> > >> >> >>> > >> > each
>>>>     >     > > >> > >> >> >>> > >> >     > header
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > everything will
>>> just
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > magically work for
>>> all
>>>>     > parts of
>>>>     >     > the
>>>>     >     > > >> > stream
>>>>     >     > > >> > >> >> >>> pipeline.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > But the strings
>>> dont mean
>>>>     >     > anything
>>>>     >     > > by
>>>>     >     > > >> > >> >> themselves,
>>>>     >     > > >> > >> >> >>> > and
>>>>     >     > > >> > >> >> >>> > >> > while we
>>>>     >     > > >> > >> >> >>> > >> >     > >>  could
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > probably envision
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > some auto plugin
>>> loader
>>>>     > that
>>>>     >     > > >> downloads,
>>>>     >     > > >> > >> >> compiles,
>>>>     >     > > >> > >> >> >>> > links
>>>>     >     > > >> > >> >> >>> > >> > and
>>>>     >     > > >> > >> >> >>> > >> >     > runs
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > plugins
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > on-demand
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > as soon as they're
>>> seen by
>>>>     > a
>>>>     >     > > >> consumer, I
>>>>     >     > > >> > >> dont
>>>>     >     > > >> > >> >> >>> really
>>>>     >     > > >> > >> >> >>> > >> see
>>>>     >     > > >> > >> >> >>> > >> > a
>>>>     >     > > >> > >> >> >>> > >> >     > use-case
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > for
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > something
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > so dynamic (and
>>> fragile) in
>>>>     >     > > practice.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > In the real world
>>> an
>>>>     > application
>>>>     >     > > will
>>>>     >     > > >> be
>>>>     >     > > >> > >> >> >>> configured
>>>>     >     > > >> > >> >> >>> > >> with
>>>>     >     > > >> > >> >> >>> > >> > a set
>>>>     >     > > >> > >> >> >>> > >> >     > of
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > plugins
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > to either add
>>> (producer)
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > or read (consumer)
>>> headers.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > This is an
>>> administrative
>>>>     > task
>>>>     >     > > based
>>>>     >     > > >> on
>>>>     >     > > >> > >> what
>>>>     >     > > >> > >> >> >>> > features a
>>>>     >     > > >> > >> >> >>> > >> > client
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > needs/provides and
>>> results
>>>>     > in
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > some sort of
>>> configuration
>>>>     > to
>>>>     >     > > enable
>>>>     >     > > >> and
>>>>     >     > > >> > >> >> >>> configure
>>>>     >     > > >> > >> >> >>> > the
>>>>     >     > > >> > >> >> >>> > >> > desired
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > plugins.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > Since this needs
>>> to be kept
>>>>     >     > > somewhat
>>>>     >     > > >> in
>>>>     >     > > >> > >> sync
>>>>     >     > > >> > >> >> >>> across
>>>>     >     > > >> > >> >> >>> > an
>>>>     >     > > >> > >> >> >>> > >> >     > organisation
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > (there
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > is no point in
>>> having
>>>>     > producers
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > add headers no
>>> consumers
>>>>     > will
>>>>     >     > read,
>>>>     >     > > >> and
>>>>     >     > > >> > >> vice
>>>>     >     > > >> > >> >> >>> versa),
>>>>     >     > > >> > >> >> >>> > >> the
>>>>     >     > > >> > >> >> >>> > >> > added
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > complexity
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > of assigning an id
>>>>     > namespace
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > for each plugin as
>>> it is
>>>>     > being
>>>>     >     > > >> > configured
>>>>     >     > > >> > >> >> should
>>>>     >     > > >> > >> >> >>> be
>>>>     >     > > >> > >> >> >>> > >> > tolerable.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > /Magnus
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > 2016-11-09 13:06
>>> GMT+01:00
>>>>     >     > Michael
>>>>     >     > > >> > Pearce <
>>>>     >     > > >> > >> >> >>> > >> >     > Michael.Pearce@ig.com>:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Just
>>> following/catching
>>>>     > up on
>>>>     >     > > what
>>>>     >     > > >> > seems
>>>>     >     > > >> > >> to
>>>>     >     > > >> > >> >> be
>>>>     >     > > >> > >> >> >>> an
>>>>     >     > > >> > >> >> >>> > >> > active
>>>>     >     > > >> > >> >> >>> > >> >     > night :)
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > @Radai sorry if
>>> it may
>>>>     > seem
>>>>     >     > > obvious
>>>>     >     > > >> > but
>>>>     >     > > >> > >> what
>>>>     >     > > >> > >> >> >>> does
>>>>     >     > > >> > >> >> >>> > MD
>>>>     >     > > >> > >> >> >>> > >> > stand
>>>>     >     > > >> > >> >> >>> > >> >     > for?
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > My take on
>>> String vs Int:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > I will state
>>> first I am
>>>>     > pro Int
>>>>     >     > > (16
>>>>     >     > > >> or
>>>>     >     > > >> > >> 32).
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > I do though
>>> playing
>>>>     > devils
>>>>     >     > > advocate
>>>>     >     > > >> > see a
>>>>     >     > > >> > >> >> big
>>>>     >     > > >> > >> >> >>> plus
>>>>     >     > > >> > >> >> >>> > >> > with the
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > argument
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > of
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > String keys,
>>> this is
>>>>     > around
>>>>     >     > > >> > integrating
>>>>     >     > > >> > >> >> into an
>>>>     >     > > >> > >> >> >>> > >> > existing
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > eco-system.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > As many other
>>> systems use
>>>>     >     > String
>>>>     >     > > >> based
>>>>     >     > > >> > >> >> headers
>>>>     >     > > >> > >> >> >>> > >> (Flume,
>>>>     >     > > >> > >> >> >>> > >> > JMS)
>>>>     >     > > >> > >> >> >>> > >> >     > it
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > makes
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > it
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > much easier for
>>> these to
>>>>     > be
>>>>     >     > > >> > >> >> >>> > incorporated/integrated
>>>>     >     > > >> > >> >> >>> > >> > into.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > How with Int
>>> based
>>>>     > headers
>>>>     >     > could
>>>>     >     > > we
>>>>     >     > > >> > >> provide
>>>>     >     > > >> > >> >> a
>>>>     >     > > >> > >> >> >>> > >> > way/guidance to
>>>>     >     > > >> > >> >> >>> > >> >     > >>  make
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > this
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > integration
>>> simple /
>>>>     > easy with
>>>>     >     > > >> > transition
>>>>     >     > > >> > >> >> flows
>>>>     >     > > >> > >> >> >>> > over
>>>>     >     > > >> > >> >> >>> > >> to
>>>>     >     > > >> > >> >> >>> > >> >     > kafka?
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > * tough luck
>>> buddy
>>>>     > you're on
>>>>     >     > your
>>>>     >     > > >> own
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > * simply hash
>>> the string
>>>>     > into
>>>>     >     > int
>>>>     >     > > >> code
>>>>     >     > > >> > >> and
>>>>     >     > > >> > >> >> hope
>>>>     >     > > >> > >> >> >>> > for
>>>>     >     > > >> > >> >> >>> > >> no
>>>>     >     > > >> > >> >> >>> > >> >     > collisions
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > (how
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > to
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > convert back
>>> though?)
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > * http2 style as
>>>>     > mentioned by
>>>>     >     > > nacho.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > cheers,
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Mike
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     > ______________________________
>>>>     >     > > >> > __________
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > From: radai <
>>>>     >     > > >> > radai.rosenblatt@gmail.com>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Sent: Wednesday,
>>>>     > November 9,
>>>>     >     > 2016
>>>>     >     > > >> > 8:12 AM
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > To:
>>> dev@kafka.apache.org
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Subject: Re:
>>> [DISCUSS]
>>>>     > KIP-82 -
>>>>     >     > > Add
>>>>     >     > > >> > >> Record
>>>>     >     > > >> > >> >> >>> Headers
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > thinking about
>>> it some
>>>>     > more,
>>>>     >     > the
>>>>     >     > > >> best
>>>>     >     > > >> > >> way to
>>>>     >     > > >> > >> >> >>> > transmit
>>>>     >     > > >> > >> >> >>> > >> > the
>>>>     >     > > >> > >> >> >>> > >> >     > header
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > remapping
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > data to
>>> consumers would
>>>>     > be to
>>>>     >     > > put it
>>>>     >     > > >> > in
>>>>     >     > > >> > >> the
>>>>     >     > > >> > >> >> MD
>>>>     >     > > >> > >> >> >>> > >> response
>>>>     >     > > >> > >> >> >>> > >> >     > payload,
>>>>     >     > > >> > >> >> >>> > >> >     > >>  so
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > maybe
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > it should be
>>> discussed
>>>>     > now.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > On Wed, Nov 9,
>>> 2016 at
>>>>     > 12:09
>>>>     >     > AM,
>>>>     >     > > >> > radai <
>>>>     >     > > >> > >> >> >>> > >> >     > >>  radai.rosenblatt@gmail.com
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > wrote:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > im not opposed
>>> to the
>>>>     > idea of
>>>>     >     > > >> > namespace
>>>>     >     > > >> > >> >> >>> mapping.
>>>>     >     > > >> > >> >> >>> > >> all
>>>>     >     > > >> > >> >> >>> > >> > im
>>>>     >     > > >> > >> >> >>> > >> >     > saying
>>>>     >     > > >> > >> >> >>> > >> >     > >>  is
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > that
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > its
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > not part of
>>> the "mvp"
>>>>     > and,
>>>>     >     > > since
>>>>     >     > > >> it
>>>>     >     > > >> > >> >> requires
>>>>     >     > > >> > >> >> >>> no
>>>>     >     > > >> > >> >> >>> > >> wire
>>>>     >     > > >> > >> >> >>> > >> > format
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > change,
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > can
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > always be
>>> added later.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > also, its not
>>> as
>>>>     > simple as
>>>>     >     > just
>>>>     >     > > >> > >> >> configuring
>>>>     >     > > >> > >> >> >>> MM
>>>>     >     > > >> > >> >> >>> > to
>>>>     >     > > >> > >> >> >>> > >> do
>>>>     >     > > >> > >> >> >>> > >> > the
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > transform:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > lets
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > say i've
>>> implemented
>>>>     > large
>>>>     >     > > message
>>>>     >     > > >> > >> >> support as
>>>>     >     > > >> > >> >> >>> > >> > {666,1} and
>>>>     >     > > >> > >> >> >>> > >> >     > on
>>>>     >     > > >> > >> >> >>> > >> >     > >>  some
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > mirror
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > target cluster
>>> its been
>>>>     >     > > remapped
>>>>     >     > > >> to
>>>>     >     > > >> > >> >> {999,1}.
>>>>     >     > > >> > >> >> >>> the
>>>>     >     > > >> > >> >> >>> > >> > consumer
>>>>     >     > > >> > >> >> >>> > >> >     > >>  plugin
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > code
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > would
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > also need to
>>> be told
>>>>     > to look
>>>>     >     > > for
>>>>     >     > > >> the
>>>>     >     > > >> > >> large
>>>>     >     > > >> > >> >> >>> > message
>>>>     >     > > >> > >> >> >>> > >> > "part X
>>>>     >     > > >> > >> >> >>> > >> >     > of
>>>>     >     > > >> > >> >> >>> > >> >     > >>  Y"
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > header
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > under {999,1}.
>>> doable,
>>>>     > but
>>>>     >     > > tricky.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > On Tue, Nov 8,
>>> 2016 at
>>>>     > 10:29
>>>>     >     > > PM,
>>>>     >     > > >> > Gwen
>>>>     >     > > >> > >> >> >>> Shapira <
>>>>     >     > > >> > >> >> >>> > >> >     > >>  gwen@confluent.io
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > wrote:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> While you can
>>> do
>>>>     > whatever
>>>>     >     > you
>>>>     >     > > >> want
>>>>     >     > > >> > >> with a
>>>>     >     > > >> > >> >> >>> > >> namespace
>>>>     >     > > >> > >> >> >>> > >> > and
>>>>     >     > > >> > >> >> >>> > >> >     > your
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > code,
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> what I'd
>>> expect is
>>>>     > for each
>>>>     >     > > app
>>>>     >     > > >> to make
>>>>     >     > > >> > >> >> >>> namespaces
>>>>     >     > > >> > >> >> >>> > >> >     > configurable...
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> So if I
>>> accidentally
>>>>     > used
>>>>     >     > 666
>>>>     >     > > for
>>>>     >     > > >> > my
>>>>     >     > > >> > >> HR
>>>>     >     > > >> > >> >> >>> > >> department,
>>>>     >     > > >> > >> >> >>> > >> > and
>>>>     >     > > >> > >> >> >>> > >> >     > still
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > want
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > to
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> run RadaiApp,
>>> I can
>>>>     > config
>>>>     >     > > >> > >> "namespace=42"
>>>>     >     > > >> > >> >> >>> for
>>>>     >     > > >> > >> >> >>> > >> > RadaiApp and
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > everything
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> will look
>>> normal.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> This means
>>> you only
>>>>     > need to
>>>>     >     > > sync
>>>>     >     > > >> > usage
>>>>     >     > > >> > >> >> >>> inside
>>>>     >     > > >> > >> >> >>> > your
>>>>     >     > > >> > >> >> >>> > >> > own
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > organization.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> Still hard,
>>> but
>>>>     > somewhat
>>>>     >     > > easier
>>>>     >     > > >> > than
>>>>     >     > > >> > >> >> syncing
>>>>     >     > > >> > >> >> >>> > with
>>>>     >     > > >> > >> >> >>> > >> > the
>>>>     >     > > >> > >> >> >>> > >> >     > entire
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > world.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> On Tue, Nov
>>> 8, 2016
>>>>     > at 10:07
>>>>     >     > > PM,
>>>>     >     > > >> > >> radai <
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>> radai.rosenblatt@gmail.com>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> wrote:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > and we can
>>> start
>>>>     > with
>>>>     >     > > >> {namespace,
>>>>     >     > > >> > >> id}
>>>>     >     > > >> > >> >> and
>>>>     >     > > >> > >> >> >>> no
>>>>     >     > > >> > >> >> >>> > >> > re-mapping
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > support
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > and
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> always
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > add it
>>> later on
>>>>     > if/when
>>>>     >     > > >> > collisions
>>>>     >     > > >> > >> >> >>> actually
>>>>     >     > > >> > >> >> >>> > >> > happen (i
>>>>     >     > > >> > >> >> >>> > >> >     > dont
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > think
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > they'd
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> be
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > a problem).
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > every
>>> interested
>>>>     > party (so
>>>>     >     > > orgs
>>>>     >     > > >> > or
>>>>     >     > > >> > >> >> >>> > individuals)
>>>>     >     > > >> > >> >> >>> > >> > could
>>>>     >     > > >> > >> >> >>> > >> >     > then
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > register
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > a
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > prefix (0 =
>>>>     > reserved, 1 =
>>>>     >     > > >> > confluent
>>>>     >     > > >> > >> ...
>>>>     >     > > >> > >> >> >>> 666
>>>>     >     > > >> > >> >> >>> > = me
>>>>     >     > > >> > >> >> >>> > >> > :-) )
>>>>     >     > > >> > >> >> >>> > >> >     > and
>>>>     >     > > >> > >> >> >>> > >> >     > >>  do
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > whatever
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> with
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > the 2nd ID
>>> - so once
>>>>     >     > > linkedin
>>>>     >     > > >> > >> >> registers,
>>>>     >     > > >> > >> >> >>> say
>>>>     >     > > >> > >> >> >>> > 3,
>>>>     >     > > >> > >> >> >>> > >> > then
>>>>     >     > > >> > >> >> >>> > >> >     > >>  linkedin
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > devs
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > are
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> free
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > to use {3,
>>> *} with a
>>>>     >     > > reasonable
>>>>     >     > > >> > >> >> >>> expectation
>>>>     >     > > >> > >> >> >>> > to
>>>>     >     > > >> > >> >> >>> > >> not
>>>>     >     > > >> > >> >> >>> > >> >     > collide
>>>>     >     > > >> > >> >> >>> > >> >     > >>  with
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > anything
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > else.
>>> further
>>>>     > partitioning
>>>>     >     > > of
>>>>     >     > > >> > that *
>>>>     >     > > >> > >> >> >>> becomes
>>>>     >     > > >> > >> >> >>> > >> > linkedin's
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > problem,
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > but
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > the
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > "upstream
>>>>     > registration"
>>>>     >     > of a
>>>>     >     > > >> > >> namespace
>>>>     >     > > >> > >> >> >>> only
>>>>     >     > > >> > >> >> >>> > has
>>>>     >     > > >> > >> >> >>> > >> to
>>>>     >     > > >> > >> >> >>> > >> >     > happen
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > once.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > On Tue, Nov
>>> 8, 2016
>>>>     > at
>>>>     >     > 9:03
>>>>     >     > > PM,
>>>>     >     > > >> > >> James
>>>>     >     > > >> > >> >> >>> Cheng <
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > wushujames@gmail.com
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> wrote:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > On Nov
>>> 8, 2016,
>>>>     > at 5:54
>>>>     >     > > PM,
>>>>     >     > > >> > Gwen
>>>>     >     > > >> > >> >> >>> Shapira <
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > gwen@confluent.io>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > wrote:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > Thank
>>> you so
>>>>     > much for
>>>>     >     > > this
>>>>     >     > > >> > clear
>>>>     >     > > >> > >> and
>>>>     >     > > >> > >> >> >>> fair
>>>>     >     > > >> > >> >> >>> > >> > summary of
>>>>     >     > > >> > >> >> >>> > >> >     > the
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > arguments.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > I'm in
>>> favor of
>>>>     > ints.
>>>>     >     > > Not a
>>>>     >     > > >> > >> >> >>> deal-breaker,
>>>>     >     > > >> > >> >> >>> > but
>>>>     >     > > >> > >> >> >>> > >> > in
>>>>     >     > > >> > >> >> >>> > >> >     > favor.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > Even
>>> more in
>>>>     > favor of
>>>>     >     > > >> Magnus's
>>>>     >     > > >> > >> >> >>> > decentralized
>>>>     >     > > >> > >> >> >>> > >> >     > suggestion
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > with
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Roger's
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > tweak:
>>> add a
>>>>     > namespace
>>>>     >     > > for
>>>>     >     > > >> > >> headers.
>>>>     >     > > >> > >> >> >>> This
>>>>     >     > > >> > >> >> >>> > will
>>>>     >     > > >> > >> >> >>> > >> > allow
>>>>     >     > > >> > >> >> >>> > >> >     > each
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > app
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > to
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > just
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > use
>>> whatever IDs
>>>>     > it
>>>>     >     > wants
>>>>     >     > > >> > >> >> internally,
>>>>     >     > > >> > >> >> >>> and
>>>>     >     > > >> > >> >> >>> > >> then
>>>>     >     > > >> > >> >> >>> > >> > let
>>>>     >     > > >> > >> >> >>> > >> >     > the
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > admin
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> deploying
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > the app
>>> figure
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > out an available namespace ID for the app to live in. So
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > io.confluent.schema-registry can be namespace 0x01 on my deployment
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > and 0x57 on yours, and the poor guys developing the app don't need
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > to worry about that.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> Gwen, if I understand your example right, an application deployer
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> might decide to use 0x01 in one deployment, and that means that once
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> the message is written into the broker, it will be saved on the
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> broker with that specific namespace (0x01).
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> If you were to mirror that message into another cluster, the 0x01
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> would accompany the message, right? What if the deployers of the same
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> app in the other cluster uses 0x57? They won't understand each other?
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> I'm not sure that's an avoidable problem. I think it simply means that
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> in order to share data, you have to also have a shared (agreed upon)
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> understanding of what the namespaces mean. Which I think makes sense,
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> because the alternate (sharing *nothing* at all) would mean that there
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> would be no way to understand each other.
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> -James
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > Gwen
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > On Tue, Nov 8, 2016 at 4:23 PM, radai <radai.rosenblatt@gmail.com> wrote:
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> +1 for sean's document. it covers pretty much all the trade-offs and
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> provides concrete figures to argue about :-)
>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> (nit-picking - used the same xkcd twice, also trove has been superceded
>>>>     >     > > >> > >> >
>>>>     >     > > >> >
>>>>     >     > > >> >
>>>>     >     > > >> >
>>>>     >     > > >> > --
>>>>     >     > > >> > Gwen Shapira
>>>>     >     > > >> > Product Manager | Confluent
>>>>     >     > > >> > 650.450.2760 | @gwenshap
>>>>     >     > > >> > Follow us: Twitter | blog
>>>>     >     > > >> >
>>>>     >     > > >>
>>>>     >     > > >>
>>>>     >     > > >>
>>>>     >     > > >> --
>>>>     >     > > >> *Todd Palino*
>>>>     >     > > >> Staff Site Reliability Engineer
>>>>     >     > > >> Data Infrastructure Streaming
>>>>     >     > > >>
>>>>     >     > > >>
>>>>     >     > > >>
>>>>     >     > > >> linkedin.com/in/toddpalino
>>>>     >     > > >>
>>>>     >     > >
>>>>     >     > >
>>>>     >     > >
>>>>     >     > > --
>>>>     >     > > Gwen Shapira
>>>>     >     > > Product Manager | Confluent
>>>>     >     > > 650.450.2760 | @gwenshap
>>>>     >     > > Follow us: Twitter | blog
>>>>     >     > >
>>>>     >     >
>>>>     >
>>>>     >
>>>>     > The information contained in this email is strictly confidential
>>> and for
>>>>     > the use of the addressee only, unless otherwise indicated. If you
>>> are not
>>>>     > the intended recipient, please do not read, copy, use or disclose
>>> to others
>>>>     > this message or any attachment. Please also notify the sender by
>>> replying
>>>>     > to this email or by telephone (+44(020 7896 0011) and then delete
>>> the email
>>>>     > and any copies of it. Opinions, conclusion (etc) that do not
>>> relate to the
>>>>     > official business of this company shall be understood as neither
>>> given nor
>>>>     > endorsed by it. IG is a trading name of IG Markets Limited (a
>>> company
>>>>     > registered in England and Wales, company number 04008957) and IG
>>> Index
>>>>     > Limited (a company registered in England and Wales, company number
>>>>     > 01190902). Registered address at Cannon Bridge House, 25 Dowgate
>>> Hill,
>>>>     > London EC4R 2YA. Both IG Markets Limited (register number 195355)
>>> and IG
>>>>     > Index Limited (register number 114059) are authorised and
>>> regulated by the
>>>>     > Financial Conduct Authority.
>>>>     >
>>>>
>>>>
>>>>
>>>
>>>
>>
>



-- 
Nacho - Ignacio Solis - isolis@igso.net

Re: [DISCUSS] Control Messages - [Was: KIP-82 - Add Record Headers]

Posted by "Matthias J. Sax" <ma...@confluent.io>.

I agree with all. Just want to elaborate a few things:

3. There are two different use cases:
   (a) the one you describe -- I want to shut down NOW and don't want to
wait -- I agree with your observations etc.
   (b) we intentionally want to "drain" the stream processing topology
before shutting down -- yes, if I have a lot of intermediate data this
might take some time, but I want/need a clean shutdown like this

Case 3(b) is currently not possible and is exactly what we need for the
"Incremental Batch KIP" -- there are other use cases for 3(b), too.
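
To make the difference between 3(a) and 3(b) concrete, here is a minimal
Java sketch. It is not Kafka Streams code: the class name, the marker
value and the stopNow flag are all hypothetical, and the intermediate
topic is modelled as a plain iterator of strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a downstream subtopology; not Kafka Streams code.
final class DownstreamSubtopology {
    // Assumed in-band marker value; the thread does not define a concrete one.
    static final String SHUTDOWN_MARKER = "__shutdown_marker__";

    // Processes records in order and returns the ones that were processed.
    // stopNow models an out-of-band "stop immediately" signal (case 3(a)).
    static List<String> run(Iterator<String> intermediateTopic, BooleanSupplier stopNow) {
        List<String> processed = new ArrayList<>();
        while (intermediateTopic.hasNext()) {
            if (stopNow.getAsBoolean()) {
                break; // 3(a): abort without looking at the remaining data
            }
            String record = intermediateTopic.next();
            if (SHUTDOWN_MARKER.equals(record)) {
                break; // 3(b): everything written before the marker was drained
            }
            processed.add(record);
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> topic = Arrays.asList("a", "b", SHUTDOWN_MARKER, "c");
        System.out.println("drain: " + run(topic.iterator(), () -> false));
        System.out.println("abort: " + run(topic.iterator(), () -> true));
    }
}
```

In the drain case the in-band marker guarantees that every record the
upstream wrote before stopping has been processed; in the abort case the
signal has to arrive outside the data stream, which is the out-of-band
channel discussed in point 3.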


4. The point about "it's just a client thing" is true, but it should work
for clients that are not aware of the messages, too. I.e., we need an
opt-in mechanism -- so some changes are required -- not to the brokers
though -- but it cannot be done "external" to the clients -- otherwise
people would need to change their client code.
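
As a rough illustration of such an opt-in mechanism, here is a minimal
Java sketch. Everything in it is an assumption for illustration only:
SketchRecord is a stand-in for a consumed record (not the real
ConsumerRecord), and the header key "control.type" and the optIn()
method are hypothetical names, not a proposed API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a consumed record; not the real Kafka ConsumerRecord.
final class SketchRecord {
    final String value;
    final Map<String, String> headers;

    SketchRecord(String value, Map<String, String> headers) {
        this.value = value;
        this.headers = headers;
    }
}

// Hypothetical filter a consumer client could apply before poll() returns a batch.
final class ControlMessageFilter {
    // Assumed reserved header key; the thread does not define a concrete name.
    static final String CONTROL_KEY = "control.type";

    private final Set<String> optedIn = new HashSet<>();

    // An application "signs up" for one kind of control message.
    void optIn(String controlType) {
        optedIn.add(controlType);
    }

    // Keeps regular records and only those control messages the app opted in to.
    List<SketchRecord> filter(List<SketchRecord> fetched) {
        List<SketchRecord> visible = new ArrayList<>();
        for (SketchRecord record : fetched) {
            String type = record.headers.get(CONTROL_KEY);
            if (type == null || optedIn.contains(type)) {
                visible.add(record);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<SketchRecord> batch = Arrays.asList(
                new SketchRecord("data-1", Collections.emptyMap()),
                new SketchRecord("marker", Collections.singletonMap(CONTROL_KEY, "shutdown")));
        ControlMessageFilter filter = new ControlMessageFilter();
        System.out.println("unaware app sees " + filter.filter(batch).size() + " record(s)");
        filter.optIn("shutdown");
        System.out.println("opted-in app sees " + filter.filter(batch).size() + " record(s)");
    }
}
```

The point of the sketch is that the default path drops control messages,
so applications that know nothing about them keep working unchanged,
while the filtering itself lives inside the client rather than in
application code or on the broker.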



About "embedded control message" vs "extra control message stream":
IMHO, there are use cases for both, and both approaches complement each
other (they are not conflicting).


-Matthias



On 12/14/16 8:36 PM, Ignacio Solis wrote:
> I'm renaming this thread in case we start deep diving.
> 
> I'm in favor of so called "control messages", at least the notion of
> those.  However, I'm not sure about the design.
> 
> What I understood from the original mail:
> 
> A. Provide a message that does not get returned by poll()
> B. Provide a way for applications to consume these messages (sign up?)
> C. Control messages would be associated with a topic.
> D. Control messages should be _in_ the topic.
> 
> 
> 
> 1. The first thing to point out is that this can be done with headers.
> I assume that's why you sent it on the header thread. As you state, if
> we had headers, you would not require a separate KIP.  So, in a way,
> you're trying to provide a concrete use case for headers.  I wanted to
> separate the discussion to a separate thread mostly because while I
> like the idea, and I like the fact that it can be done by headers,
> people might want to discuss alternatives.
> 
> 2. I'm also assuming that you're intentionally trying to preserve
> order. Headers could do this natively of course. You could also
> achieve this with the separate topic given identifiers, sequence
> numbers, headers, etc.  However...
> 
> 3. There are a few use cases where ordering is important but
> out-of-band is even more important. We have a few large workloads
> where this is of interest to us.  Obviously we can achieve this with a
> separate topic, but having a control channel for a topic that can send
> high priority data would be interesting.   And yes, we would learn a
> lot from the TCP experiences with the urgent pointer (
> https://tools.ietf.org/html/rfc6093 ) and other out-of-band
> communication techniques.
> 
> You have an example of a "shutdown marker".  This works ok as a
> terminator, however, it is not very fast.  If I have 4 TB of data
> because of asynchronous processing, then a shutdown marker at the end
> of the 4TB is not as useful as having an out-of-band message that will
> tell me immediately that those 4TB should not be processed.   So, from
> this perspective, I prefer to have a separate topic and not embed
> control messages with the data.
> 
> If the messages are part of the data, or associated to specific data,
> then they should be in the data. If they are about process, we need an
> out-of-band mechanism.
> 
> 
> 4. The general feeling I have gotten from a few people on the list is:
> Why not just do this above the kafka clients?  After all, you could
> have a system to ignore certain schemas.
> 
> Effectively, if we had headers, it would be done from a client
> perspective, without the need to modify anything major.
> 
> If we wanted to do it with a separate topic, that could also be done
> without any broker changes. But you could imagine wanting some broker
> changes if the broker understands that 2 streams are tied together
> then it may make decisions based on that.  This would be similar to
> the handling of file system forks (
> https://en.wikipedia.org/wiki/Fork_(file_system) )
> 
> 
> 5. Also heard on discussions about headers: we don't know if this is
> generally useful. Maybe only a couple of institutions?  It may not be
> worth it to modify the whole stack for that.
> 
> I would again say that with headers you could pull it off easily, even
> if only a subset of clients/applications wanted to use it.
> 
> 
> So, in summary. I like the idea.  I see benefits in implementing it
> through headers, but I also see benefits of having it as a separate
> stream.  I'm not too in favor of having a separate message handling
> pipeline for the same topic though.
> 
> Nacho
> 
> 
> 
> 
> 
> On Wed, Dec 14, 2016 at 9:51 AM, Matthias J. Sax <ma...@confluent.io> wrote:
>> Yes and no. I did overload the term "control message".
>>
>> EOS control messages are for client-broker communication and thus never
>> exposed to any application. And I think this is a good design because
>> broker needs to understand those control messages. Thus, this should be
>> a protocol change.
>>
>> The type of control messages I have in mind are for client-client
>> (application-application) communication and the broker is agnostic to
>> them. Thus, it should not be a protocol change.
>>
>>
>> -Matthias
>>
>>
>>
>> On 12/14/16 9:42 AM, radai wrote:
>>> aren't control messages getting pushed as their own top level protocol
>>> change (and a fairly massive one) for the transactions KIP ?
>>>
>>> On Tue, Dec 13, 2016 at 5:54 PM, Matthias J. Sax <ma...@confluent.io>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I want to add a completely new angle to this discussion. For this, I
>>>> want to propose an extension for the headers feature that enables new
>>>> use cases -- and those new use cases might convince people to support
>>>> headers (of course including the larger scoped proposal).
>>>>
>>>> Extended Proposal:
>>>>
>>>> Allow messages with a certain header key to be special "control
>>>> messages" (w/ or w/o payload) that are not exposed to an application via
>>>> .poll().
>>>>
>>>> Thus, a consumer client would automatically skip over those messages. If
>>>> an application knows about embedded control messages, it can "sign up"
>>>> for those messages with the consumer client and either get a callback or
>>>> have the consumer's auto-drop for these messages disabled (allowing it to
>>>> consume those messages via poll()).
>>>>
>>>> (The details need further considerations/discussion. I just want to
>>>> sketch the main idea.)
>>>>
>>>> Usage:
>>>>
>>>> There is a shared topic (ie, used by multiple applications) and a
>>>> producer application wants to embed a special message in the topic for a
>>>> dedicated consumer application. Because only one application will
>>>> understand this message, it cannot be a regular message as this would
>>>> break all applications that do not understand this message. The producer
>>>> application would set a special metadata key and no consumer application
>>>> would see this control message by default because they did not enable
>>>> their consumer client to return this message in poll() (and the client
>>>> would just drop this message with special metadata key). Only the single
>>>> application that should receive this message, will subscribe to this
>>>> message on its consumer client and process it.
>>>>
>>>>
>>>> Concrete Use Case: Kafka Streams
>>>>
>>>> In Kafka Streams, we would like to propagate "control messages" from
>>>> subtopology to subtopology. There are multiple scenarios for which this
>>>> would be useful. For example, currently we do not guarantee a
>>>> "consistent shutdown" of an application. By this, I mean that input
>>>> records might not be completely processed by the whole topology because
>>>> the application shutdown happens "in between" and an intermediate result
>>>> gets "stuck" in an intermediate topic. Thus, a user would see a
>>>> committed offset of the source topic of the application, but no
>>>> corresponding result record in the output topic.
>>>>
>>>> Having "shutdown markers" would allow us to first stop the upstream
>>>> subtopology and write this marker into the intermediate topic; the
>>>> downstream subtopology would only shut itself down after it sees the
>>>> "shutdown marker". Thus, we can guarantee on shutdown that no
>>>> "in-flight" messages get stuck in intermediate topics.
>>>>
>>>>
>>>> A similar usage would be for KIP-95 (Incremental Batch Processing).
>>>> There was a discussion about the proposed metadata topic, and we could
>>>> avoid this metadata topic if we would have "control messages".
>>>>
>>>>
>>>> Right now, we cannot insert an "application control message" because
>>>> Kafka Streams does not own all topics it reads/writes and thus might
>>>> break other consumer applications (as described above) if we inject
>>>> random messages that are not understood by other apps.
>>>>
>>>>
>>>> Of course, one can work around "embedded control messages" by using an
>>>> additional topic to propagate control messages between applications (as
>>>> suggested in KIP-95 via a metadata topic for Kafka Streams). But there
>>>> are major concerns about adding this metadata topic in the KIP and this
>>>> shows that other applications that need a similar pattern might profit
>>>> from topic-embedded "control messages", too.
>>>>
>>>>
>>>> One last important consideration: those "control messages" are used for
>>>> client to client communication and are not understood by the broker.
>>>> Thus, those messages should not be enabled within the message format
>>>> (c.f. tombstone flag -- KIP-87). However, "client land" record headers
>>>> would be a nice way to implement them. Because KIP-82 did consider key
>>>> namespaces for metadata keys, this extension should not be its own KIP
>>>> but should be included in KIP-82 to reserve a namespace for "control
>>>> messages" in the first place.
>>>>
>>>>
>>>> Sorry for the long email... Looking forward to your feedback.
>>>>
>>>>
>>>> -Matthias
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 12/8/16 12:12 AM, Michael Pearce wrote:
>>>>> Hi Jun
>>>>>
>>>>> 100) Each time a transaction exits a JVM for a remote system (HTTP/JMS/
>>>>> hopefully one day Kafka), the APM tools stitch in a unique id (though I
>>>>> believe it contains the end2end uuid embedded in this id). On receiving the
>>>>> message at the receiving JVM, the APM code takes this out and continues its
>>>>> tracing on that new thread. Both JVMs (and other languages the APM
>>>>> tool supports) send this data async back to the central controllers, where
>>>>> the stitching together occurs. For this they need some header space
>>>>> to put this id in.
>>>>>
>>>>> 101) Yes indeed, we have a business transaction Id in the payload, though
>>>>> this is system-level tracing that we need to marry up. Also, as per the
>>>>> note on end2end encryption, we’d be unable to prove the flow if the payload
>>>>> is encrypted, as we’d not have access to it at certain points of the flow
>>>>> through the infrastructure/platform.
>>>>>
>>>>>
>>>>> 103) As said we use this mechanism in IG very successfully, as stated
>>>> per key we guarantee the transaction producing app to handle the
>>>> transaction of a key at one DC unless at point of critical failure where we
>>>> have to flip processing to another. We care about key ordering.
>>>>> I disagree on the offset comment for the partition solution unless you
>>>> do full ISR, or expensive full XA transactions even with partitions you
>>>> cannot fully guarantee offsets would match.
>>>>>
>>>>> 105) Very much so, I need to have access at the platform level to the
>>>> other meta data all mentioned, without having to need to have access to the
>>>> encryption keys of the payload.
>>>>>
>>>>> 106)
>>>>> Technically yes for AZ/Region/Cluster, but then we’d need to have a
>>>>> global producerId register, which would be very hard to enforce/ensure is
>>>>> current and correct, just to understand the origin region/az/cluster
>>>>> of a message for routing.
>>>>> The client wrapper version, producerId can be the same, as obviously the
>>>> producer could upgrade its wrapper, as such we need to know what wrapper
>>>> version the message is created with.
>>>>> Likewise the IP address, as stated we can have our producer move, where
>>>> its IP would change.
>>>>>
>>>>> 107)
>>>>> UUID is set on the message by interceptors before actual producer
>>>> transport send. This is for platform level message dedupe guarantee, the
>>>> business payload should be agnostic to this. Please see
>>>> https://activemq.apache.org/artemis/docs/1.5.0/duplicate-detection.html
>>>> note this is not touching business payloads.
>>>>>
>>>>>
>>>>>
>>>>> On 06/12/2016, 18:22, "Jun Rao" <ju...@confluent.io> wrote:
>>>>>
>>>>>     Hi, Michael,
>>>>>
>>>>>     Thanks for the reply. I find it very helpful.
>>>>>
>>>>>     Data lineage:
>>>>>     100. I'd like to understand the APM use case a bit more. It sounds
>>>> like
>>>>>     that those APM plugins can generate a transaction id that we could
>>>>>     potentially put in the header of every message. How would you
>>>> typically
>>>>>     make use of such transaction ids? Are there other metadata
>>>> associated with
>>>>>     the transaction id and if so, how are they propagated downstream?
>>>>>
>>>>>     101. For the finance use case, if the concept of transaction is
>>>> important,
>>>>>     wouldn't it be typically included in the message payload instead of
>>>> as an
>>>>>     optional header field?
>>>>>
>>>>>     102. The data lineage that Atlas and Navigator support seems to be
>>>> at the
>>>>>     dataset level, not per record level? So, not sure if per message
>>>> headers
>>>>>     are relevant there.
>>>>>
>>>>>     Mirroring:
>>>>>     103. The benefit of using separate partitions is that it potentially
>>>> makes
>>>>>     it easy to preserve offsets during mirroring. This will make it
>>>> easier for
>>>>>     consumer to switch clusters. Currently, the consumers can switch
>>>> clusters
>>>>>     by using the timestampToOffset() api, but it has to deal with
>>>> duplicates.
>>>>>     Good point on the issue with log compaction and I am not sure how to
>>>> address
>>>>>     this. However, even if we mirror into the existing partitions, the
>>>> ordering
>>>>>     for messages generated from different clusters seems
>>>> non-deterministic
>>>>>     anyway. So, it seems that the consumers already have to deal with
>>>> that? If
>>>>>     a topic is compacted, does that mean which messages are preserved is
>>>> also
>>>>>     non-deterministic across clusters?
>>>>>
>>>>>     104. Good point on partition key.
>>>>>
>>>>>     End-to-end encryption:
>>>>>     105. So, it seems end-to-end encryption is useful. Are headers
>>>> useful there?
>>>>>
>>>>>     Auditing:
>>>>>     106. It seems other than the UUID, all other metadata are per
>>>> producer?
>>>>>
>>>>>     EOS:
>>>>>     107. How are those UUIDs generated? I am not sure if they can be
>>>> generated
>>>>>     in the producer library. An application may send messages through a
>>>> load
>>>>>     balancer and on retry, the same message could be routed to a
>>>> different
>>>>>     producer instance. So, it seems that the application has to generate
>>>> the
>>>>>     UUIDs. In that case, shouldn't the application just put the UUID in
>>>> the
>>>>>     payload?
>>>>>
>>>>>     Thanks,
>>>>>
>>>>>     Jun
>>>>>
>>>>>
>>>>>     On Fri, Dec 2, 2016 at 4:57 PM, Michael Pearce <
>>>> Michael.Pearce@ig.com>
>>>>>     wrote:
>>>>>
>>>>>     > Hi Jun.
>>>>>     >
>>>>>     > Per Transaction Tracing / Data Lineage.
>>>>>     >
>>>>>     > As Stated in the KIP this has the first use case of how many APM
>>>> tools now
>>>>>     > work.
>>>>>     > I would find it impossible for any one to argue this is not
>>>> important or a
>>>>>     > niche market as it has its own gartner report for this space. Such
>>>>>     > companies as AppDynamics, New Relic, Dynatrace, Hawkular are but a
>>>> few.
>>>>>     >
>>>>>     > Likewise these APM tools can help very rapidly track down issues
>>>> and
>>>>>     > automatically capture metrics, perform actions based on unexpected
>>>> behavior
>>>>>     > to auto recover services.
>>>>>     >
>>>>>     > Before mentioning looking at aggregated stats, in these cases where
>>>>>     > actually on critical flows we cannot afford to have aggregated
>>>> rolled up
>>>>>     > stats only.
>>>>>     >
>>>>>     > With the APM tool we use its actually able to detect a single
>>>> transaction
>>>>>     > failure and capture the thread traces in the JVM where it failed
>>>> and
>>>>>     > everything for us, to the point it sends us alerts where we have
>>>> this
>>>>>     > giving the line number of the code that caused it, the transaction
>>>> trace
>>>>>     > through all the services and endpoints (supported) upto the point
>>>> of
>>>>>     > failure, it can also capture the data in and out (so we can
>>>> replay).
>>>>>     > Because atm Kafka doesn’t support us being able to stitch in these
>>>> tracing
>>>>>     > transaction ids natively, we cannot get these benefits as such is
>>>> limiting
>>>>>     > our ability support apps and monitor them to the same standards we
>>>> come to
>>>>>     > expect when on a kafka flow.
>>>>>     >
>>>>>     > This actually ties in with Data Lineage, as the same tracing can
>>>> be used
>>>>>     > to back stitch this. Essentially many times due to the sums of money
>>>>>     > involved there are disputes, and typically as a financial
>>>> institute the
>>>>>     > easiest and cleanest way to prove when disputes arise is to
>>>> present the
>>>>>     > actual flow and processes involved in a transaction.
>>>>>     >
>>>>>     > Likewise as Hadoop matures its evident this case is important, as
>>>> tools
>>>>>     > such as Atlas (Hortonworks led) and Navigator (cloudera led) are
>>>> evident
>>>>>     > also I believe the importance here is very much NOT just a
>>>> financial issue.
>>>>>     >
>>>>>     > From a MDM point of view any company wanting to care about Data
>>>> Quality
>>>>>     > and Data Governance - Data Lineage is a key piece in this puzzle.
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > RE Mirroring,
>>>>>     >
>>>>>     > As per the KIP in-fact this is exactly what we do re cluster id,
>>>> to mirror
>>>>>     > a network of clusters between AZ’s / Regions. We know a
>>>> transaction for a
>>>>>     > key will be done within a  AZ/Region, as such we know the write to
>>>> kafka
>>>>>     > would be ordered per key. But we need eventual view of that across
>>>> in our
>>>>>     > other regions/az’s. When we have complete AZ or Region failure we
>>>> know
>>>>>     > there will be a brief interruption whilst those transactions are
>>>> moved to
>>>>>     > another region but we expect after it to continue.
>>>>>     >
>>>>>     > As mentioned having separate partitions to do this starts to get
>>>>>     > ugly/complicated for us:
>>>>>     > how would I do compaction where a key is in two partitions?
>>>>>     > How do we balance consumers so where multiple partitions with the
>>>> same key
>>>>>     > goto the same consumer
>>>>>     > What do you do if cluster 1 has 5 partitions but cluster 20 has 10
>>>> because
>>>>>     > its larger kit in our more core DC’s, as such key to partition
>>>> mappings for
>>>>>     > consumers get even more complicated.
>>>>>     > What do you do if we add or remove a complete region
>>>>>     >
>>>>>     > Where as simple mirror will work we just need to ensure we don’t
>>>> have a
>>>>>     > cycle which we can do with clusterId.
>>>>>     >
>>>>>     > We even have started to look at shortest path mirror routing based
>>>> on
>>>>>     > clusterId, if we also had the region and az info on the originating
>>>>>     > message, this we have not implemented but some ideas come from
>>>> network
>>>>>     > routing, and also the dispatcher router in apache qpid.
>>>>>     >
>>>>>     > Also we need to have data perimeters e.g. certain data cannot leave
>>>>>     > certain countries borders. We want this all automated so that at
>>>> the
>>>>>     > platform level without having to touch or look at the business
>>>> data inside
>>>>>     > we can have headers we can put tags into so that we can ensure
>>>> this doesn’t
>>>>>     > occur when we mirror. (actually links in to data lineage / tracing
>>>> as again
>>>>>     > we need to tag messages at a platform level) Examples are we are
>>>> not
>>>>>     > allowed Private customer details to leave Switzerland, yet we need
>>>> those
>>>>>     > systems integrated.
>>>>>     >
>>>>>     > Lastly around mirroring we have a partionKey field, as the key
>>>> used for
>>>>>     > partitioning logic != compaction key all the time but we want to
>>>> preserve it
>>>>>     > for when we mirror so that if source cluster partition count !=
>>>> destination
>>>>>     > cluster partition count we can honour the same partitioning logic.
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > RE End 2 End encryption
>>>>>     >
>>>>>     > As I believe mentioned just before, the solution you mention just
>>>> doesn’t
>>>>>     > cut the mustard these days with many regulators. An operations
>>>> person with
>>>>>     > access to the box should not be able to have access to the data.
>>>> Many now
>>>>>     > actually impose quite literally the implementation expected being
>>>> end2end
>>>>>     > encryption for certain data (Singapore for us is one that I am
>>>> most aware
>>>>>     > of). In fact we’re even now needing encrypt the data and store the
>>>> keys in
>>>>>     > HSM modules.
>>>>>     >
>>>>>     > Likewise the performance penalty on encrypting decrypting as you
>>>> produce
>>>>>     > over wire, then again encrypt decrypt as the data is stored on the
>>>> brokers
>>>>>     > disks and back again, then again encrypted and decrypted back over
>>>> the wire
>>>>>     > each time for each consumer all adds up, ignoring this doubling
>>>> with mirror
>>>>>     > makers etc. simply encrypting the value once on write by the
>>>> client and
>>>>>     > again decrypting on consume by the consumer is far more
>>>> performant, but
>>>>>     > then the routing and platform metadata needs to be separate (thus
>>>> headers)
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > RE Auditing:
>>>>>     >
>>>>>     > Our Auditing needs are:
>>>>>     > Producer Id,
>>>>>     > Origin Cluster Id that the message was first produced into
>>>>>     > Origin AZ – agreed we can derive this if we have cluster id, but
>>>> it makes
>>>>>     > resolving this for audit reporting a lot easier.
>>>>>     > Origin Region – agreed we can derive this if we have cluster id,
>>>> but it
>>>>>     > makes resolving this for audit reporting a lot easier.
>>>>>     > Unique Message Identification (this is not the same as transaction
>>>>>     > tracing) – note offset and partition are not the same, as when we
>>>> mirror or
>>>>>     > have, for whatever system failure, a duplicate send,
>>>>>     > Custom Client wrapper version (where organizations have to wrap
>>>> the kafka
>>>>>     > client for added features) so we know what version of the wrapper
>>>> is used
>>>>>     > Producer IP address (in case of clients being in our vm/open stack
>>>> infra
>>>>>     > where they can move around, producer id will stay the same but
>>>> this would
>>>>>     > change)
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > RE Once and only once delivery case
>>>>>     >
>>>>>     > Using the same Message UUID for auditing we can achieve this quite
>>>> simply.
>>>>>     >
>>>>>     > As per how some other brokers do this (cough qpid, artemis)
>>>> message uuid
>>>>>     > are used to dedupe where message is sent and produced but the
>>>> client didn’t
>>>>>     > receive the ack, and therefore replays the send; by having a
>>>> unique message
>>>>>     > id per message, this can be filtered out, on consumers where
>>>> message
>>>>>     > delivery may occur twice for whatever reason, a message uuid can
>>>> be used
>>>>>     > to remove duplicates being delivered; likewise we can do this in
>>>> the
>>>>>     > mirrormakers so if we detect a dupe message we can avoid
>>>> replicating it.
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > Cheers
>>>>>     > Mike
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > On 02/12/2016, 22:09, "Jun Rao" <ju...@confluent.io> wrote:
>>>>>     >
>>>>>     >     Since this KIP affects message format, wire protocol, apis, I
>>>> think
>>>>>     > it's
>>>>>     >     worth spending a bit more time to nail down the concrete use
>>>> cases. It
>>>>>     >     would be bad if we add this feature, but when we start
>>>> implementing it
>>>>>     > for say
>>>>>     >     mirroring, we then realize that header is not the best
>>>> approach.
>>>>>     > Initially,
>>>>>     >     I thought I was convinced of the use cases of headers and was
>>>> trying to
>>>>>     >     write down a few use cases to convince others. That's when I
>>>> became
>>>>>     > less
>>>>>     >     certain. For me to be convinced, I just want to see two strong
>>>> use
>>>>>     > cases
>>>>>     >     (instead of 10 maybe use cases) in the third-party space. The
>>>> reason is
>>>>>     >     that when we discussed the use cases within a company, often
>>>> it ends
>>>>>     > with
>>>>>     >     "we can't force everyone to use this standard since we may
>>>> have to
>>>>>     >     integrate with third-party tools".
>>>>>     >
>>>>>     >     At present, I am not sure why headers are useful for things
>>>> like
>>>>>     > schemaId
>>>>>     >     or encryption. In order to do anything useful to the value,
>>>> one needs
>>>>>     > to
>>>>>     >     know the schemaId or how data is encrypted, but header is
>>>> optional.
>>>>>     > But, I
>>>>>     >     can be convinced if someone (Radai, Sean, Todd?) provides more
>>>> details
>>>>>     > on
>>>>>     >     the argument.
>>>>>     >
>>>>>     >     I am not very sure header is the best approach for mirroring
>>>> either. If
>>>>>     >     someone has thought about this more, I'd be happy to hear.
>>>>>     >
>>>>>     >     I can see the data lineage use case. I am just not sure how
>>>> widely
>>>>>     >     applicable this is. If someone familiar with this space can
>>>> justify
>>>>>     > this is
>>>>>     >     a significant use case, say in the finance industry, this
>>>> would be a
>>>>>     > strong
>>>>>     >     use case.
>>>>>     >
>>>>>     >     I can see the auditing use case. I am just not sure if a native
>>>>>     > producer id
>>>>>     >     solves that problem. If there is additional metadata that's
>>>> worth
>>>>>     >     collecting but not covered by the producer id, that would make
>>>> this a
>>>>>     >     strong use case.
>>>>>     >
>>>>>     >     Thanks,
>>>>>     >
>>>>>     >     Jun
>>>>>     >
>>>>>     >
>>>>>     >     On Fri, Dec 2, 2016 at 1:41 PM, radai <
>>>> radai.rosenblatt@gmail.com>
>>>>>     > wrote:
>>>>>     >
>>>>>     >     > this KIP is about enabling headers, nothing more nothing
>>>> less - so
>>>>>     > no,
>>>>>     >     > broker-side use of headers is not in the KIP scope.
>>>>>     >     >
>>>>>     >     > obviously though, once you have headers potential use cases
>>>> could
>>>>>     > include
>>>>>     >     > broker-side header-aware interceptors (which would be the
>>>> topic of
>>>>>     > other
>>>>>     >     > future KIPs).
>>>>>     >     >
>>>>>     >     > a trivially clear use case (to me) would be using such
>>>> broker-side
>>>>>     >     > interceptors to enforce compliance with organizational
>>>> policies - it
>>>>>     > would
>>>>>     >     > make our SREs lives much easier if instead of retroactively
>>>>>     > discovering
>>>>>     >     > "rogue" topics/users those messages would have been rejected
>>>>>     > up-front.
>>>>>     >     >
>>>>>     >     > the kafka broker code is lacking any such extensibility
>>>> support
>>>>>     > (beyond
>>>>>     >     > maybe authorizer) which is why these use cases were left out
>>>> of the
>>>>>     > "case
>>>>>     >     > for headers" doc - broker extensibility is a separate
>>>> discussion.
>>>>>     >     >
>>>>>     >     > On Fri, Dec 2, 2016 at 12:59 PM, Gwen Shapira <
>>>> gwen@confluent.io>
>>>>>     > wrote:
>>>>>     >     >
>>>>>     >     > > Woah, I wasn't aware this is something we'll do. It wasn't
>>>> in the
>>>>>     > KIP,
>>>>>     >     > > right?
>>>>>     >     > >
>>>>>     >     > > I guess we could do it the same way ACLs currently work.
>>>>>     >     > > I had in mind something that will allow admins to apply
>>>> rules to
>>>>>     > the
>>>>>     >     > > new create/delete/config topic APIs. So Todd can decide to
>>>> reject
>>>>>     >     > > "create topic" requests that ask for more than 40
>>>> partitions, or
>>>>>     >     > > require exactly 3 replicas, or no more than 50GB partition
>>>> size,
>>>>>     > etc.
>>>>>     >     > >
>>>>>     >     > > ACLs were added a bit ad-hoc, if we are planning to apply
>>>> more
>>>>>     > rules
>>>>>     >     > > to requests (and I think we should), we may want a bit
>>>> more generic
>>>>>     >     > > design around that.
>>>>>     >     > >
>>>>>     >     > > On Fri, Dec 2, 2016 at 7:16 AM, radai <
>>>> radai.rosenblatt@gmail.com>
>>>>>     >     > wrote:
>>>>>     >     > > > "wouldn't you be in the business of making sure everyone
>>>> uses
>>>>>     > them
>>>>>     >     > > > properly?"
>>>>>     >     > > >
>>>>>     >     > > > thats where a broker-side plugin would come in handy - any
>>>> incoming
>>>>>     >     > message
>>>>>     >     > > > that does not conform to org policy (read - does not
>>>> have the
>>>>>     > proper
>>>>>     >     > > > headers) gets thrown out (with an error returned to user)
>>>>>     >     > > >
>>>>>     >     > > > On Thu, Dec 1, 2016 at 8:44 PM, Todd Palino <
>>>> tpalino@gmail.com>
>>>>>     > wrote:
>>>>>     >     > > >
>>>>>     >     > > >> Come on, I’ve done at least 2 talks on this one :)
>>>>>     >     > > >>
>>>>>     >     > > >> Producing counts to a topic is part of it, but that’s
>>>> only
>>>>>     > part. So
>>>>>     >     > you
>>>>>     >     > > >> count you have 100 messages in topic A. When you mirror
>>>> topic A
>>>>>     > to
>>>>>     >     > > another
>>>>>     >     > > >> cluster, you have 99 messages. Where was your problem?
>>>> Or
>>>>>     > worse, you
>>>>>     >     > > have
>>>>>     >     > > >> 100 messages, but one producer duplicated messages and
>>>> another
>>>>>     > one
>>>>>     >     > lost
>>>>>     >     > > >> messages. You need details about where the message came
>>>> from in
>>>>>     > order
>>>>>     >     > to
>>>>>     >     > > >> pinpoint problems when they happen. Source producer
>>>> info, where
>>>>>     > it was
>>>>>     >     > > >> produced into your infrastructure, and when it was
>>>> produced.
>>>>>     > This
>>>>>     >     > > requires
>>>>>     >     > > >> you to add the information to the message.
>>>>>     >     > > >>
>>>>>     >     > > >> And yes, you still need to maintain your clients. So
>>>> maybe my
>>>>>     > original
>>>>>     >     > > >> example was not the best. My thoughts on not wanting to
>>>> be
>>>>>     > responsible
>>>>>     >     > > for
>>>>>     >     > > >> message formats stand, because that’s very much
>>>> separate from
>>>>>     > the
>>>>>     >     > > client.
>>>>>     >     > > >> As you know, we have our own internal client library
>>>> that can
>>>>>     > insert
>>>>>     >     > the
>>>>>     >     > > >> right headers, and right now inserts the right audit
>>>>>     > information into
>>>>>     >     > > the
>>>>>     >     > > >> message fields. If they exist, and assuming the message
>>>> is Avro
>>>>>     >     > encoded.
>>>>>     >     > > >> What if someone wants to use JSON instead for a good
>>>> reason?
>>>>>     > What if
>>>>>     >     > > user X
>>>>>     >     > > >> wants to encrypt messages, but user Y does not?
>>>> Maintaining the
>>>>>     > client
>>>>>     >     > > >> library is still much easier than maintaining the
>>>> message
>>>>>     > formats.
>>>>>     >     > > >>
>>>>>     >     > > >> -Todd
>>>>>     >     > > >>
>>>>>     >     > > >>
>>>>>     >     > > >> On Thu, Dec 1, 2016 at 6:21 PM, Gwen Shapira <
>>>> gwen@confluent.io
>>>>>     > >
>>>>>     >     > wrote:
>>>>>     >     > > >>
>>>>>     >     > > >> > Based on your last sentence, consider me convinced :)
>>>>>     >     > > >> >
>>>>>     >     > > >> > I get why headers are critical for Mirroring (you
>>>> need tags to
>>>>>     >     > prevent
>>>>>     >     > > >> > loops and sometimes to route messages to the correct
>>>>>     > destination).
>>>>>     >     > > >> > But why do you need headers to audit? We are auditing
>>>> by
>>>>>     > producing
>>>>>     >     > > >> > counts to a side topic (and I was under the
>>>> impression you do
>>>>>     > the
>>>>>     >     > > >> > same), so we never need to modify the message.
>>>>>     >     > > >> >
>>>>>     >     > > >> > Another thing - after we added headers, wouldn't you
>>>> be in the
>>>>>     >     > > >> > business of making sure everyone uses them properly?
>>>> Making
>>>>>     > sure
>>>>>     >     > > >> > everyone includes the right headers you need, not
>>>> using the
>>>>>     > header
>>>>>     >     > > >> > names you intend to use, etc. I don't think the
>>>> "policing"
>>>>>     > business
>>>>>     >     > > >> > will ever go away.
>>>>>     >     > > >> >
>>>>>     >     > > >> > On Thu, Dec 1, 2016 at 5:25 PM, Todd Palino <
>>>>>     > tpalino@gmail.com>
>>>>>     >     > > wrote:
>>>>>     >     > > >> > > Got it. As an ops guy, I'm not very happy with the
>>>>>     > workaround.
>>>>>     >     > Avro
>>>>>     >     > > >> means
>>>>>     >     > > >> > > that I have to be concerned with the format of the
>>>> messages
>>>>>     > in
>>>>>     >     > > order to
>>>>>     >     > > >> > run
>>>>>     >     > > >> > > the infrastructure (audit, mirroring, etc.). That
>>>> means
>>>>>     > that I
>>>>>     >     > have
>>>>>     >     > > to
>>>>>     >     > > >> > > handle the schemas, and I have to enforce rules
>>>> about good
>>>>>     >     > formats.
>>>>>     >     > > >> This
>>>>>     >     > > >> > is
>>>>>     >     > > >> > > not something I want to be in the business of,
>>>> because I
>>>>>     > should be
>>>>>     >     > > able
>>>>>     >     > > >> > to
>>>>>     >     > > >> > > run a service infrastructure without needing to be
>>>> in the
>>>>>     > weeds of
>>>>>     >     > > >> > dealing
>>>>>     >     > > >> > > with customer data formats.
>>>>>     >     > > >> > >
>>>>>     >     > > >> > > Trust me, a sizable portion of my support time is
>>>> spent
>>>>>     > dealing
>>>>>     >     > with
>>>>>     >     > > >> > schema
>>>>>     >     > > >> > > issues. I really would like to get away from that.
>>>> Maybe
>>>>>     > I'd have
>>>>>     >     > > more
>>>>>     >     > > >> > time
>>>>>     >     > > >> > > for other hobbies. Like writing. ;)
>>>>>     >     > > >> > >
>>>>>     >     > > >> > > -Todd
>>>>>     >     > > >> > >
>>>>>     >     > > >> > > On Thu, Dec 1, 2016 at 4:04 PM Gwen Shapira <
>>>>>     > gwen@confluent.io>
>>>>>     >     > > wrote:
>>>>>     >     > > >> > >
>>>>>     >     > > >> > >> I'm pretty satisfied with the current workarounds
>>>> (Avro
>>>>>     > container
>>>>>     >     > > >> > >> format), so I'm not too excited about the extra
>>>> work
>>>>>     > required to
>>>>>     >     > do
>>>>>     >     > > >> > >> headers in Kafka. I absolutely don't mind it if
>>>> you do
>>>>>     > it...
>>>>>     >     > > >> > >> I think the Apache convention for "good idea, but
>>>> not
>>>>>     > willing to
>>>>>     >     > > put
>>>>>     >     > > >> > >> any work toward it" is +0.5? anyway, that's what I
>>>> was
>>>>>     > trying to
>>>>>     >     > > >> > >> convey :)
>>>>>     >     > > >> > >>
>>>>>     >     > > >> > >> On Thu, Dec 1, 2016 at 3:05 PM, Todd Palino <
>>>>>     > tpalino@gmail.com>
>>>>>     >     > > >> wrote:
>>>>>     >     > > >> > >> > Well I guess my question for you, then, is what
>>>> is
>>>>>     > holding you
>>>>>     >     > > back
>>>>>     >     > > >> > from
>>>>>     >     > > >> > >> > full support for headers? What’s the bit that
>>>> you’re
>>>>>     > missing
>>>>>     >     > that
>>>>>     >     > > >> has
>>>>>     >     > > >> > you
>>>>>     >     > > >> > >> > under a full +1?
>>>>>     >     > > >> > >> >
>>>>>     >     > > >> > >> > -Todd
>>>>>     >     > > >> > >> >
>>>>>     >     > > >> > >> >
>>>>>     >     > > >> > >> > On Thu, Dec 1, 2016 at 1:59 PM, Gwen Shapira <
>>>>>     >     > gwen@confluent.io>
>>>>>     >     > > >> > wrote:
>>>>>     >     > > >> > >> >
>>>>>     >     > > >> > >> >> I know why people who support headers support
>>>> them, and
>>>>>     > I've
>>>>>     >     > > seen
>>>>>     >     > > >> > what
>>>>>     >     > > >> > >> >> the discussion is like.
>>>>>     >     > > >> > >> >>
>>>>>     >     > > >> > >> >> This is why I'm asking people who are against
>>>> headers
>>>>>     >     > > (especially
>>>>>     >     > > >> > >> >> committers) what will make them change their
>>>> mind - so
>>>>>     > we can
>>>>>     >     > > get
>>>>>     >     > > >> > this
>>>>>     >     > > >> > >> >> part over one way or another.
>>>>>     >     > > >> > >> >>
>>>>>     >     > > >> > >> >> If I sound frustrated it is not at Radai, Jun
>>>> or you
>>>>>     > (Todd)...
>>>>>     >     > > I am
>>>>>     >     > > >> > >> >> just looking for something concrete we can do
>>>> to move
>>>>>     > the
>>>>>     >     > > >> discussion
>>>>>     >     > > >> > >> >> along to the yummy design details (which is the
>>>>>     > argument I
>>>>>     >     > > really
>>>>>     >     > > >> am
>>>>>     >     > > >> > >> >> looking forward to).
>>>>>     >     > > >> > >> >>
>>>>>     >     > > >> > >> >> On Thu, Dec 1, 2016 at 1:53 PM, Todd Palino <
>>>>>     >     > tpalino@gmail.com>
>>>>>     >     > > >> > wrote:
>>>>>     >     > > >> > >> >> > So, Gwen, to your question (even though I’m
>>>> not a
>>>>>     >     > > committer)...
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> > I have always been a strong supporter of
>>>> introducing
>>>>>     > the
>>>>>     >     > > concept
>>>>>     >     > > >> > of an
>>>>>     >     > > >> > >> >> > envelope to messages, which headers
>>>> accomplishes. The
>>>>>     >     > message
>>>>>     >     > > key
>>>>>     >     > > >> > is
>>>>>     >     > > >> > >> >> > already an example of a piece of envelope
>>>>>     > information. By
>>>>>     >     > > >> > providing a
>>>>>     >     > > >> > >> >> means
>>>>>     >     > > >> > >> >> > to do this within Kafka itself, and not
>>>> relying on
>>>>>     > use-case
>>>>>     >     > > >> > specific
>>>>>     >     > > >> > >> >> > implementations, you make it much easier for
>>>>>     > components to
>>>>>     >     > > >> > >> interoperate.
>>>>>     >     > > >> > >> >> It
>>>>>     >     > > >> > >> >> > simplifies development of all these things
>>>> (message
>>>>>     > routing,
>>>>>     >     > > >> > auditing,
>>>>>     >     > > >> > >> >> > encryption, etc.) because each one does not
>>>> have to
>>>>>     > reinvent
>>>>>     >     > > the
>>>>>     >     > > >> > >> wheel.
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> > It also makes it much easier from a client
>>>> point of
>>>>>     > view if
>>>>>     >     > > the
>>>>>     >     > > >> > >> headers
>>>>>     >     > > >> > >> >> are
>>>>>     >     > > >> > >> >> > defined as part of the protocol and/or
>>>> message format
>>>>>     > in
>>>>>     >     > > general
>>>>>     >     > > >> > >> because
>>>>>     >     > > >> > >> >> > you can easily produce and consume messages
>>>> without
>>>>>     > having
>>>>>     >     > to
>>>>>     >     > > >> take
>>>>>     >     > > >> > >> into
>>>>>     >     > > >> > >> >> > account specific cases. For example, I want
>>>> to route
>>>>>     >     > messages,
>>>>>     >     > > >> but
>>>>>     >     > > >> > >> >> client A
>>>>>     >     > > >> > >> >> > doesn’t support the way audit implemented
>>>> headers, and
>>>>>     >     > client
>>>>>     >     > > B
>>>>>     >     > > >> > >> doesn’t
>>>>>     >     > > >> > >> >> > support the way encryption or routing
>>>> implemented
>>>>>     > headers,
>>>>>     >     > so
>>>>>     >     > > now
>>>>>     >     > > >> > my
>>>>>     >     > > >> > >> >> > application has to create some really fragile
>>>> (my
>>>>>     >     > autocorrect
>>>>>     >     > > >> just
>>>>>     >     > > >> > >> tried
>>>>>     >     > > >> > >> >> to
>>>>>     >     > > >> > >> >> > make that “tragic”, which is probably
>>>> appropriate
>>>>>     > too) code
>>>>>     >     > to
>>>>>     >     > > >> > strip
>>>>>     >     > > >> > >> >> > everything off, rather than just consuming the
>>>>>     > messages,
>>>>>     >     > > picking
>>>>>     >     > > >> > out
>>>>>     >     > > >> > >> the
>>>>>     >     > > >> > >> >> 1
>>>>>     >     > > >> > >> >> > or 2 headers it’s interested in, and
>>>> performing its
>>>>>     >     > function.
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> > Honestly, this discussion has been going on
>>>> for a
>>>>>     > long time,
>>>>>     >     > > and
>>>>>     >     > > >> > it’s
>>>>>     >     > > >> > >> >> > always “Oh, you came up with 2 use cases, and
>>>> yeah,
>>>>>     > those
>>>>>     >     > use
>>>>>     >     > > >> cases
>>>>>     >     > > >> > >> are
>>>>>     >     > > >> > >> >> > real things that someone would want to do.
>>>> Here’s an
>>>>>     >     > alternate
>>>>>     >     > > >> way
>>>>>     >     > > >> > to
>>>>>     >     > > >> > >> >> > implement them so let’s not do headers.” If
>>>> we have a
>>>>>     > few
>>>>>     >     > use
>>>>>     >     > > >> cases
>>>>>     >     > > >> > >> that
>>>>>     >     > > >> > >> >> we
>>>>>     >     > > >> > >> >> > actually came up with, you can be sure that
>>>> over the
>>>>>     > next
>>>>>     >     > year
>>>>>     >     > > >> > >> there’s a
>>>>>     >     > > >> > >> >> > dozen others that we didn’t think of that
>>>> someone
>>>>>     > would like
>>>>>     >     > > to
>>>>>     >     > > >> > do. I
>>>>>     >     > > >> > >> >> > really think it’s time to stop rehashing this
>>>>>     > discussion and
>>>>>     >     > > >> > instead
>>>>>     >     > > >> > >> >> focus
>>>>>     >     > > >> > >> >> > on a workable standard that we can adopt.
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> > -Todd
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> > On Thu, Dec 1, 2016 at 1:39 PM, Todd Palino <
>>>>>     >     > > tpalino@gmail.com>
>>>>>     >     > > >> > >> wrote:
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> >> C. per message encryption
>>>>>     >     > > >> > >> >> >>> One drawback of this approach is that this
>>>>>     > significantly
>>>>>     >     > > reduces
>>>>>     >     > > >> > the
>>>>>     >     > > >> > >> >> >>> effectiveness of compression, which happens
>>>> on a
>>>>>     > set of
>>>>>     >     > > >> > serialized
>>>>>     >     > > >> > >> >> >>> messages. An alternative is to enable SSL
>>>> for wire
>>>>>     >     > > encryption
>>>>>     >     > > >> and
>>>>>     >     > > >> > >> rely
>>>>>     >     > > >> > >> >> on
>>>>>     >     > > >> > >> >> >>> the storage system (e.g. LUKS) for at rest
>>>>>     > encryption.
>>>>>     >     > > >> > >> >> >>
>>>>>     >     > > >> > >> >> >>
>>>>>     >     > > >> > >> >> >> Jun, this is not sufficient. While this does
>>>> cover
>>>>>     > the case
>>>>>     >     > > of
>>>>>     >     > > >> > >> removing
>>>>>     >     > > >> > >> >> a
>>>>>     >     > > >> > >> >> >> drive from the system, it will not satisfy
>>>> most
>>>>>     > compliance
>>>>>     >     > > >> > >> requirements
>>>>>     >     > > >> > >> >> for
>>>>>     >     > > >> > >> >> >> encryption of data as whoever has access to
>>>> the
>>>>>     > broker
>>>>>     >     > itself
>>>>>     >     > > >> > still
>>>>>     >     > > >> > >> has
>>>>>     >     > > >> > >> >> >> access to the unencrypted data. For
>>>> end-to-end
>>>>>     > encryption
>>>>>     >     > you
>>>>>     >     > > >> > need to
>>>>>     >     > > >> > >> >> >> encrypt at the producer, before it enters the
>>>>>     > system, and
>>>>>     >     > > >> decrypt
>>>>>     >     > > >> > at
>>>>>     >     > > >> > >> the
>>>>>     >     > > >> > >> >> >> consumer, after it exits the system.
>>>>>     >     > > >> > >> >> >>
>>>>>     >     > > >> > >> >> >> -Todd
>>>>>     >     > > >> > >> >> >>
>>>>>     >     > > >> > >> >> >>
>>>>>     >     > > >> > >> >> >> On Thu, Dec 1, 2016 at 1:03 PM, radai <
>>>>>     >     > > >> radai.rosenblatt@gmail.com
>>>>>     >     > > >> > >
>>>>>     >     > > >> > >> >> wrote:
>>>>>     >     > > >> > >> >> >>
>>>>>     >     > > >> > >> >> >>> another big plus of headers in the protocol
>>>> is that
>>>>>     > it
>>>>>     >     > would
>>>>>     >     > > >> > enable
>>>>>     >     > > >> > >> >> rapid
>>>>>     >     > > >> > >> >> >>> iteration on ideas outside of core kafka
>>>> and would
>>>>>     > reduce
>>>>>     >     > > the
>>>>>     >     > > >> > >> number of
>>>>>     >     > > >> > >> >> >>> future wire format changes required.
>>>>>     >     > > >> > >> >> >>>
>>>>>     >     > > >> > >> >> >>> a lot of what is currently a KIP represents
>>>> use
>>>>>     > cases that
>>>>>     >     > > are
>>>>>     >     > > >> > not
>>>>>     >     > > >> > >> 100%
>>>>>     >     > > >> > >> >> >>> relevant to all users, and some of them
>>>> require
>>>>>     > rather
>>>>>     >     > > invasive
>>>>>     >     > > >> > wire
>>>>>     >     > > >> > >> >> >>> protocol changes. i think a good recent example of
>>>> example of
>>>>>     > this is
>>>>>     >     > > >> > kip-98.
>>>>>     >     > > >> > >> >> >>> tx-utilizing traffic is expected to be a
>>>> very small
>>>>>     >     > > fraction of
>>>>>     >     > > >> > >> total
>>>>>     >     > > >> > >> >> >>> traffic and yet the changes are invasive.
>>>>>     >     > > >> > >> >> >>>
>>>>>     >     > > >> > >> >> >>> every such wire format change translates
>>>> into
>>>>>     > painful and
>>>>>     >     > > slow
>>>>>     >     > > >> > >> >> adoption of
>>>>>     >     > > >> > >> >> >>> new versions.
>>>>>     >     > > >> > >> >> >>>
>>>>>     >     > > >> > >> >> >>> i think a lot of functionality currently in
>>>> KIPs
>>>>>     > could be
>>>>>     >     > > "spun
>>>>>     >     > > >> > out"
>>>>>     >     > > >> > >> >> and
>>>>>     >     > > >> > >> >> >>> implemented as opt-in plugins transmitting
>>>> data over
>>>>>     >     > > headers.
>>>>>     >     > > >> > this
>>>>>     >     > > >> > >> >> would
>>>>>     >     > > >> > >> >> >>> keep the core wire format stable(r), core
>>>> codebase
>>>>>     >     > smaller,
>>>>>     >     > > and
>>>>>     >     > > >> > >> avoid
>>>>>     >     > > >> > >> >> the
>>>>>     >     > > >> > >> >> >>> "burden of proof" thats sometimes required
>>>> to prove
>>>>>     > a
>>>>>     >     > > certain
>>>>>     >     > > >> > >> feature
>>>>>     >     > > >> > >> >> is
>>>>>     >     > > >> > >> >> >>> useful enough for a wide-enough audience to
>>>> warrant
>>>>>     > a wire
>>>>>     >     > > >> format
>>>>>     >     > > >> > >> >> change
>>>>>     >     > > >> > >> >> >>> and code complexity additions.
>>>>>     >     > > >> > >> >> >>>
>>>>>     >     > > >> > >> >> >>> (to be clear - kip-98 goes beyond "mere"
>>>> wire format
>>>>>     >     > changes
>>>>>     >     > > >> and
>>>>>     >     > > >> > im
>>>>>     >     > > >> > >> not
>>>>>     >     > > >> > >> >> >>> saying it could have been completely done
>>>> with
>>>>>     > headers,
>>>>>     >     > but
>>>>>     >     > > >> > >> >> exactly-once
>>>>>     >     > > >> > >> >> >>> delivery certainly could)
>>>>>     >     > > >> > >> >> >>>
>>>>>     >     > > >> > >> >> >>> On Thu, Dec 1, 2016 at 11:20 AM, Gwen
>>>> Shapira <
>>>>>     >     > > >> gwen@confluent.io
>>>>>     >     > > >> > >
>>>>>     >     > > >> > >> >> wrote:
>>>>>     >     > > >> > >> >> >>>
>>>>>     >     > > >> > >> >> >>> > On Thu, Dec 1, 2016 at 10:24 AM, radai <
>>>>>     >     > > >> > >> radai.rosenblatt@gmail.com>
>>>>>     >     > > >> > >> >> >>> wrote:
>>>>>     >     > > >> > >> >> >>> > > "For use cases within an organization,
>>>> one could
>>>>>     >     > always
>>>>>     >     > > use
>>>>>     >     > > >> > >> other
>>>>>     >     > > >> > >> >> >>> > > approaches such as company-wise
>>>> containers"
>>>>>     >     > > >> > >> >> >>> > > this is what linkedin has traditionally
>>>> done
>>>>>     > but there
>>>>>     >     > > are
>>>>>     >     > > >> > now
>>>>>     >     > > >> > >> >> cases
>>>>>     >     > > >> > >> >> >>> > (read
>>>>>     >     > > >> > >> >> >>> > > - topics) where this is not acceptable.
>>>> this
>>>>>     > makes
>>>>>     >     > > headers
>>>>>     >     > > >> > >> useful
>>>>>     >     > > >> > >> >> even
>>>>>     >     > > >> > >> >> >>> > > within single orgs for cases where
>>>>>     >     > > one-container-fits-all
>>>>>     >     > > >> > cannot
>>>>>     >     > > >> > >> >> >>> apply.
>>>>>     >     > > >> > >> >> >>> > >
>>>>>     >     > > >> > >> >> >>> > > as for the particular use cases listed,
>>>> i dont
>>>>>     > want
>>>>>     >     > > this to
>>>>>     >     > > >> > >> devolve
>>>>>     >     > > >> > >> >> >>> to a
>>>>>     >     > > >> > >> >> >>> > > discussion of particular use cases - i
>>>> think its
>>>>>     >     > enough
>>>>>     >     > > >> that
>>>>>     >     > > >> > >> some
>>>>>     >     > > >> > >> >> of
>>>>>     >     > > >> > >> >> >>> them
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > I think a main point of contention is
>>>> that: We
>>>>>     >     > identified
>>>>>     >     > > a few
>>>>>     >     > > >> > >> >> >>> > use-cases where headers are useful, do we
>>>> want
>>>>>     > Kafka to
>>>>>     >     > > be a
>>>>>     >     > > >> > >> system
>>>>>     >     > > >> > >> >> >>> > that supports those use-cases?
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > For example, Jun said:
>>>>>     >     > > >> > >> >> >>> > "Not sure how widely useful record-level
>>>> lineage
>>>>>     > is
>>>>>     >     > though
>>>>>     >     > > >> > since
>>>>>     >     > > >> > >> the
>>>>>     >     > > >> > >> >> >>> > overhead could
>>>>>     >     > > >> > >> >> >>> > be significant."
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > We know NiFi supports record level
>>>> lineage. I
>>>>>     > don't
>>>>>     >     > think
>>>>>     >     > > it
>>>>>     >     > > >> > was
>>>>>     >     > > >> > >> >> >>> > developed for lols, I think it is safe to
>>>> assume
>>>>>     > that
>>>>>     >     > the
>>>>>     >     > > NSA
>>>>>     >     > > >> > >> needed
>>>>>     >     > > >> > >> >> >>> > that functionality. We also know that
>>>> certain
>>>>>     > financial
>>>>>     >     > > >> > institutes
>>>>>     >     > > >> > >> >> >>> > need to track tampering with records at a
>>>> record
>>>>>     > level
>>>>>     >     > and
>>>>>     >     > > >> > there
>>>>>     >     > > >> > >> are
>>>>>     >     > > >> > >> >> >>> > federal regulations that absolutely
>>>> require
>>>>>     > this.  They
>>>>>     >     > > also
>>>>>     >     > > >> > need
>>>>>     >     > > >> > >> to
>>>>>     >     > > >> > >> >> >>> > prove that routing apps that "touch" the
>>>>>     > messages and
>>>>>     >     > > >> either
>>>>>     >     > > >> > >> read
>>>>>     >     > > >> > >> >> >>> > or update headers couldn't have possibly
>>>>>     > modified the
>>>>>     >     > > >> payload
>>>>>     >     > > >> > >> >> itself.
>>>>>     >     > > >> > >> >> >>> > They use record level encryption to do
>>>> that -
>>>>>     > apps can
>>>>>     >     > > read
>>>>>     >     > > >> and
>>>>>     >     > > >> > >> >> >>> > (sometimes) modify headers but can't
>>>> touch the
>>>>>     > payload.
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > We can totally say "those are corner
>>>> cases and
>>>>>     > not worth
>>>>>     >     > > >> adding
>>>>>     >     > > >> > >> >> >>> > headers to Kafka for", they should use a
>>>> different
>>>>>     >     > pubsub
>>>>>     >     > > >> > message
>>>>>     >     > > >> > >> for
>>>>>     >     > > >> > >> >> >>> > that (NiFi or one of the other 1000 that
>>>> cater
>>>>>     >     > > specifically
>>>>>     >     > > >> to
>>>>>     >     > > >> > the
>>>>>     >     > > >> > >> >> >>> > financial industry).
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > But this gets us into a catch 22:
>>>>>     >     > > >> > >> >> >>> > If we discuss a specific use-case,
>>>> someone can
>>>>>     > always
>>>>>     >     > say
>>>>>     >     > > it
>>>>>     >     > > >> > isn't
>>>>>     >     > > >> > >> >> >>> > interesting enough for Kafka. If we
>>>> discuss more
>>>>>     > general
>>>>>     >     > > >> > trends,
>>>>>     >     > > >> > >> >> >>> > others can say "well, we are not sure any
>>>> of them
>>>>>     > really
>>>>>     >     > > >> needs
>>>>>     >     > > >> > >> >> headers
>>>>>     >     > > >> > >> >> >>> > specifically. This is just hand waving
>>>> and not
>>>>>     >     > > interesting.".
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > I think discussing use-cases in specifics
>>>> is super
>>>>>     >     > > important
>>>>>     >     > > >> to
>>>>>     >     > > >> > >> >> decide
>>>>>     >     > > >> > >> >> >>> > implementation details for headers (my
>>>> use-cases
>>>>>     > lean
>>>>>     >     > > toward
>>>>>     >     > > >> > >> >> numerical
>>>>>     >     > > >> > >> >> >>> > keys with namespaces and object values,
>>>> others
>>>>>     > differ),
>>>>>     >     > > but I
>>>>>     >     > > >> > >> think
>>>>>     >     > > >> > >> >> we
>>>>>     >     > > >> > >> >> >>> > need to answer the general "Are we going
>>>> to have
>>>>>     >     > headers"
>>>>>     >     > > >> > question
>>>>>     >     > > >> > >> >> >>> > first.
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > I'd love to hear from the other
>>>> committers in the
>>>>>     >     > > discussion:
>>>>>     >     > > >> > >> >> >>> > What would it take to convince you that
>>>> headers
>>>>>     > in Kafka
>>>>>     >     > > are
>>>>>     >     > > >> a
>>>>>     >     > > >> > >> good
>>>>>     >     > > >> > >> >> >>> > idea in general, so we can move ahead and
>>>> try to
>>>>>     > agree
>>>>>     >     > on
>>>>>     >     > > the
>>>>>     >     > > >> > >> >> details?
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > I feel like we keep moving the goal posts
>>>> and
>>>>>     > this is
>>>>>     >     > > truly
>>>>>     >     > > >> > >> >> exhausting.
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > For the record, I mildly support adding
>>>> headers
>>>>>     > to Kafka
>>>>>     >     > > >> > (+0.5?).
>>>>>     >     > > >> > >> >> >>> > The community can continue to find
>>>> workarounds to
>>>>>     > the
>>>>>     >     > > issue
>>>>>     >     > > >> and
>>>>>     >     > > >> > >> there
>>>>>     >     > > >> > >> >> >>> > are some benefits to keeping the message
>>>> format
>>>>>     > and
>>>>>     >     > > clients
>>>>>     >     > > >> > >> simpler.
>>>>>     >     > > >> > >> >> >>> > But I see the usefulness of headers to
>>>> many
>>>>>     > use-cases
>>>>>     >     > and
>>>>>     >     > > if
>>>>>     >     > > >> we
>>>>>     >     > > >> > >> can
>>>>>     >     > > >> > >> >> >>> > find a good and generally useful way to
>>>> add it to
>>>>>     > Kafka,
>>>>>     >     > > it
>>>>>     >     > > >> > will
>>>>>     >     > > >> > >> make
>>>>>     >     > > >> > >> >> >>> > Kafka easier to use for many - worthy
>>>> goal in my
>>>>>     > eyes.
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > > are interesting/feasible, but:
>>>>>     >     > > >> > >> >> >>> > > A+B. i think there are use cases for
>>>> polyglot
>>>>>     > topics.
>>>>>     >     > > >> > >> especially if
>>>>>     >     > > >> > >> >> >>> kafka
>>>>>     >     > > >> > >> >> >>> > > is being used to "trunk" something else.
>>>>>     >     > > >> > >> >> >>> > > D. multiple topics would make it harder
>>>> to write
>>>>>     >     > > portable
>>>>>     >     > > >> > >> consumer
>>>>>     >     > > >> > >> >> >>> code.
>>>>>     >     > > >> > >> >> >>> > > partition remapping would mess with
>>>> locality of
>>>>>     >     > > consumption
>>>>>     >     > > >> > >> >> >>> guarantees.
>>>>>     >     > > >> > >> >> >>> > > E+F. a use case I see for
>>>> lineage/metadata is
>>>>>     >     > > >> > >> billing/chargeback.
>>>>>     >     > > >> > >> >> for
>>>>>     >     > > >> > >> >> >>> > that
>>>>>     >     > > >> > >> >> >>> > > use case it is not enough to simply
>>>> record the
>>>>>     > point
>>>>>     >     > of
>>>>>     >     > > >> > origin,
>>>>>     >     > > >> > >> but
>>>>>     >     > > >> > >> >> >>> every
>>>>>     >     > > >> > >> >> >>> > > replication stop (think mirror maker)
>>>> must also
>>>>>     > add a
>>>>>     >     > > >> record
>>>>>     >     > > >> > to
>>>>>     >     > > >> > >> >> form a
>>>>>     >     > > >> > >> >> >>> > > "transit log".
>>>>>     >     > > >> > >> >> >>> > >
>>>>>     >     > > >> > >> >> >>> > > as for stream processing on top of
>>>> kafka - i
>>>>>     > know
>>>>>     >     > samza
>>>>>     >     > > >> has a
>>>>>     >     > > >> > >> >> metadata
>>>>>     >     > > >> > >> >> >>> > map
>>>>>     >     > > >> > >> >> >>> > > which they carry around in addition to
>>>> user
>>>>>     > values.
>>>>>     >     > > headers
>>>>>     >     > > >> > are
>>>>>     >     > > >> > >> the
>>>>>     >     > > >> > >> >> >>> > perfect
>>>>>     >     > > >> > >> >> >>> > > fit for these things.
>>>>>     >     > > >> > >> >> >>> > >
>>>>>     >     > > >> > >> >> >>> > >
>>>>>     >     > > >> > >> >> >>> > >
>>>>>     >     > > >> > >> >> >>> > > On Wed, Nov 30, 2016 at 6:50 PM, Jun
>>>> Rao <
>>>>>     >     > > jun@confluent.io
>>>>>     >     > > >> >
>>>>>     >     > > >> > >> wrote:
>>>>>     >     > > >> > >> >> >>> > >
>>>>>     >     > > >> > >> >> >>> > >> Hi, Michael,
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> In order to answer the first two
>>>> questions, it
>>>>>     > would
>>>>>     >     > be
>>>>>     >     > > >> > helpful
>>>>>     >     > > >> > >> >> if we
>>>>>     >     > > >> > >> >> >>> > could
>>>>>     >     > > >> > >> >> >>> > >> identify 1 or 2 strong use cases for
>>>> headers
>>>>>     > in the
>>>>>     >     > > space
>>>>>     >     > > >> > for
>>>>>     >     > > >> > >> >> >>> > third-party
>>>>>     >     > > >> > >> >> >>> > >> vendors. For use cases within an
>>>> organization,
>>>>>     > one
>>>>>     >     > > could
>>>>>     >     > > >> > always
>>>>>     >     > > >> > >> >> use
>>>>>     >     > > >> > >> >> >>> > other
>>>>>     >     > > >> > >> >> >>> > >> approaches such as company-wide
>>>> containers to
>>>>>     > get
>>>>>     >     > > around
>>>>>     >     > > >> w/o
>>>>>     >     > > >> > >> >> >>> headers. I
>>>>>     >     > > >> > >> >> >>> > >> went through the use cases in the KIP
>>>> and in
>>>>>     > Radai's
>>>>>     >     > > wiki
>>>>>     >     > > >> (
>>>>>     >     > > >> > >> >> >>> > >> https://cwiki.apache.org/confluence/display/KAFKA/A+Case+for+Kafka+Headers
>>>>>     >     > > >> > >> >> >>> > >> ).
>>>>>     >     > > >> > >> >> >>> > >> The following are the ones that I
>>>>>     > understand and
>>>>>     >     > > >> could
>>>>>     >     > > >> > be
>>>>>     >     > > >> > >> in
>>>>>     >     > > >> > >> >> the
>>>>>     >     > > >> > >> >> >>> > >> third-party use case category.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> A. content-type
>>>>>     >     > > >> > >> >> >>> > >> It seems that in general, content-type
>>>> should
>>>>>     > be set
>>>>>     >     > at
>>>>>     >     > > >> the
>>>>>     >     > > >> > >> topic
>>>>>     >     > > >> > >> >> >>> level.
>>>>>     >     > > >> > >> >> >>> > >> Not sure if mixing messages with
>>>> different
>>>>>     > content
>>>>>     >     > > types
>>>>>     >     > > >> > >> should be
>>>>>     >     > > >> > >> >> >>> > >> encouraged.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> B. schema id
>>>>>     >     > > >> > >> >> >>> > >> Since the value is mostly useless
>>>> without
>>>>>     > schema id,
>>>>>     >     > it
>>>>>     >     > > >> > seems
>>>>>     >     > > >> > >> that
>>>>>     >     > > >> > >> >> >>> > storing
>>>>>     >     > > >> > >> >> >>> > >> the schema id together with serialized
>>>> bytes
>>>>>     > in the
>>>>>     >     > > value
>>>>>     >     > > >> is
>>>>>     >     > > >> > >> >> better?
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> C. per message encryption
>>>>>     >     > > >> > >> >> >>> > >> One drawback of this approach is that
>>>> this
>>>>>     >     > > significantly
>>>>>     >     > > >> > reduces
>>>>>     >     > > >> > >> >> the
>>>>>     >     > > >> > >> >> >>> > >> effectiveness of compression, which
>>>> happens on
>>>>>     > a set
>>>>>     >     > of
>>>>>     >     > > >> > >> serialized
>>>>>     >     > > >> > >> >> >>> > >> messages. An alternative is to enable
>>>> SSL for
>>>>>     > wire
>>>>>     >     > > >> > encryption
>>>>>     >     > > >> > >> and
>>>>>     >     > > >> > >> >> >>> rely
>>>>>     >     > > >> > >> >> >>> > on
>>>>>     >     > > >> > >> >> >>> > >> the storage system (e.g. LUKS) for at
>>>> rest
>>>>>     >     > encryption.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> D. cluster ID for mirroring across
>>>> Kafka
>>>>>     > clusters
>>>>>     >     > > >> > >> >> >>> > >> This is actually interesting. Today,
>>>> to avoid
>>>>>     >     > > introducing
>>>>>     >     > > >> > >> cycles
>>>>>     >     > > >> > >> >> when
>>>>>     >     > > >> > >> >> >>> > doing
>>>>>     >     > > >> > >> >> >>> > >> mirroring across data centers, one
>>>> would
>>>>>     > either have
>>>>>     >     > to
>>>>>     >     > > >> set
>>>>>     >     > > >> > up
>>>>>     >     > > >> > >> two
>>>>>     >     > > >> > >> >> >>> Kafka
>>>>>     >     > > >> > >> >> >>> > >> clusters (a local and an aggregate)
>>>> per data
>>>>>     > center
>>>>>     >     > or
>>>>>     >     > > >> > rename
>>>>>     >     > > >> > >> >> topics.
>>>>>     >     > > >> > >> >> >>> > >> Neither is ideal. With headers, the
>>>> producer
>>>>>     > could
>>>>>     >     > tag
>>>>>     >     > > >> each
>>>>>     >     > > >> > >> >> message
>>>>>     >     > > >> > >> >> >>> with
>>>>>     >     > > >> > >> >> >>> > >> the producing cluster ID in the header.
>>>>>     > MirrorMaker
>>>>>     >     > > could
>>>>>     >     > > >> > then
>>>>>     >     > > >> > >> >> avoid
>>>>>     >     > > >> > >> >> >>> > >> mirroring messages to a cluster if
>>>> they are
>>>>>     > tagged
>>>>>     >     > with
>>>>>     >     > > >> the
>>>>>     >     > > >> > >> same
>>>>>     >     > > >> > >> >> >>> cluster
>>>>>     >     > > >> > >> >> >>> > >> id.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> However, an alternative approach is to
>>>>>     > introduce something
>>>>>     >     > > like
>>>>>     >     > > >> > >> >> >>> hierarchical
>>>>>     >     > > >> > >> >> >>> > >> topic and store messages from different
>>>>>     > clusters in
>>>>>     >     > > >> > different
>>>>>     >     > > >> > >> >> >>> partitions
>>>>>     >     > > >> > >> >> >>> > >> under the same topic. This approach
>>>> avoids
>>>>>     > filtering
>>>>>     >     > > out
>>>>>     >     > > >> > >> unneeded
>>>>>     >     > > >> > >> >> >>> data
>>>>>     >     > > >> > >> >> >>> > and
>>>>>     >     > > >> > >> >> >>> > >> makes offset preserving easier to
>>>> support. It
>>>>>     > may
>>>>>     >     > make
>>>>>     >     > > >> > >> compaction
>>>>>     >     > > >> > >> >> >>> > trickier
>>>>>     >     > > >> > >> >> >>> > >> though since the same key may show up
>>>> in
>>>>>     > different
>>>>>     >     > > >> > partitions.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> E. record-level lineage
>>>>>     >     > > >> > >> >> >>> > >> For example, a source connector could
>>>> store in
>>>>>     > the
>>>>>     >     > > message
>>>>>     >     > > >> > the
>>>>>     >     > > >> > >> >> >>> metadata
>>>>>     >     > > >> > >> >> >>> > >> (e.g. UUID) of the source record.
>>>> Similarly,
>>>>>     > if a
>>>>>     >     > > stream
>>>>>     >     > > >> job
>>>>>     >     > > >> > >> >> >>> transforms
>>>>>     >     > > >> > >> >> >>> > >> messages from topic A to topic B, the
>>>> library
>>>>>     > could
>>>>>     >     > > >> include
>>>>>     >     > > >> > the
>>>>>     >     > > >> > >> >> >>> source
>>>>>     >     > > >> > >> >> >>> > >> message offset in each of the
>>>> transformed
>>>>>     > messages in
>>>>>     >     > > the
>>>>>     >     > > >> > >> header.
>>>>>     >     > > >> > >> >> Not
>>>>>     >     > > >> > >> >> >>> > sure
>>>>>     >     > > >> > >> >> >>> > >> how widely useful record-level lineage
>>>> is
>>>>>     > though
>>>>>     >     > since
>>>>>     >     > > the
>>>>>     >     > > >> > >> >> overhead
>>>>>     >     > > >> > >> >> >>> > could
>>>>>     >     > > >> > >> >> >>> > >> be significant.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> F. auditing metadata
>>>>>     >     > > >> > >> >> >>> > >> We could put things like
>>>> clientId/host/user in
>>>>>     > the
>>>>>     >     > > header
>>>>>     >     > > >> in
>>>>>     >     > > >> > >> each
>>>>>     >     > > >> > >> >> >>> > message
>>>>>     >     > > >> > >> >> >>> > >> for auditing. These metadata are
>>>> really at the
>>>>>     >     > producer
>>>>>     >     > > >> > level
>>>>>     >     > > >> > >> >> though.
>>>>>     >     > > >> > >> >> >>> > So, a
>>>>>     >     > > >> > >> >> >>> > >> more efficient way is to only include a
>>>>>     > "producerId"
>>>>>     >     > > per
>>>>>     >     > > >> > >> message
>>>>>     >     > > >> > >> >> and
>>>>>     >     > > >> > >> >> >>> > send
>>>>>     >     > > >> > >> >> >>> > >> the producerId -> metadata mapping
>>>>>     > independently.
>>>>>     >     > > KIP-98
>>>>>     >     > > >> is
>>>>>     >     > > >> > >> >> actually
>>>>>     >     > > >> > >> >> >>> > >> proposing including such a producerId
>>>> natively
>>>>>     > in the
>>>>>     >     > > >> > message.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> So, overall, I am not sure that I am fully
>>>>>     > convinced of
>>>>>     >     > > the
>>>>>     >     > > >> > strong
>>>>>     >     > > >> > >> >> >>> > third-party
>>>>>     >     > > >> > >> >> >>> > >> use cases of headers yet. Perhaps we
>>>> could
>>>>>     > discuss a
>>>>>     >     > > bit
>>>>>     >     > > >> > more
>>>>>     >     > > >> > >> to
>>>>>     >     > > >> > >> >> make
>>>>>     >     > > >> > >> >> >>> > one
>>>>>     >     > > >> > >> >> >>> > >> or two really convincing use cases.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> Another orthogonal question is
>>>> whether header
>>>>>     > should
>>>>>     >     > > be
>>>>>     >     > > >> > >> exposed
>>>>>     >     > > >> > >> >> in
>>>>>     >     > > >> > >> >> >>> > stream
>>>>>     >     > > >> > >> >> >>> > >> processing systems such as Kafka Streams,
>>>> Samza,
>>>>>     > and
>>>>>     >     > Spark
>>>>>     >     > > >> > >> streaming.
>>>>>     >     > > >> > >> >> >>> > >> Currently, those systems just deal with
>>>>>     > key/value
>>>>>     >     > > pairs.
>>>>>     >     > > >> > >> Should we
>>>>>     >     > > >> > >> >> >>> > expose a
>>>>>     >     > > >> > >> >> >>> > >> third thing header there too or
>>>> somehow map
>>>>>     > header to
>>>>>     >     > > key
>>>>>     >     > > >> or
>>>>>     >     > > >> > >> >> value?
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> Thanks,
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> Jun
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> On Tue, Nov 29, 2016 at 3:35 AM,
>>>> Michael
>>>>>     > Pearce <
>>>>>     >     > > >> > >> >> >>> Michael.Pearce@ig.com>
>>>>>     >     > > >> > >> >> >>> > >> wrote:
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> > I assume, that after a period of a
>>>> week,
>>>>>     > that there
>>>>>     >     > > are
>>>>>     >     > > >> no
>>>>>     >     > > >> > >> >> concerns
>>>>>     >     > > >> > >> >> >>> now
>>>>>     >     > > >> > >> >> >>> > >> > with points 1, and 2 and now we have
>>>>>     > agreement that
>>>>>     >     > > >> > headers
>>>>>     >     > > >> > >> are
>>>>>     >     > > >> > >> >> >>> useful
>>>>>     >     > > >> > >> >> >>> > >> and
>>>>>     >     > > >> > >> >> >>> > >> > needed in Kafka. As such if put to a
>>>> KIP
>>>>>     > vote, this
>>>>>     >     > > >> > wouldn’t
>>>>>     >     > > >> > >> be
>>>>>     >     > > >> > >> >> a
>>>>>     >     > > >> > >> >> >>> > reason
>>>>>     >     > > >> > >> >> >>> > >> to
>>>>>     >     > > >> > >> >> >>> > >> > reject.
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> > @
>>>>>     >     > > >> > >> >> >>> > >> > Ignacio on point 4).
>>>>>     >     > > >> > >> >> >>> > >> > I think for purpose of getting this
>>>> KIP
>>>>>     > moving past
>>>>>     >     > > >> this,
>>>>>     >     > > >> > we
>>>>>     >     > > >> > >> can
>>>>>     >     > > >> > >> >> >>> state
>>>>>     >     > > >> > >> >> >>> > >> the
>>>>>     >     > > >> > >> >> >>> > >> > key will be a 4-byte space that
>>>> will be
>>>>>     >     > > naturally
>>>>>     >     > > >> > >> >> interpreted
>>>>>     >     > > >> > >> >> >>> as
>>>>>     >     > > >> > >> >> >>> > an
>>>>>     >     > > >> > >> >> >>> > >> > Int32 (if namespacing is later
>>>> wanted you can
>>>>>     >     > easily
>>>>>     >     > > >> split
>>>>>     >     > > >> > >> this
>>>>>     >     > > >> > >> >> >>> into
>>>>>     >     > > >> > >> >> >>> > two
>>>>>     >     > > >> > >> >> >>> > >> > int16 spaces), from the wire protocol
>>>>>     >     > implementation
>>>>>     >     > > >> this
>>>>>     >     > > >> > >> makes
>>>>>     >     > > >> > >> >> no
>>>>>     >     > > >> > >> >> >>> > >> > difference I don’t believe. Is this
>>>>>     > reasonable to
>>>>>     >     > > all?
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> > On 5) as per point 4 therefore happy
>>>> we keep
>>>>>     > with 32
>>>>>     >     > > >> bits.
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> > On 18/11/2016, 20:34, "
>>>>>     > ignacio.solis@gmail.com on
>>>>>     >     > > >> behalf
>>>>>     >     > > >> > of
>>>>>     >     > > >> > >> >> >>> Ignacio
>>>>>     >     > > >> > >> >> >>> > >> > Solis" <ignacio.solis@gmail.com on
>>>> behalf of
>>>>>     >     > > >> > isolis@igso.net
>>>>>     >     > > >> > >> >
>>>>>     >     > > >> > >> >> >>> wrote:
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     Summary:
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     3) Yes - Header value as byte[]
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     4a) Int,Int - No
>>>>>     >     > > >> > >> >> >>> > >> >     4b) Int - Yes
>>>>>     >     > > >> > >> >> >>> > >> >     4c) String - Reluctant maybe
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     5) I believe the header system
>>>> should
>>>>>     > take a
>>>>>     >     > > single
>>>>>     >     > > >> > >> int.  I
>>>>>     >     > > >> > >> >> >>> think
>>>>>     >     > > >> > >> >> >>> > >> > 32bits is
>>>>>     >     > > >> > >> >> >>> > >> >     a good size, if you want to
>>>> interpret
>>>>>     > this as
>>>>>     >     > two
>>>>>     >     > > >> 16bit
>>>>>     >     > > >> > >> >> numbers
>>>>>     >     > > >> > >> >> >>> in
>>>>>     >     > > >> > >> >> >>> > the
>>>>>     >     > > >> > >> >> >>> > >> > layer
>>>>>     >     > > >> > >> >> >>> > >> >     above go right ahead.  If
>>>> somebody wants
>>>>>     > to
>>>>>     >     > argue
>>>>>     >     > > >> for
>>>>>     >     > > >> > 16
>>>>>     >     > > >> > >> >> bits
>>>>>     >     > > >> > >> >> >>> or
>>>>>     >     > > >> > >> >> >>> > 64
>>>>>     >     > > >> > >> >> >>> > >> > bits of
>>>>>     >     > > >> > >> >> >>> > >> >     header key space I would listen.
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     Discussion:
>>>>>     >     > > >> > >> >> >>> > >> >     Dividing the key space into
>>>> sub_key_1 and
>>>>>     >     > > sub_key_2
>>>>>     >     > > >> > >> makes no
>>>>>     >     > > >> > >> >> >>> > sense to
>>>>>     >     > > >> > >> >> >>> > >> > me at
>>>>>     >     > > >> > >> >> >>> > >> >     this layer.  Are we going to
>>>> start
>>>>>     > providing
>>>>>     >     > > APIs to
>>>>>     >     > > >> > get
>>>>>     >     > > >> > >> all
>>>>>     >     > > >> > >> >> >>> the
>>>>>     >     > > >> > >> >> >>> > >> >     sub_key_1s? or all the
>>>> sub_key_2s?  If
>>>>>     > there is
>>>>>     >     > > no
>>>>>     >     > > >> > >> >> >>> distinguishing
>>>>>     >     > > >> > >> >> >>> > >> > functions
>>>>>     >     > > >> > >> >> >>> > >> >     that are applied to each one
>>>> then they
>>>>>     > should
>>>>>     >     > be
>>>>>     >     > > a
>>>>>     >     > > >> > single
>>>>>     >     > > >> > >> >> >>> value.
>>>>>     >     > > >> > >> >> >>> > At
>>>>>     >     > > >> > >> >> >>> > >> > this
>>>>>     >     > > >> > >> >> >>> > >> >     layer all we're doing is
>>>> equality.
>>>>>     >     > > >> > >> >> >>> > >> >     If the above layer wants to
>>>> interpret
>>>>>     > this as
>>>>>     >     > 2,
>>>>>     >     > > 3
>>>>>     >     > > >> or
>>>>>     >     > > >> > >> more
>>>>>     >     > > >> > >> >> >>> values
>>>>>     >     > > >> > >> >> >>> > >> > that's a
>>>>>     >     > > >> > >> >> >>> > >> >     different question.  I
>>>> personally think
>>>>>     > it's
>>>>>     >     > all
>>>>>     >     > > one
>>>>>     >     > > >> > >> >> keyspace
>>>>>     >     > > >> > >> >> >>> > that is
>>>>>     >     > > >> > >> >> >>> > >> >     getting assigned using some
>>>> structure,
>>>>>     > but if
>>>>>     >     > you
>>>>>     >     > > >> > want to
>>>>>     >     > > >> > >> >> >>> > sub-assign
>>>>>     >     > > >> > >> >> >>> > >> > parts
>>>>>     >     > > >> > >> >> >>> > >> >     of it then that's fine.
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     The same discussion applies to
>>>> strings.
>>>>>     > If
>>>>>     >     > > somebody
>>>>>     >     > > >> > >> argued
>>>>>     >     > > >> > >> >> for
>>>>>     >     > > >> > >> >> >>> > >> > strings,
>>>>>     >     > > >> > >> >> >>> > >> >     would we be arguing to divide the
>>>>>     > strings with
>>>>>     >     > > dots
>>>>>     >     > > >> > ('.')
>>>>>     >     > > >> > >> >> as a
>>>>>     >     > > >> > >> >> >>> > >> > requirement?
>>>>>     >     > > >> > >> >> >>> > >> >     Would we want them to give us the
>>>>>     > different
>>>>>     >     > name
>>>>>     >     > > >> > segments
>>>>>     >     > > >> > >> >> >>> > separately?
>>>>>     >     > > >> > >> >> >>> > >> >     Would we be performing any
>>>> actions on
>>>>>     > this key
>>>>>     >     > > other
>>>>>     >     > > >> > than
>>>>>     >     > > >> > >> >> >>> > matching?
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     Nacho
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     On Fri, Nov 18, 2016 at 9:30 AM,
>>>> Michael
>>>>>     >     > Pearce <
>>>>>     >     > > >> > >> >> >>> > >> Michael.Pearce@ig.com
>>>>>     >     > > >> > >> >> >>> > >> > >
>>>>>     >     > > >> > >> >> >>> > >> >     wrote:
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > #jay #jun any concerns on 1
>>>> and 2
>>>>>     > still?
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > @all
>>>>>     >     > > >> > >> >> >>> > >> >     > To get this moving along a bit
>>>> more
>>>>>     > I'd also
>>>>>     >     > > like
>>>>>     >     > > >> to
>>>>>     >     > > >> > >> ask
>>>>>     >     > > >> > >> >> to
>>>>>     >     > > >> > >> >> >>> get
>>>>>     >     > > >> > >> >> >>> > >> > clarity on
>>>>>     >     > > >> > >> >> >>> > >> >     > the below last points:
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > 3) I believe we're all roughly
>>>> happy
>>>>>     > with the
>>>>>     >     > > >> header
>>>>>     >     > > >> > >> value
>>>>>     >     > > >> > >> >> >>> > being a
>>>>>     >     > > >> > >> >> >>> > >> > byte[]?
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > 4) I believe consensus has
>>>> been for an
>>>>>     >     > > namespace
>>>>>     >     > > >> > based
>>>>>     >     > > >> > >> int
>>>>>     >     > > >> > >> >> >>> > approach
>>>>>     >     > > >> > >> >> >>> > >> >     > {int,int} for the key. Any
>>>> objections
>>>>>     > if this
>>>>>     >     > > is
>>>>>     >     > > >> > what
>>>>>     >     > > >> > >> we
>>>>>     >     > > >> > >> >> go
>>>>>     >     > > >> > >> >> >>> > with?
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > 5) as we have, if the assumption in
>>>> (4) is
>>>>>     >     > correct,
>>>>>     >     > > >> > >> {int,int}
>>>>>     >     > > >> > >> >> >>> keys.
>>>>>     >     > > >> > >> >> >>> > >> >     > Should both int's be int16 or
>>>> int32?
>>>>>     >     > > >> > >> >> >>> > >> >     > I'm for them being int16 (2
>>>> bytes) as
>>>>>     > combined
>>>>>     >     > > is
>>>>>     >     > > >> > space
>>>>>     >     > > >> > >> of
>>>>>     >     > > >> > >> >> >>> > 4bytes as
>>>>>     >     > > >> > >> >> >>> > >> > per
>>>>>     >     > > >> > >> >> >>> > >> >     > original and gives plenty of
>>>>>     > combinations for
>>>>>     >     > > the
>>>>>     >     > > >> > >> >> >>> foreseeable,
>>>>>     >     > > >> > >> >> >>> > and
>>>>>     >     > > >> > >> >> >>> > >> > keeps
>>>>>     >     > > >> > >> >> >>> > >> >     > the overhead small.
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > Do we see any benefit in
>>>> another kip
>>>>>     > call to
>>>>>     >     > > >> discuss
>>>>>     >     > > >> > >> >> these at
>>>>>     >     > > >> > >> >> >>> > all?
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > Cheers
>>>>>     >     > > >> > >> >> >>> > >> >     > Mike
>>>>>     >     > > >> > >> >> >>> > >> >     > ______________________________
>>>>>     > __________
>>>>>     >     > > >> > >> >> >>> > >> >     > From: K Burstev <
>>>> k.burstev@yandex.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > Sent: Friday, November 18, 2016
>>>>>     > 7:07:07 AM
>>>>>     >     > > >> > >> >> >>> > >> >     > To: dev@kafka.apache.org
>>>>>     >     > > >> > >> >> >>> > >> >     > Subject: Re: [DISCUSS] KIP-82
>>>> - Add
>>>>>     > Record
>>>>>     >     > > Headers
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > For what it is worth, I also
>>>> agree. As
>>>>>     > a user:
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     >  1) Yes - Headers are
>>>> worthwhile
>>>>>     >     > > >> > >> >> >>> > >> >     >  2) Yes - Headers should be a
>>>> top level
>>>>>     >     > option
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > 14.11.2016, 21:15, "Ignacio
>>>> Solis" <
>>>>>     >     > > >> isolis@igso.net
>>>>>     >     > > >> > >:
>>>>>     >     > > >> > >> >> >>> > >> >     > > 1) Yes - Headers are
>>>> worthwhile
>>>>>     >     > > >> > >> >> >>> > >> >     > > 2) Yes - Headers should be a
>>>> top
>>>>>     > level
>>>>>     >     > option
>>>>>     >     > > >> > >> >> >>> > >> >     > >
>>>>>     >     > > >> > >> >> >>> > >> >     > > On Mon, Nov 14, 2016 at 9:16
>>>> AM,
>>>>>     > Michael
>>>>>     >     > > Pearce
>>>>>     >     > > >> <
>>>>>     >     > > >> > >> >> >>> > >> > Michael.Pearce@ig.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Hi Roger,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  The kip details/examples
>>>> the
>>>>>     > original
>>>>>     >     > > proposal
>>>>>     >     > > >> > for
>>>>>     >     > > >> > >> key
>>>>>     >     > > >> > >> >> >>> > spacing
>>>>>     >     > > >> > >> >> >>> > >> ,
>>>>>     >     > > >> > >> >> >>> > >> > not
>>>>>     >     > > >> > >> >> >>> > >> >     > the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  new mentioned as per
>>>> discussion
>>>>>     > namespace
>>>>>     >     > > >> idea.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  We will need to update the
>>>> kip,
>>>>>     > when we
>>>>>     >     > get
>>>>>     >     > > >> > >> agreement
>>>>>     >     > > >> > >> >> >>> this
>>>>>     >     > > >> > >> >> >>> > is a
>>>>>     >     > > >> > >> >> >>> > >> > better
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  approach (which seems to
>>>> be the
>>>>>     > case if I
>>>>>     >     > > have
>>>>>     >     > > >> > >> >> understood
>>>>>     >     > > >> > >> >> >>> > the
>>>>>     >     > > >> > >> >> >>> > >> > general
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  feeling in the
>>>> conversation)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Re the variable ints, at
>>>> very
>>>>>     > early stage
>>>>>     >     > > we
>>>>>     >     > > >> did
>>>>>     >     > > >> > >> think
>>>>>     >     > > >> > >> >> >>> about
>>>>>     >     > > >> > >> >> >>> > >> > this. I
>>>>>     >     > > >> > >> >> >>> > >> >     > think
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  the added complexity for
>>>> the
>>>>>     > saving isn't
>>>>>     >     > > >> worth
>>>>>     >     > > >> > it.
>>>>>     >     > > >> > >> >> I'd
>>>>>     >     > > >> > >> >> >>> > rather
>>>>>     >     > > >> > >> >> >>> > >> go
>>>>>     >     > > >> > >> >> >>> > >> >     > with, if
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  we want to reduce
>>>> overheads and
>>>>>     > size
>>>>>     >     > int16
>>>>>     >     > > >> > (2bytes)
>>>>>     >     > > >> > >> >> keys
>>>>>     >     > > >> > >> >> >>> as
>>>>>     >     > > >> > >> >> >>> > it
>>>>>     >     > > >> > >> >> >>> > >> > keeps it
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  simple.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  On the note of no headers,
>>>> there
>>>>>     > is as
>>>>>     >     > per
>>>>>     >     > > the
>>>>>     >     > > >> > kip
>>>>>     >     > > >> > >> as
>>>>>     >     > > >> > >> >> we
>>>>>     >     > > >> > >> >> >>> > use an
>>>>>     >     > > >> > >> >> >>> > >> >     > attribute
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  bit to denote if headers
>>>> are
>>>>>     > present or
>>>>>     >     > > not as
>>>>>     >     > > >> > such
>>>>>     >     > > >> > >> >> >>> > provides a
>>>>>     >     > > >> > >> >> >>> > >> > zero
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  overhead currently if
>>>> headers are
>>>>>     > not
>>>>>     >     > used.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  I think as radai mentions
>>>> would be
>>>>>     > good
>>>>>     >     > > first
>>>>>     >     > > >> > if we
>>>>>     >     > > >> > >> >> can
>>>>>     >     > > >> > >> >> >>> get
>>>>>     >     > > >> > >> >> >>> > >> > clarity if
>>>>>     >     > > >> > >> >> >>> > >> >     > do
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  we now have general
>>>> consensus that
>>>>>     > (1)
>>>>>     >     > > headers
>>>>>     >     > > >> > are
>>>>>     >     > > >> > >> >> >>> > worthwhile
>>>>>     >     > > >> > >> >> >>> > >> and
>>>>>     >     > > >> > >> >> >>> > >> >     > useful,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  and (2) we want it as a
>>>> top level
>>>>>     > entity.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Just to state the obvious i
>>>>>     > believe (1)
>>>>>     >     > > >> headers
>>>>>     >     > > >> > are
>>>>>     >     > > >> > >> >> >>> > worthwhile
>>>>>     >     > > >> > >> >> >>> > >> > and (2)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  agree as a top level
>>>> entity.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Cheers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Mike
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>> ______________________________
>>>>>     > __________
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  From: Roger Hoover <
>>>>>     >     > roger.hoover@gmail.com
>>>>>     >     > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Sent: Wednesday, November
>>>> 9, 2016
>>>>>     > 9:10:47
>>>>>     >     > > PM
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  To: dev@kafka.apache.org
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Subject: Re: [DISCUSS]
>>>> KIP-82 - Add
>>>>>     >     > Record
>>>>>     >     > > >> > Headers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Sorry for going a little
>>>> in the
>>>>>     > weeds but
>>>>>     >     > > >> thanks
>>>>>     >     > > >> > >> for
>>>>>     >     > > >> > >> >> the
>>>>>     >     > > >> > >> >> >>> > >> replies
>>>>>     >     > > >> > >> >> >>> > >> >     > regarding
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  varint.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Agreed that a prefix and
>>>> {int,
>>>>>     > int} can
>>>>>     >     > be
>>>>>     >     > > the
>>>>>     >     > > >> > >> same.
>>>>>     >     > > >> > >> >> It
>>>>>     >     > > >> > >> >> >>> > doesn't
>>>>>     >     > > >> > >> >> >>> > >> > look
>>>>>     >     > > >> > >> >> >>> > >> >     > like
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  that's what the KIP is
>>>> saying in the
>>>>>     > "Open"
>>>>>     >     > > >> > section.
>>>>>     >     > > >> > >> The
>>>>>     >     > > >> > >> >> >>> > example
>>>>>     >     > > >> > >> >> >>> > >> > shows
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  2100001
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  for New Relic and 210002
>>>> for App
>>>>>     > Dynamics
>>>>>     >     > > >> > implying
>>>>>     >     > > >> > >> >> that
>>>>>     >     > > >> > >> >> >>> the
>>>>>     >     > > >> > >> >> >>> > New
>>>>>     >     > > >> > >> >> >>> > >> > Relic
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  organization will have
>>>> only a
>>>>>     > single
>>>>>     >     > > header id
>>>>>     >     > > >> > to
>>>>>     >     > > >> > >> work
>>>>>     >     > > >> > >> >> >>> > with. Or
>>>>>     >     > > >> > >> >> >>> > >> > is
>>>>>     >     > > >> > >> >> >>> > >> >     > 2100001
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  a prefix? The main point
>>>> of a
>>>>>     > namespace
>>>>>     >     > or
>>>>>     >     > > >> > prefix
>>>>>     >     > > >> > >> is
>>>>>     >     > > >> > >> >> to
>>>>>     >     > > >> > >> >> >>> > reduce
>>>>>     >     > > >> > >> >> >>> > >> > the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  overhead of config mapping
>>>> or
>>>>>     >     > registration
>>>>>     >     > > >> > >> depending
>>>>>     >     > > >> > >> >> on
>>>>>     >     > > >> > >> >> >>> how
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  namespaces/prefixes are
>>>> managed.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Would love to hear more
>>>> feedback
>>>>>     > on the
>>>>>     >     > > >> > >> higher-level
>>>>>     >     > > >> > >> >> >>> > questions
>>>>>     >     > > >> > >> >> >>> > >> >     > though...
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Cheers,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Roger
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  On Wed, Nov 9, 2016 at
>>>> 11:38 AM,
>>>>>     > radai <
>>>>>     >     > > >> > >> >> >>> > >> > radai.rosenblatt@gmail.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > I think this discussion
>>>> is
>>>>>     > getting a
>>>>>     >     > bit
>>>>>     >     > > >> into
>>>>>     >     > > >> > the
>>>>>     >     > > >> > >> >> >>> weeds on
>>>>>     >     > > >> > >> >> >>> > >> > technical
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > implementation details.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > I'd like to step back a
>>>> minute
>>>>>     > and try
>>>>>     >     > > and
>>>>>     >     > > >> > >> establish
>>>>>     >     > > >> > >> >> >>> > where we
>>>>>     >     > > >> > >> >> >>> > >> > are in
>>>>>     >     > > >> > >> >> >>> > >> >     > the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > larger picture:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > (re-wording nacho's last
>>>>>     > paragraph)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > 1. are we all in
>>>> agreement that
>>>>>     > headers
>>>>>     >     > > are
>>>>>     >     > > >> a
>>>>>     >     > > >> > >> >> >>> worthwhile
>>>>>     >     > > >> > >> >> >>> > and
>>>>>     >     > > >> > >> >> >>> > >> > useful
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > addition to have? this
>>>> was
>>>>>     > contested
>>>>>     >     > > early
>>>>>     >     > > >> on
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > 2. are we all in
>>>> agreement on
>>>>>     > headers
>>>>>     >     > as
>>>>>     >     > > top
>>>>>     >     > > >> > >> level
>>>>>     >     > > >> > >> >> >>> entity
>>>>>     >     > > >> > >> >> >>> > vs
>>>>>     >     > > >> > >> >> >>> > >> > headers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > squirreled-away in V?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > if there are still
>>>> concerns
>>>>>     > around
>>>>>     >     > these
>>>>>     >     > > #2
>>>>>     >     > > >> > >> points
>>>>>     >     > > >> > >> >> >>> (#jay?
>>>>>     >     > > >> > >> >> >>> > >> > #jun?)?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > (and now back to our
>>>> normal
>>>>>     > programming
>>>>>     >     > > ...)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > varints are nice. having
>>>> said
>>>>>     > that, its
>>>>>     >     > > >> adding
>>>>>     >     > > >> > >> >> >>> complexity
>>>>>     >     > > >> > >> >> >>> > >> (see
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>> https://github.com/addthis/
>>>>>     >     > > >> > >> >> stream-lib/blob/master/src/
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>> main/java/com/clearspring/
>>>>>     >     > > >> > >> >> analytics/util/Varint.java
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > as 1st google result)
>>>> and would
>>>>>     > require
>>>>>     >     > > >> anyone
>>>>>     >     > > >> > >> >> writing
>>>>>     >     > > >> > >> >> >>> > other
>>>>>     >     > > >> > >> >> >>> > >> > clients
>>>>>     >     > > >> > >> >> >>> > >> >     > (C?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > Python? Go? Bash? ;-) )
>>>> to
>>>>>     >     > get/implement
>>>>>     >     > > the
>>>>>     >     > > >> > >> same,
>>>>>     >     > > >> > >> >> and
>>>>>     >     > > >> > >> >> >>> for
>>>>>     >     > > >> > >> >> >>> > >> > relatively
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > little gain (int vs
>>>> string is
>>>>>     > order of
>>>>>     >     > > >> > magnitude,
>>>>>     >     > > >> > >> >> this
>>>>>     >     > > >> > >> >> >>> > isnt).
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > int namespacing vs {int,
>>>> int}
>>>>>     >     > namespacing
>>>>>     >     > > >> are
>>>>>     >     > > >> > >> >> basically
>>>>>     >     > > >> > >> >> >>> > the
>>>>>     >     > > >> > >> >> >>> > >> > same
>>>>>     >     > > >> > >> >> >>> > >> >     > thing -
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > youre just namespacing
>>>> an int64
>>>>>     > and
>>>>>     >     > > giving
>>>>>     >     > > >> > people
>>>>>     >     > > >> > >> >> whole
>>>>>     >     > > >> > >> >> >>> > 2^32
>>>>>     >     > > >> > >> >> >>> > >> > ranges
>>>>>     >     > > >> > >> >> >>> > >> >     > at a
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > time. the part i like
>>>> about this
>>>>>     > is
>>>>>     >     > > letting
>>>>>     >     > > >> > >> people
>>>>>     >     > > >> > >> >> >>> have a
>>>>>     >     > > >> > >> >> >>> > >> large
>>>>>     >     > > >> > >> >> >>> > >> >     > swath of
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > numbers with one
>>>> registration so
>>>>>     > they
>>>>>     >     > > dont
>>>>>     >     > > >> > have
>>>>>     >     > > >> > >> to
>>>>>     >     > > >> > >> >> come
>>>>>     >     > > >> > >> >> >>> > back
>>>>>     >     > > >> > >> >> >>> > >> > for
>>>>>     >     > > >> > >> >> >>> > >> >     > every
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > single plugin/header
>>>> they want to
>>>>>     >     > > "reserve".
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > On Wed, Nov 9, 2016 at
>>>> 11:01 AM,
>>>>>     > Roger
>>>>>     >     > > >> Hoover
>>>>>     >     > > >> > <
>>>>>     >     > > >> > >> >> >>> > >> >     > roger.hoover@gmail.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > Since some of the
>>>> debate has
>>>>>     > been
>>>>>     >     > about
>>>>>     >     > > >> > >> overhead +
>>>>>     >     > > >> > >> >> >>> > >> > performance, I'm
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > wondering if we have
>>>>>     > considered a
>>>>>     >     > > varint
>>>>>     >     > > >> > >> encoding
>>>>>     >     > > >> > >> >> (
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>> https://developers.google.com/
>>>>>     >     > > >> > >> >> protocol-buffers/docs/
>>>>>     >     > > >> > >> >> >>> > >> >     > encoding#varints)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > for
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > the header length
>>>> field (int32
>>>>>     > in the
>>>>>     >     > > >> > proposal)
>>>>>     >     > > >> > >> >> and
>>>>>     >     > > >> > >> >> >>> for
>>>>>     >     > > >> > >> >> >>> > >> > header
>>>>>     >     > > >> > >> >> >>> > >> >     > ids? If
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > you
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > don't use headers, the
>>>>>     > overhead would
>>>>>     >     > > be a
>>>>>     >     > > >> > >> single
>>>>>     >     > > >> > >> >> >>> byte
>>>>>     >     > > >> > >> >> >>> > and
>>>>>     >     > > >> > >> >> >>> > >> > for each
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > header
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > id < 128 would also
>>>> need only a
>>>>>     >     > single
>>>>>     >     > > >> byte?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > On Wed, Nov 9, 2016 at
>>>> 6:43 AM,
>>>>>     >     > radai <
>>>>>     >     > > >> > >> >> >>> > >> > radai.rosenblatt@gmail.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > @magnus - and very
>>>> dangerous
>>>>>     > (youre
>>>>>     >     > > >> > >> essentially
>>>>>     >     > > >> > >> >> >>> > >> > downloading and
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > executing
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > arbitrary code off
>>>> the
>>>>>     > internet on
>>>>>     >     > > your
>>>>>     >     > > >> > >> servers
>>>>>     >     > > >> > >> >> ...
>>>>>     >     > > >> > >> >> >>> > bad
>>>>>     >     > > >> > >> >> >>> > >> > idea
>>>>>     >     > > >> > >> >> >>> > >> >     > without
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  a
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > sandbox, even with)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > as for it being a
>>>> purely
>>>>>     >     > > administrative
>>>>>     >     > > >> > task
>>>>>     >     > > >> > >> - i
>>>>>     >     > > >> > >> >> >>> > >> disagree.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > i wish it would,
>>>> really,
>>>>>     > because
>>>>>     >     > > then my
>>>>>     >     > > >> > >> earlier
>>>>>     >     > > >> > >> >> >>> > point on
>>>>>     >     > > >> > >> >> >>> > >> > the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > complexity
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > of
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > the remapping
>>>> process would
>>>>>     > be
>>>>>     >     > > invalid,
>>>>>     >     > > >> > but
>>>>>     >     > > >> > >> at
>>>>>     >     > > >> > >> >> >>> > linkedin,
>>>>>     >     > > >> > >> >> >>> > >> > for
>>>>>     >     > > >> > >> >> >>> > >> >     > example,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > we
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > (the team im in) run
>>>> kafka
>>>>>     > as a
>>>>>     >     > > service.
>>>>>     >     > > >> > we
>>>>>     >     > > >> > >> dont
>>>>>     >     > > >> > >> >> >>> > really
>>>>>     >     > > >> > >> >> >>> > >> > know
>>>>>     >     > > >> > >> >> >>> > >> >     > what our
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > users
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > (developing
>>>> applications
>>>>>     > that use
>>>>>     >     > > kafka)
>>>>>     >     > > >> > are
>>>>>     >     > > >> > >> up
>>>>>     >     > > >> > >> >> to
>>>>>     >     > > >> > >> >> >>> at
>>>>>     >     > > >> > >> >> >>> > any
>>>>>     >     > > >> > >> >> >>> > >> > given
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  moment.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > it
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > is very possible
>>>> (given the
>>>>>     >     > > existence of
>>>>>     >     > > >> > >> headers
>>>>>     >     > > >> > >> >> >>> and a
>>>>>     >     > > >> > >> >> >>> > >> >     > corresponding
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > plugin
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > ecosystem) for some
>>>>>     > application to
>>>>>     >     > > >> "equip"
>>>>>     >     > > >> > >> their
>>>>>     >     > > >> > >> >> >>> > >> producers
>>>>>     >     > > >> > >> >> >>> > >> > and
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > consumers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > with the required
>>>> plugin
>>>>>     > without us
>>>>>     >     > > >> > knowing.
>>>>>     >     > > >> > >> i
>>>>>     >     > > >> > >> >> dont
>>>>>     >     > > >> > >> >> >>> > mean
>>>>>     >     > > >> > >> >> >>> > >> > to imply
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  thats
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > bad, i just want to
>>>> make the
>>>>>     > point
>>>>>     >     > > that
>>>>>     >     > > >> > its
>>>>>     >     > > >> > >> not
>>>>>     >     > > >> > >> >> as
>>>>>     >     > > >> > >> >> >>> > simple
>>>>>     >     > > >> > >> >> >>> > >> >     > keeping it
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  in
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > sync across a
>>>> large-enough
>>>>>     >     > > organization.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > On Wed, Nov 9, 2016
>>>> at 6:17
>>>>>     > AM,
>>>>>     >     > > Magnus
>>>>>     >     > > >> > >> Edenhill
>>>>>     >     > > >> > >> >> <
>>>>>     >     > > >> > >> >> >>> > >> >     > magnus@edenhill.se>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > I think there is a
>>>> piece
>>>>>     > missing
>>>>>     >     > in
>>>>>     >     > > >> the
>>>>>     >     > > >> > >> >> Strings
>>>>>     >     > > >> > >> >> >>> > >> > discussion,
>>>>>     >     > > >> > >> >> >>> > >> >     > where
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > pro-Stringers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > reason that by
>>>> providing
>>>>>     > unique
>>>>>     >     > > string
>>>>>     >     > > >> > >> >> >>> identifiers
>>>>>     >     > > >> > >> >> >>> > for
>>>>>     >     > > >> > >> >> >>> > >> > each
>>>>>     >     > > >> > >> >> >>> > >> >     > header
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > everything will
>>>> just
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > magically work for
>>>> all
>>>>>     > parts of
>>>>>     >     > the
>>>>>     >     > > >> > stream
>>>>>     >     > > >> > >> >> >>> pipeline.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > But the strings
>>>> dont mean
>>>>>     >     > anything
>>>>>     >     > > by
>>>>>     >     > > >> > >> >> themselves,
>>>>>     >     > > >> > >> >> >>> > and
>>>>>     >     > > >> > >> >> >>> > >> > while we
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  could
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > probably envision
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > some auto plugin
>>>> loader
>>>>>     > that
>>>>>     >     > > >> downloads,
>>>>>     >     > > >> > >> >> compiles,
>>>>>     >     > > >> > >> >> >>> > links
>>>>>     >     > > >> > >> >> >>> > >> > and
>>>>>     >     > > >> > >> >> >>> > >> >     > runs
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > plugins
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > on-demand
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > as soon as they're
>>>> seen by
>>>>>     > a
>>>>>     >     > > >> consumer, I
>>>>>     >     > > >> > >> dont
>>>>>     >     > > >> > >> >> >>> really
>>>>>     >     > > >> > >> >> >>> > >> see
>>>>>     >     > > >> > >> >> >>> > >> > a
>>>>>     >     > > >> > >> >> >>> > >> >     > use-case
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > for
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > something
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > so dynamic (and
>>>> fragile) in
>>>>>     >     > > practice.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > In the real world
>>>> an
>>>>>     > application
>>>>>     >     > > will
>>>>>     >     > > >> be
>>>>>     >     > > >> > >> >> >>> configured
>>>>>     >     > > >> > >> >> >>> > >> with
>>>>>     >     > > >> > >> >> >>> > >> > a set
>>>>>     >     > > >> > >> >> >>> > >> >     > of
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > plugins
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > to either add
>>>> (producer)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > or read (consumer)
>>>> headers.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > This is an
>>>> administrative
>>>>>     > task
>>>>>     >     > > based
>>>>>     >     > > >> on
>>>>>     >     > > >> > >> what
>>>>>     >     > > >> > >> >> >>> > features a
>>>>>     >     > > >> > >> >> >>> > >> > client
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > needs/provides and
>>>> results
>>>>>     > in
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > some sort of
>>>> configuration
>>>>>     > to
>>>>>     >     > > enable
>>>>>     >     > > >> and
>>>>>     >     > > >> > >> >> >>> configure
>>>>>     >     > > >> > >> >> >>> > the
>>>>>     >     > > >> > >> >> >>> > >> > desired
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > plugins.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > Since this needs
>>>> to be kept
>>>>>     >     > > somewhat
>>>>>     >     > > >> in
>>>>>     >     > > >> > >> sync
>>>>>     >     > > >> > >> >> >>> across
>>>>>     >     > > >> > >> >> >>> > an
>>>>>     >     > > >> > >> >> >>> > >> >     > organisation
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > (there
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > is no point in
>>>> having
>>>>>     > producers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > add headers no
>>>> consumers
>>>>>     > will
>>>>>     >     > read,
>>>>>     >     > > >> and
>>>>>     >     > > >> > >> vice
>>>>>     >     > > >> > >> >> >>> versa),
>>>>>     >     > > >> > >> >> >>> > >> the
>>>>>     >     > > >> > >> >> >>> > >> > added
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > complexity
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > of assigning an id
>>>>>     > namespace
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > for each plugin as
>>>> it is
>>>>>     > being
>>>>>     >     > > >> > configured
>>>>>     >     > > >> > >> >> should
>>>>>     >     > > >> > >> >> >>> be
>>>>>     >     > > >> > >> >> >>> > >> > tolerable.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > /Magnus
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > 2016-11-09 13:06
>>>> GMT+01:00
>>>>>     >     > Michael
>>>>>     >     > > >> > Pearce <
>>>>>     >     > > >> > >> >> >>> > >> >     > Michael.Pearce@ig.com>:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Just
>>>> following/catching
>>>>>     > up on
>>>>>     >     > > what
>>>>>     >     > > >> > seems
>>>>>     >     > > >> > >> to
>>>>>     >     > > >> > >> >> be
>>>>>     >     > > >> > >> >> >>> an
>>>>>     >     > > >> > >> >> >>> > >> > active
>>>>>     >     > > >> > >> >> >>> > >> >     > night :)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > @Radai sorry if
>>>> it may
>>>>>     > seem
>>>>>     >     > > obvious
>>>>>     >     > > >> > but
>>>>>     >     > > >> > >> what
>>>>>     >     > > >> > >> >> >>> does
>>>>>     >     > > >> > >> >> >>> > MD
>>>>>     >     > > >> > >> >> >>> > >> > stand
>>>>>     >     > > >> > >> >> >>> > >> >     > for?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > My take on
>>>> String vs Int:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > I will state
>>>> first I am
>>>>>     > pro Int
>>>>>     >     > > (16
>>>>>     >     > > >> or
>>>>>     >     > > >> > >> 32).
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > I do though
>>>> playing
>>>>>     > devils
>>>>>     >     > > advocate
>>>>>     >     > > >> > see a
>>>>>     >     > > >> > >> >> big
>>>>>     >     > > >> > >> >> >>> plus
>>>>>     >     > > >> > >> >> >>> > >> > with the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > argument
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > of
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > String keys,
>>>> this is
>>>>>     > around
>>>>>     >     > > >> > integrating
>>>>>     >     > > >> > >> >> into an
>>>>>     >     > > >> > >> >> >>> > >> > existing
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > eco-system.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > As many other
>>>> systems use
>>>>>     >     > String
>>>>>     >     > > >> based
>>>>>     >     > > >> > >> >> headers
>>>>>     >     > > >> > >> >> >>> > >> (Flume,
>>>>>     >     > > >> > >> >> >>> > >> > JMS)
>>>>>     >     > > >> > >> >> >>> > >> >     > it
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > makes
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > it
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > much easier for
>>>> these to
>>>>>     > be
>>>>>     >     > > >> > >> >> >>> > incorporated/integrated
>>>>>     >     > > >> > >> >> >>> > >> > into.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > How with Int
>>>> based
>>>>>     > headers
>>>>>     >     > could
>>>>>     >     > > we
>>>>>     >     > > >> > >> provide
>>>>>     >     > > >> > >> >> a
>>>>>     >     > > >> > >> >> >>> > >> > way/guidance to
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  make
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > this
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > integration
>>>> simple /
>>>>>     > easy with
>>>>>     >     > > >> > transition
>>>>>     >     > > >> > >> >> flows
>>>>>     >     > > >> > >> >> >>> > over
>>>>>     >     > > >> > >> >> >>> > >> to
>>>>>     >     > > >> > >> >> >>> > >> >     > kafka?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > * tough luck
>>>> buddy
>>>>>     > you're on
>>>>>     >     > your
>>>>>     >     > > >> own
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > * simply hash
>>>> the string
>>>>>     > into
>>>>>     >     > int
>>>>>     >     > > >> code
>>>>>     >     > > >> > >> and
>>>>>     >     > > >> > >> >> hope
>>>>>     >     > > >> > >> >> >>> > for
>>>>>     >     > > >> > >> >> >>> > >> no
>>>>>     >     > > >> > >> >> >>> > >> >     > collisions
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > (how
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > to
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > convert back
>>>> though?)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > * http2 style as
>>>>>     > mentioned by
>>>>>     >     > > nacho.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > cheers,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Mike
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     > ______________________________
>>>>>     >     > > >> > __________
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > From: radai <
>>>>>     >     > > >> > radai.rosenblatt@gmail.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Sent: Wednesday,
>>>>>     > November 9,
>>>>>     >     > 2016
>>>>>     >     > > >> > 8:12 AM
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > To:
>>>> dev@kafka.apache.org
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Subject: Re:
>>>> [DISCUSS]
>>>>>     > KIP-82 -
>>>>>     >     > > Add
>>>>>     >     > > >> > >> Record
>>>>>     >     > > >> > >> >> >>> Headers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > thinking about
>>>> it some
>>>>>     > more,
>>>>>     >     > the
>>>>>     >     > > >> best
>>>>>     >     > > >> > >> way to
>>>>>     >     > > >> > >> >> >>> > transmit
>>>>>     >     > > >> > >> >> >>> > >> > the
>>>>>     >     > > >> > >> >> >>> > >> >     > header
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > remapping
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > data to
>>>> consumers would
>>>>>     > be to
>>>>>     >     > > put it
>>>>>     >     > > >> > in
>>>>>     >     > > >> > >> the
>>>>>     >     > > >> > >> >> MD
>>>>>     >     > > >> > >> >> >>> > >> response
>>>>>     >     > > >> > >> >> >>> > >> >     > payload,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  so
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > maybe
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > it should be
>>>> discussed
>>>>>     > now.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > On Wed, Nov 9,
>>>> 2016 at
>>>>>     > 12:09
>>>>>     >     > AM,
>>>>>     >     > > >> > radai <
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  radai.rosenblatt@gmail.com
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > im not opposed to the idea of namespace mapping. all im saying is that its
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > not part of the "mvp" and, since it requires no wire format change, can
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > always be added later.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > also, its not as simple as just configuring MM to do the transform: lets
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > say i've implemented large message support as {666,1} and on some mirror
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > target cluster its been remapped to {999,1}. the consumer plugin code would
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > also need to be told to look for the large message "part X of Y" header
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > under {999,1}. doable, but tricky.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > On Tue, Nov 8, 2016 at 10:29 PM, Gwen Shapira <gwen@confluent.io> wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> While you can do whatever you want with a namespace and your code,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> what I'd expect is for each app to namespaces configurable...
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> So if I accidentally used 666 for my HR department, and still want to
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> run RadaiApp, I can config "namespace=42" for RadaiApp and everything
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> will look normal.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> This means you only need to sync usage inside your own organization.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> Still hard, but somewhat easier than syncing with the entire world.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> On Tue, Nov 8, 2016 at 10:07 PM, radai <radai.rosenblatt@gmail.com> wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > and we can start with {namespace, id} and no re-mapping support and always
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > add it later on if/when collisions actually happen (i dont think they'd be
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > a problem).
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > every interested party (so orgs or individuals) could then register a
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > prefix (0 = reserved, 1 = confluent ... 666 = me :-) ) and do whatever with
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > the 2nd ID - so once linkedin registers, say 3, then linkedin devs are free
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > to use {3, *} with a reasonable expectation not to collide with anything
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > else. further partitioning of that * becomes linkedin's problem, but the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > "upstream registration" of a namespace only has to happen once.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > On Tue, Nov 8, 2016 at 9:03 PM, James Cheng <wushujames@gmail.com> wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > On Nov 8, 2016, at 5:54 PM, Gwen Shapira <gwen@confluent.io> wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > Thank you so much for this clear and fair summary of the arguments.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > I'm in favor of ints. Not a deal-breaker, but in favor.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > Even more in favor of Magnus's decentralized suggestion with Roger's
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > tweak: add a namespace for headers. This will allow each app to just
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > use whatever IDs it wants internally, and then let the admin deploying
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > the app figure out an available namespace ID for the app to live in.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > So io.confluent.schema-registry can be namespace 0x01 on my deployment
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > and 0x57 on yours, and the poor guys developing the app don't need to
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > worry about that.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> Gwen, if I understand your example right, an application deployer might
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> decide to use 0x01 in one deployment, and that means that once the message
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> is written into the broker, it will be saved on the broker with that
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> specific namespace (0x01).
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> If you were to mirror that message into another cluster, the 0x01 would
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> accompany the message, right? What if the deployers of the same app in the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> other cluster uses 0x57? They won't understand each other?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> I'm not sure that's an avoidable problem. I think it simply means that in
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> order to share data, you have to also have a shared (agreed upon)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> understanding of what the namespaces mean. Which I think makes sense,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> because the alternate (sharing *nothing* at all) would mean that there
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> would be no way to understand each other.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> -James
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > Gwen
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > On Tue, Nov 8, 2016 at 4:23 PM, radai <radai.rosenblatt@gmail.com> wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> +1 for sean's document. it covers pretty much all the trade-offs and
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> provides concrete figures to argue about :-)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> (nit-picking - used the same xkcd twice, also trove has been superseded
>>>>>     >     > > >> > >> >
>>>>>     >     > > >> >
>>>>>     >     > > >> >
>>>>>     >     > > >> >
>>>>>     >     > > >> > --
>>>>>     >     > > >> > Gwen Shapira
>>>>>     >     > > >> > Product Manager | Confluent
>>>>>     >     > > >> > 650.450.2760 | @gwenshap
>>>>>     >     > > >> > Follow us: Twitter | blog
>>>>>     >     > > >> >
>>>>>     >     > > >>
>>>>>     >     > > >>
>>>>>     >     > > >>
>>>>>     >     > > >> --
>>>>>     >     > > >> *Todd Palino*
>>>>>     >     > > >> Staff Site Reliability Engineer
>>>>>     >     > > >> Data Infrastructure Streaming
>>>>>     >     > > >>
>>>>>     >     > > >>
>>>>>     >     > > >>
>>>>>     >     > > >> linkedin.com/in/toddpalino
>>>>>     >     > > >>
>>>>>     >     > >
>>>>>     >     > >
>>>>>     >     > >
>>>>>     >     > > --
>>>>>     >     > > Gwen Shapira
>>>>>     >     > > Product Manager | Confluent
>>>>>     >     > > 650.450.2760 | @gwenshap
>>>>>     >     > > Follow us: Twitter | blog
>>>>>     >     > >
>>>>>     >     >
>>>>>     >
>>>>>     >
>>>>>     > The information contained in this email is strictly confidential and for
>>>>>     > the use of the addressee only, unless otherwise indicated. If you are not
>>>>>     > the intended recipient, please do not read, copy, use or disclose to others
>>>>>     > this message or any attachment. Please also notify the sender by replying
>>>>>     > to this email or by telephone (+44(020 7896 0011) and then delete the email
>>>>>     > and any copies of it. Opinions, conclusion (etc) that do not relate to the
>>>>>     > official business of this company shall be understood as neither given nor
>>>>>     > endorsed by it. IG is a trading name of IG Markets Limited (a company
>>>>>     > registered in England and Wales, company number 04008957) and IG Index
>>>>>     > Limited (a company registered in England and Wales, company number
>>>>>     > 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill,
>>>>>     > London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG
>>>>>     > Index Limited (register number 114059) are authorised and regulated by the
>>>>>     > Financial Conduct Authority.
>>>>>     >
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
> 
> 
>