Posted to dev@kafka.apache.org by David Ormsbee <da...@datadoghq.com> on 2011/11/24 00:32:12 UTC

Guide to Writing a Client for Kafka

Hi folks,

Inspired by the Wire Format wiki entry, I recently created a draft of
"Writing a Client for Kafka":

  http://readthedocs.org/docs/brod/en/latest/spec.html

I tried to make it the document that I wish we had at Datadog when we
were total Kafka newbies writing client code. It's still very rough
and littered with "FIXME" notes where I've written things from my
understanding without verifying them with tests. I should be filling
all these gaps in soon, as we're planning to add a lot of
functionality to our Python client in the coming weeks.

Comments and corrections would be greatly appreciated. :-)

Thank you very much to the Kafka devs. We've been using Kafka a great
deal at Datadog, we're extremely happy with it, and we're looking
forward to contributing in our own way. We'll make an announcement as
soon as we feel our Python client is ready for community use.

Take care.

Dave

Re: Guide to Writing a Client for Kafka

Posted by Pierre-Yves <py...@smallrivers.com>.

Thanks a lot Dave, this is quite helpful.

  - pyr


Re: Guide to Writing a Client for Kafka

Posted by David Ormsbee <da...@datadoghq.com>.

Hi Joe,

> One thing I wanted to float is "client" vs "client": a new (or
> existing) language implementing a new API vs wrapping an existing API, and
> the confusion between the two.
>
> I think your post is geared towards the former which is really awesome
> because it helps devs jump into Kafka (I found it really helpful myself
> and learned some things straight up, always good).

I was definitely gearing it towards people who were implementing new
client APIs in Python, Erlang, Io, or whatever non-JVM language. I'm
not sure how I should phrase it to make the distinction clearer. Did
you have something specific in mind? "Writing a Client API for Kafka"?

> Either way, I want to contribute it back into the codebase. However, the
> examples directory seems to be for Java only (doh), and I was not sure how we
> could deal with this so we could add examples for scala, python, ruby, cpp,
> pyp, go, c#, etc (javascript? node.js? =8^) ). I bring this up because I
> think it is an important part of adoption, and while moving Kafka into my own
> development cycles I found that these examples/samples (maybe create a new
> samples directory, dunno) would have been REALLY helpful. Documentation is
> great, but code is just as good if not better to look at, IMHO.

I definitely agree with you on the importance of examples, and we'll
try to have plenty of them in brod. The PHP client (/clients/php in
the Kafka repo) has its own examples directory. I imagine that should
be the convention for examples in other languages. Just to be clear
though -- I'm not in any way affiliated with the Kafka project, and
our Python client is not in any way officially blessed. So someone
else on this list will have to chime in with the official word on
these things. :-)

If you'd like to send a patch or pull request for the Python client
we're working on at Datadog (brod), it's located at:
  https://github.com/datadog/brod

That being said, we're likely going to be making some substantial
changes in the very near future, so I can't guarantee that we'll be
able to accept your contribution.

Glad you found the client doc useful. :-) Thank you, and take care.

Dave

Re: Guide to Writing a Client for Kafka

Posted by Joe Stein <cr...@gmail.com>.

Hi Dave, this is really helpful, useful, informative and cool.

One thing I wanted to float is "client" vs "client": a new (or
existing) language implementing a new API vs wrapping an existing API, and
the confusion between the two.

I think your post is geared towards the former which is really awesome
because it helps devs jump into Kafka (I found it really helpful myself
and learned some things straight up, always good).

I was toying this weekend with adding
https://github.com/andrix/python-snappy to the Python client and building a
Python example for that client. I am in the middle of building a producer to
stream our data, but I am not sure yet whether Python makes more sense for me
than Scala; I have to go through it all some more (nothing Kafka-specific,
just my implementation needs, and if I do it in Scala I will do the example
in Scala).

Either way, I want to contribute it back into the codebase. However, the
examples directory seems to be for Java only (doh), and I was not sure how we
could deal with this so we could add examples for scala, python, ruby, cpp,
pyp, go, c#, etc (javascript? node.js? =8^) ). I bring this up because I
think it is an important part of adoption, and while moving Kafka into my own
development cycles I found that these examples/samples (maybe create a new
samples directory, dunno) would have been REALLY helpful. Documentation is
great, but code is just as good if not better to look at, IMHO.

/*
Joe Stein
http://www.linkedin.com/in/charmalloc
Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
*/


Re: Guide to Writing a Client for Kafka

Posted by Chris Burroughs <ch...@gmail.com>.

Great work!

Having a solid doc/spec for the wire protocol would also likely decrease
the need to have as many clients in-tree.  I think at this point that
would be a good thing.


Re: Guide to Writing a Client for Kafka

Posted by David Ormsbee <da...@datadoghq.com>.

Hi Taylor,

I've added a mention about Kafka's lack of an index to the client/driver
doc, since it might confuse new users. I'll include your methods on how to
cope when I write more end-user documentation.

FWIW, we ended up going with option 1, storing the history in a DB. Unlike
your N-messages need, our need was primarily time based ("re-process all
the messages received from time X to time Y", where X and Y may be
separated by hours). In that respect, we'll be quite happy when this one
gets implemented:

   https://issues.apache.org/jira/browse/KAFKA-87
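
A minimal sketch of what such a time-keyed history could look like,
assuming the consumer records a receive timestamp and an offset for every
message it reads. The TimeIndex class and its methods are hypothetical
names, and the in-memory lists only stand in for whatever database table
is actually used.

```python
import bisect


class TimeIndex:
    """Map receive timestamps to Kafka offsets for one topic/partition."""

    def __init__(self):
        # Two parallel lists kept in timestamp order (stand-in for DB rows).
        self._times = []
        self._offsets = []

    def record(self, timestamp, offset):
        """Note that the message at `offset` was received at `timestamp`."""
        # Assumption: the consumer records messages in the order it reads
        # them, so timestamps never go backwards.
        if self._times and timestamp < self._times[-1]:
            raise ValueError("timestamps must not go backwards")
        self._times.append(timestamp)
        self._offsets.append(offset)

    def offsets_between(self, start, end):
        """Return offsets of messages with start <= timestamp <= end."""
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_right(self._times, end)
        return self._offsets[lo:hi]
```

To re-process a window, a consumer would call offsets_between(X, Y) and
issue its fetches starting from the first offset returned.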

Please pardon the lack of updates to the doc in the past week. I haven't
abandoned it -- we just really need to get ZooKeeper-aware
producers/consumers working properly in brod, and that's where much of my
time has gone.

Thank you.

Dave

On Thu, Dec 1, 2011 at 10:22 PM, Taylor Gautier <tg...@tagged.com> wrote:

> One thing we should make clear somewhere is that while Kafka has a history
> mechanism, it doesn't provide an index.
>
> I probably moved forward in my implementation (and selection) to use Kafka
> for 3-4 weeks before realizing that I would not be able to efficiently
> query Kafka for the N-1000th message.
>
> This was nearly a deal killer for us, but there are several available
> workarounds/solutions:
>
>   - Keep the history somewhere, outside of Kafka, e.g. in a DB, memcache,
>   in memory, whatever, if you need to rewind N messages ago.  This kind of
>   assumes you have clients that are always making forward progress and
>   working against the Kafka stream.  If you have ephemeral clients that come
>   and go, and don't have history with the stream, it doesn't work so well
>   - Make a minor modification to Kafka to have it implement a reverse
>   linked list - where each message also stores the offset of the previous
>   message
>   - Make a medium change to Kafka to have it store an index of message
>   offsets in a secondary topic
>
> We went with option #3...

Re: Guide to Writing a Client for Kafka

Posted by Taylor Gautier <tg...@tagged.com>.

One thing we should make clear somewhere is that while Kafka has a history
mechanism, it doesn't provide an index.

I probably moved forward in my implementation (and selection) to use Kafka
for 3-4 weeks before realizing that I would not be able to efficiently
query Kafka for the N-1000th message.

This was nearly a deal killer for us, but there are several available
workarounds/solutions:

   - Keep the history somewhere, outside of Kafka, e.g. in a DB, memcache,
   in memory, whatever, if you need to rewind N messages ago.  This kind of
   assumes you have clients that are always making forward progress and
   working against the Kafka stream.  If you have ephemeral clients that come
   and go, and don't have history with the stream, it doesn't work so well
   - Make a minor modification to Kafka to have it implement a reverse
   linked list - where each message also stores the offset of the previous
   message
   - Make a medium change to Kafka to have it store an index of message
   offsets in a secondary topic

We went with option #3...
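
Option 1 above can be sketched roughly as follows, assuming a consumer
that records the offset of every message it reads. OffsetHistory and its
methods are hypothetical names, not part of Kafka or of any driver, and a
real deployment would likely persist the offsets rather than hold them in
memory.

```python
from collections import deque


class OffsetHistory:
    """Remember the offsets of the most recent messages on one partition."""

    def __init__(self, capacity=1000):
        # A deque with maxlen drops the oldest offset once it is full.
        self._offsets = deque(maxlen=capacity)

    def record(self, offset):
        """Call once per consumed message with that message's offset."""
        self._offsets.append(offset)

    def rewind(self, n):
        """Return the offset of the message n positions before the newest."""
        if n < 0 or n >= len(self._offsets):
            raise LookupError(
                "only %d offsets are remembered" % len(self._offsets))
        return self._offsets[-1 - n]
```

As noted above, this only helps clients that stay attached to the stream;
an ephemeral client starts with an empty history and has nothing to rewind
to.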

On Tue, Nov 29, 2011 at 9:06 AM, David Ormsbee <da...@datadoghq.com> wrote:

> Hi Taylor,
>
> Yeah, Joe brought up the need for this distinction as well. When I
> move the doc over to the wiki, I'll try to consistently use "driver"
> to clear up ambiguities. The bits that are oriented more toward higher-level
> clients are really just there for context, to explain why the network
> protocol is what it is. Things like the fetch and offsets requests are
> much easier to explain if you show how they connect to the
> implementation in the back. I wanted to create a single document that
> would take people 90% of the way there to writing a driver while
> assuming minimal prior knowledge, because it's the document I really
> wish I had last month.
>
> I always intended to write a separate document that would more
> comprehensively cover how to use our Python driver, but I imagine that
> part will vary substantially from one implementation to the next. I
> haven't started on that one yet just because our driver's API likely
> won't stabilize for another couple of weeks.
>
> Thank you.
>
> Dave

Re: Guide to Writing a Client for Kafka

Posted by David Ormsbee <da...@datadoghq.com>.

Hi Taylor,

Yeah, Joe brought up the need for this distinction as well. When I
move the doc over to the wiki, I'll try to consistently use "driver"
to clear up ambiguities. The bits that are oriented more toward higher-level
clients are really just there for context, to explain why the network
protocol is what it is. Things like the fetch and offsets requests are
much easier to explain if you show how they connect to the
implementation in the back. I wanted to create a single document that
would take people 90% of the way there to writing a driver while
assuming minimal prior knowledge, because it's the document I really
wish I had last month.

I always intended to write a separate document that would more
comprehensively cover how to use our Python driver, but I imagine that
part will vary substantially from one implementation to the next. I
haven't started on that one yet just because our driver's API likely
won't stabilize for another couple of weeks.

Thank you.

Dave


On Tue, Nov 29, 2011 at 10:40 AM, Taylor Gautier <tg...@tagged.com> wrote:
> Just wanted to add my $0.02 - I'm glad David wrote this - excellent job sir!
>
> My comment is this (I think it might have already been mentioned, however I
> will re-iterate it):  the document as is covers two audiences - those that
> are writing Kafka "drivers" and those that are writing clients that publish
> and consume to Kafka (using a "driver").  Most of the document is geared
> for the former, however there are some bits that are meant for or are
> useful also to the latter.
>
> I would like to suggest that we split the document up and address each
> audience separately.  As great as it is that David wrote a lot of great
> information for the "driver" writers, the need for that will slowly
> decline, as the drivers slowly become more available and more stable
> (there's only so many languages in the world).
>
> On the other hand, people will be writing their own "clients" using the
> drivers far more often, so the latter audience will, assuming Kafka becomes
> wildly successful, increase in need.  Beefing up this part of the document
> - by focusing on that audience, will be incredibly useful to new adopters.
>
> Incidentally, it might behoove us as a community to have strong language
> that separates these two activities.  I used "driver" and "client" - I am
> not necessarily advocating for these terms but rather just that there is a
> need for terms that are distinct - it is important to separate the concepts
> using language/syntax so that people do not get confused.

Re: Guide to Writing a Client for Kafka

Posted by David Ormsbee <da...@datadoghq.com>.

I thought that in this context, "driver" would extend to cover the
Clojure-friendly wrapper, and "client programs" would be reserved for
something higher level still, like command-line or GUI interfaces,
utilities that do certain complex operations (e.g. loading a text file
as a series of messages, one per line), etc. MySQL uses that kind of
distinction.
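
As a rough illustration of that layering, the sketch below puts a small
"client program" (loading text as a series of messages, one per line) on
top of a driver object. The produce(topic, messages) method and the
RecordingDriver stand-in are assumptions made for the example, not the API
of brod or of any other driver.

```python
class RecordingDriver:
    """Stand-in for a real driver; it only remembers what it was given."""

    def __init__(self):
        self.sent = []

    def produce(self, topic, messages):
        # A real driver would encode the messages and write them to a broker.
        self.sent.append((topic, list(messages)))


def load_lines(driver, topic, lines, batch_size=100):
    """Client-program logic: publish each non-empty line as one message."""
    batch = []
    count = 0
    for line in lines:
        message = line.rstrip("\n")
        if not message:
            continue  # skip blank lines
        batch.append(message)
        count += 1
        if len(batch) == batch_size:
            driver.produce(topic, batch)
            batch = []
    if batch:
        driver.produce(topic, batch)  # flush the final partial batch
    return count
```

Everything in load_lines is independent of the wire protocol, which is the
point of the distinction: it would work unchanged with any driver that
offers a similar produce call.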

Honestly, I don't really care what we call it so long as it doesn't
cause confusion for others. At Datadog, we've always just referred to
what we were working on as a Kafka client.

Dave


On Tue, Nov 29, 2011 at 1:53 PM, Jay Kreps <ja...@gmail.com> wrote:
> This terminology is a little confusing to me. What is the concrete
> difference between a driver and a client? Are we saying that the current
> Scala client code is a "driver" and, say, a Clojure-friendly wrapper for the
> Scala or Java driver would be an example of a "client"? If so, do we need to
> call out that distinction? Writing a wrapper/"client" should be a fairly
> trivial thing to do, right? Does it need special terminology and a guide?
>
> Or are we saying that implementing the network API is a "driver" and
> dealing with cluster awareness is a "client"? If so, can't we combine these
> into one piece of documentation and call the whole thing a client?
>
> -Jay

Re: Guide to Writing a Client for Kafka

Posted by Jay Kreps <ja...@gmail.com>.

This terminology is a little confusing to me. What is the concrete
difference between a driver and a client? Are we saying that the current
Scala client code is a "driver" and, say, a Clojure-friendly wrapper for the
Scala or Java driver would be an example of a "client"? If so, do we need to
call out that distinction? Writing a wrapper/"client" should be a fairly
trivial thing to do, right? Does it need special terminology and a guide?

Or are we saying that implementing the network API is a "driver" and
dealing with cluster awareness is a "client"? If so, can't we combine these
into one piece of documentation and call the whole thing a client?

-Jay

On Tue, Nov 29, 2011 at 7:40 AM, Taylor Gautier <tg...@tagged.com> wrote:

> Just wanted to add my $0.02 - I'm glad David wrote this - excellent job
> sir!
>
> My comment is this (I think it might have already been mentioned, however I
> will re-iterate it):  the document as is covers two audiences - those that
> are writing Kafka "drivers" and those that are writing clients that publish
> and consume to Kafka (using a "driver").  Most of the document is geared
> for the former, however there are some bits that are meant for or are
> useful also to the latter.
>
> I would like to suggest that we split the document up and address each
> audience separately.  As great as it is that David wrote a lot of great
> information for the "driver" writers, the need for that will slowly
> decline, as the drivers slowly become more available and more stable
> (there's only so many languages in the world).
>
> On the other hand, people will be writing their own "clients" using the
> drivers far more often, so the latter audience will, assuming Kafka becomes
> wildly successful, increase in need.  Beefing up this part of the document
> - by focusing on that audience, will be incredibly useful to new adopters.
>
> Incidentally, it might behoove us as a community to have strong language
> that separates these two activities.  I used "driver" and "client" - I am
> not necessarily advocating for these terms but rather just that there is a
> need for terms that are distinct - it is important to separate the concepts
> using language/syntax so that people do not get confused.

Re: Guide to Writing a Client for Kafka

Posted by Taylor Gautier <tg...@tagged.com>.

Just wanted to add my $0.02 - I'm glad David wrote this - excellent job sir!

My comment is this (I think it might have already been mentioned, however I
will re-iterate it):  the document as is covers two audiences - those that
are writing Kafka "drivers" and those that are writing clients that publish
and consume to Kafka (using a "driver").  Most of the document is geared
for the former, however there are some bits that are meant for or are
useful also to the latter.

I would like to suggest that we split the document up and address each
audience separately.  As great as it is that David wrote a lot of great
information for the "driver" writers, the need for that will slowly
decline, as the drivers slowly become more available and more stable
(there's only so many languages in the world).

On the other hand, people will be writing their own "clients" using the
drivers far more often, so the latter audience will, assuming Kafka becomes
wildly successful, increase in need.  Beefing up this part of the document
- by focusing on that audience, will be incredibly useful to new adopters.

Incidentally, it might behoove us as a community to have strong language
that separates these two activities.  I used "driver" and "client" - I am
not necessarily advocating for these terms but rather just that there is a
need for terms that are distinct - it is important to separate the concepts
using language/syntax so that people do not get confused.

On Tue, Nov 29, 2011 at 7:27 AM, David Ormsbee <da...@datadoghq.com> wrote:

> Hi Jay,
>
> >   1. Would you be willing to add this to the kafka wiki so we could make
> >   this the official howto doc?
>
> Absolutely.
>
> >   2. It might be good to add a "how to contribute your client" section.
> >   This would be hard to write right now because we haven't given anyone any
> >   guidelines for doing it. We have been pretty liberal in accepting clients
> >   kind of proceeding on the "something is better than nothing" theory. But
> >   this leads to clients of mixed quality and little documentation, as you and
> >   Joe noted. I will break this into a separate thread to broaden the
> >   discussion.
>
> I'll be happy to add it as soon as we have consensus on what the
> guidelines should be.
>
> Thank you.
>
> Dave
>

Re: Guide to Writing a Client for Kafka

Posted by David Ormsbee <da...@datadoghq.com>.

Hi Jay,

>   1. Would you be willing to add this to the kafka wiki so we could make
>   this the official howto doc?

Absolutely.

>   2. It might be good to add a "how to contribute your client" section.
>   This would be hard to write right now because we haven't given anyone any
>   guidelines for doing it. We have been pretty liberal in accepting clients
>   kind of proceeding on the "something is better than nothing" theory. But
>   this leads to clients of mixed quality and little documentation, as you and
>   Joe noted. I will break this into a separate thread to broaden the
>   discussion.

I'll be happy to add it as soon as we have consensus on what the
guidelines should be.

Thank you.

Dave

Re: Guide to Writing a Client for Kafka

Posted by Jay Kreps <ja...@gmail.com>.

Dave,

This is really great. Two things:

   1. Would you be willing to add this to the kafka wiki so we could make
   this the official howto doc?
   2. It might be good to add a "how to contribute your client" section.
   This would be hard to write right now because we haven't given anyone any
   guidelines for doing it. We have been pretty liberal in accepting clients
   kind of proceeding on the "something is better than nothing" theory. But
   this leads to clients of mixed quality and little documentation, as you and
   Joe noted. I will break this into a separate thread to broaden the
   discussion.

-Jay
